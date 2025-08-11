

More than half of England’s local councils use artificial intelligence systems that may downplay women’s physical and mental health needs, raising concerns about gender bias in public-sector care, according to new research.

A study by the London School of Economics and Political Science (LSE) found that Google’s AI model Gemma used terms like “disabled,” “unable,” and “complex” significantly more often when summarizing case notes for men than for women. In contrast, descriptions of similar needs for women were more likely to use softer language or omit details entirely.

“Women are frequently described as managing well ‘despite’ their impairments (with ‘despite’ being a word that appears significantly more for women),” the LSE study noted.

Differences across AI models

The LSE research analyzed thousands of gender-swapped versions of long-term care records for older people from a London local authority. In addition to Gemma, researchers tested Meta’s Llama 3 and two benchmark models released in 2019: Google’s T5 and Meta’s BART.
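To make the idea of a gender-swapped comparison concrete, here is a minimal Python sketch. It is not the LSE team's code: the word list, the function names `swap_gender` and `term_counts`, and the sample sentences are all hypothetical, and a real study would need a far more careful mapping of names, titles, and pronouns.

```python
import re
from collections import Counter

# Hypothetical one-way mapping (male to female), for illustration only.
MALE_TO_FEMALE = {"he": "she", "him": "her", "his": "her", "mr": "mrs", "man": "woman"}


def swap_gender(text: str) -> str:
    """Replace whole-word male terms with female ones, keeping initial capitals."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = MALE_TO_FEMALE[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped

    pattern = r"\b(" + "|".join(MALE_TO_FEMALE) + r")\b"
    return re.sub(pattern, repl, text, flags=re.IGNORECASE)


def term_counts(summary: str, terms: list[str]) -> dict[str, int]:
    """Count how often each term of interest appears in a summary."""
    counts = Counter(re.findall(r"[a-z]+", summary.lower()))
    return {term: counts[term] for term in terms}


note = "Mr Smith says he needs his walker."
print(swap_gender(note))
print(term_counts("He is disabled and unable to cook.", ["disabled", "unable", "complex"]))
```

In a setup like this, each original note and its swapped twin would be summarized by the same model, and the term counts for the two summaries compared across many records.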

Among the models tested, Gemma showed “the most significant gender-based differences.” The report said the language used for men was more direct, while women’s needs were more often downplayed. In contrast, Meta’s Llama 3 “showed no gender-based differences across any metrics.”

In one example, a man was described as having “a complex medical history,” while a woman with identical functional ability was described as “living in a townhouse.”

The BART and T5 models also demonstrated smaller, but measurable, differences in sentiment and word choice based on gender.

Do LLMs perpetuate gender stereotypes?

The study’s author, Dr. Sam Rickman, warned that biased AI tools could influence how treatment and support are determined. Local authorities increasingly rely on these tools because LLMs can “produce accurate summaries of healthcare records and even outperform humans,” Rickman said.

However, Rickman noted that a growing body of research has found that LLMs perpetuate gender stereotypes across domains, including machine translation.

Rickman further emphasized that LLMs should not be dismissed outright as tools for easing administrative burdens in health and social care. Instead, he recommended additional studies to determine whether similar patterns occur in other care settings, such as hospitals or mental health services, where documentation styles and service models may differ.

