The Algorithm Will See You Now: How AI Is Trained to Dismiss Women
When AI inherits medicine’s blind spots, women’s health becomes data collateral: dismissed, delayed, and “managed well” into systemic invisibility.
Series Overview: Trust, Bias, and the Algorithm: Rethinking AI in Women’s Healthcare
This was a series that I wrote back in March. I am including it here because this Guardian article shows that the concerns raised in the series were, unfortunately, all too accurate. For generations, women have learned to manage their healthcare defensively. To bring binders of documentation. To downplay emotion. To preemptively appear “credible.” To steel themselves for disbelief. The result? A pattern of medical neglect that isn’t accidental; it’s structural. Women are more likely to be misdiagnosed. More likely to be prescribed psychiatric drugs for physical symptoms. More likely to wait longer for pain relief. Less likely to be believed.
What the Series Covered:
The Diagnosis Delay – Why women still aren’t believed, and why AI might change that, or make it worse.
Garbage In, Garbage Out – How biased training data reproduces real-world medical harm.
Can We Build a Better Machine? – What equitable AI design in healthcare could look like.
AI You Can Argue With – Why transparency, explainability, and patient input are essential.
Beyond the Algorithm – The cultural and systemic changes needed to make any of this work.
SOURCE: The Guardian — AI tools used by English councils downplay women’s health issues, study finds
When technology automates centuries of medical bias, who pays the price?
In August 2025, when researchers at the London School of Economics published their findings on AI tools used by English councils, the news barely registered beyond specialized healthcare circles. But it should have detonated like a bomb. Over half of England’s local authorities, those responsible for determining who receives critical adult social care, are using artificial intelligence systems that systematically downplay women’s health needs. Not occasionally. Not accidentally. Systematically.
Dr. Sam Rickman’s team analyzed nearly 30,000 pairs of case summaries generated by large language models, each pair describing identical patients with only their gender swapped. The pattern was unmistakable: Google’s “Gemma” model used words like “disabled,” “unable,” and “complex” significantly more often for men. Women with the same conditions? They were “managing well” and “independent,” their struggles softened into euphemism, their needs omitted entirely.
A man was “unable to access the community.” A woman with identical mobility issues was “able to manage her daily activities.”
The difference is linguistic minimization, the kind that shifts resource allocation, triage priority, and ultimately, who gets help and who gets told they’re fine.
This isn’t just bad software. It’s digital gaslighting at scale.
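For readers who want to see the shape of that audit, here is a minimal sketch in Python of the counterfactual test the LSE team describes: summarize two case notes that are identical except for gender, then count severity language against minimizing language. The summarize() stub and the term lists below are placeholders invented for illustration, not the study’s code or its actual lexicon.

```python
from collections import Counter

# Illustrative term lists echoing the words the LSE study flagged;
# these are NOT the researchers' actual lexicon.
SEVERITY_TERMS = {"disabled", "unable", "complex"}
MINIMIZING_TERMS = {"managing", "independent", "able"}

def summarize(case_note: str) -> str:
    """Stand-in for a call to whichever LLM a council has deployed.
    A real audit would call the deployed model's API here."""
    # Canned outputs so this sketch runs end to end.
    if "Mr " in case_note:
        return "Mr Smith is disabled and unable to access the community."
    return "Ms Smith is managing well and independent in her daily activities."

def term_counts(summary: str) -> Counter:
    """Count severity and minimizing terms in one generated summary."""
    words = set(summary.lower().replace(".", "").split())
    return Counter(severity=len(words & SEVERITY_TERMS),
                   minimizing=len(words & MINIMIZING_TERMS))

# A matched pair: identical needs, only the gender markers differ.
male_note = "Mr Smith, 84, has severe mobility issues and cannot leave the house unaided."
female_note = "Ms Smith, 84, has severe mobility issues and cannot leave the house unaided."

print("male:  ", term_counts(summarize(male_note)))
print("female:", term_counts(summarize(female_note)))
```

Scaled to tens of thousands of matched pairs, as in the LSE study, the same comparison stops being an anecdote and becomes a statistical test.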
The Algorithmic Echo Chamber
What makes the LSE findings so unsettling isn’t their novelty, but their inevitability. For those who’ve been tracking bias in healthcare AI, this revelation felt less like a shock and more like watching a slow-motion disaster they predicted months ago arrive exactly on schedule.
AI tools are being trained on electronic health records, clinical trial data, and physician notes that reflect centuries of unequal treatment, records built on systems that have consistently misdiagnosed women, ignored pain in Black patients, and filtered symptoms through cultural assumptions. When these biased data sources are fed into algorithms, they don’t correct for injustice; they encode it.
Consider what the medical establishment has taught machines about women’s health:
Women with autoimmune diseases wait an average of 4.5 years longer for a diagnosis than men. Women having heart attacks are 50% more likely to be misdiagnosed. In emergency rooms, women wait up to 16 minutes longer for pain medication. For endometriosis, the average diagnostic delay is 7 to 10 years.
Now imagine an AI trained on those records, where delayed diagnoses appear “normal,” where pain documented as “anxiety” becomes the baseline, where women are already linguistically minimized in the source material. The algorithm isn’t inventing bias; it’s learning from every dismissive consultation note, every psychiatric referral for physical pain, every “it’s probably just stress” scribbled into a chart.
The Historical Baggage We’re Encoding
The problem isn’t just the data; it’s the worldview behind the data. For most of medical history, women weren’t forgotten by science; they were deliberately excluded. The NIH didn’t require women to be included in clinical trials until 1993. That’s not even a whole generation ago!
In ancient Greece, Hippocrates blamed hysteria on a “wandering womb.” In the 19th century, Silas Weir Mitchell’s infamous “rest cure” prescribed bed rest and silence for women with nervous disorders, dismissing their symptoms as emotional weakness. Even mid-20th-century clinical trials excluded women altogether, leading to generations of drugs never tested on half the population they were meant to treat.
As historian and scientist Cat Bohannon documents, the erasure of female biology from science wasn’t accidental; it was architectural. The average anatomy textbook still features male bodies as the default illustration unless the chapter concerns reproduction. This foundational bias echoes today in everything from diagnostic criteria to the algorithms that are supposed to revolutionize care.
And now, we’re training AI on this legacy.
Beyond Google: The Broader Pattern
While Google’s Gemma grabbed headlines, the problem extends far beyond one company’s model. Meta’s Llama 3 showed no significant gender bias when processing the same case notes, suggesting the issue isn’t inherent to all large language models, but is instead a matter of design choices and training data curation.
Recent research reveals the scope of the crisis:
In cardiovascular AI, diagnostic tools trained on predominantly male datasets fail to weight women’s atypical heart attack symptoms—fatigue, nausea, jaw pain—appropriately, leading to missed or delayed diagnoses.
A University of Florida study on AI diagnosis of bacterial vaginosis found that machine learning models showed significant ethnic bias, with accuracy varying dramatically between white, Black, Asian, and Hispanic women.
Research published in the Journal of Medical Internet Research found that GPT-4’s assessment of coronary artery disease risk substantially shifted when psychiatric comorbidities were added to patient vignettes, suddenly assessing women as lower risk than men with identical symptoms.
This isn’t limited to diagnosis. AI algorithms used for healthcare resource allocation have been found to underestimate the health needs of Black patients by using prior healthcare spending as a proxy, assuming more money spent equals more severe illness, while ignoring that Black patients historically receive less care.
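The proxy problem is easy to demonstrate. The toy simulation below (my own illustration, not the actual allocation algorithm) gives two groups identical underlying need but unequal historical access to care; an allocation rule built on spending then quietly under-selects the underserved group.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two groups with identical underlying health need.
group = rng.integers(0, 2, n)               # 0 = group A, 1 = group B
need = rng.normal(50, 10, n)                # true illness burden, same distribution

# Group B historically receives less care, so spends less at the same level of need.
access = np.where(group == 1, 0.6, 1.0)
spending = need * access + rng.normal(0, 5, n)

# An allocation rule that treats spending as a proxy for need:
threshold = np.quantile(spending, 0.9)      # top 10% by spending get extra support
selected = spending > threshold

print("share of group A selected:", round(selected[group == 0].mean(), 3))
print("share of group B selected:", round(selected[group == 1].mean(), 3))
# Identical true need, yet nearly all of the extra support flows to group A.
```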
The Quiet Crisis Nobody’s Talking About
While the LSE study made some waves, several developments have barely penetrated public consciousness:
The Maternal Mortality Blind Spot: The U.S. has the highest maternal mortality rate among high-income countries, with Black women dying at three times the rate of white women. While AI adoption has shown promise in reducing maternal mortality in developing countries, concerns about algorithmic bias remain largely unaddressed, particularly regarding whether AI systems trained on limited maternal health data can adequately serve diverse populations.
The Endometriosis Erasure: Women with endometriosis already face 7-10 year diagnostic delays. Now, AI systems trained on records in which this delay appears “normal” may perpetuate the pattern, learning that persistent pelvic pain warrants years of investigation rather than urgent intervention.
The Autoimmune Algorithm Gap: Autoimmune diseases disproportionately affect women—nearly 80% of patients—yet these conditions remain systematically underrepresented in AI training datasets. The machine learns what it sees, and what it sees is a medical system that has historically struggled to diagnose and treat conditions that primarily afflict women.
The Wearables Problem: Many consumer health devices—such as fitness trackers and heart rate monitors—are calibrated using male physiology. Average heart rates, body temperatures, and activity patterns: all optimized for men. The result? Potentially inaccurate health data for half the population wearing these devices.
Trust Friction and the Opacity Crisis
What makes the LSE findings particularly alarming is where they’re occurring: local councils using large language models to summarize case notes that inform eligibility for critical services. Yet no one knows which specific models are being deployed, how often, or with what oversight.
This opacity isn’t just bad governance; it’s a failure of proof. In Trust Value Management terms, these systems operate with negative trust equity: decisions affecting vulnerable populations are made using opaque models without renewable evidence of fairness or bias testing.
Most advanced AI models are functionally unexplainable “black boxes.” We know what goes in (symptoms, vitals, lab results) and what comes out (a diagnosis, a risk score), but we don’t really know how the system connects them. In healthcare, this is catastrophic.
When a 2019 hospital risk algorithm was found to underestimate Black patients’ health needs by more than half, the bias had been operating invisibly, unchallengeable because it was opaque. Millions of patients were systematically excluded from preventative programs before anyone noticed.
You can’t consent to care if you don’t understand it. You can’t trust a system you can’t argue with.
The Stakes Are Life and Death
The consequences of biased AI extend far beyond administrative convenience:
Cardiovascular Disease: Women are 50% more likely than men to be misdiagnosed during heart attacks, often because their symptoms—nausea, fatigue, jaw pain—don’t match the “male” pattern. AI systems trained predominantly on male cardiac data won’t fix this; they’ll automate it.
Cancer Screening: Black patients are far more likely to die of melanoma once diagnosed, with a 5-year survival rate of only about 70% versus 94% for white patients. AI image recognition systems trained primarily on light-skinned individuals exhibit significantly lower accuracy in detecting skin cancer in patients with darker skin.
Autism Diagnosis: For every girl diagnosed with autism, four boys receive a diagnosis. Yet research increasingly shows many girls are missed entirely due to gendered expectations around social behavior. We built diagnostic criteria around boys and then called girls “atypical.” AI trained on these biased diagnostic patterns will continue to miss neurodivergent girls. (I was 32 when I was diagnosed, and it was only recognized after I had pursued a diagnosis for my son.)
Pain Management: A 2016 study found that half of medical students believed myths about Black patients, such as the idea that they have thicker skin or feel less pain. Black patients are 22% less likely than white patients to receive pain medication in emergency rooms. AI systems trained on these prescribing patterns will perpetuate them.
What Needs to Change
AI isn’t inherently biased; it’s intrinsically reflective. That means it can reflect something better, if we dare to build it.
Researchers like Dr. Marzyeh Ghassemi at MIT are pioneering equitable AI design, building models that evaluate performance across subgroups by gender, race, age, and socioeconomic status—not just average accuracy. Dr. Fatima Rodriguez at Stanford has shown how cardiovascular risk prediction tools routinely underestimate risk in women and Black patients, work that’s driving better model design.
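Subgroup evaluation is not exotic. Here is a minimal sketch on synthetic data, invented to mimic records built around male presentations, showing how a model can post a respectable overall recall while quietly missing women:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

rng = np.random.default_rng(1)
n = 5_000

# Synthetic cohort: the "textbook" symptom signal is stronger in men,
# mimicking data and criteria built around male presentations.
sex = rng.integers(0, 2, n)                  # 0 = male, 1 = female
disease = rng.integers(0, 2, n)
signal_strength = np.where(sex == 0, 1.0, 0.4)
textbook_symptom = disease * signal_strength + rng.normal(0, 0.5, n)
X = textbook_symptom.reshape(-1, 1)

model = LogisticRegression().fit(X, disease)
pred = model.predict(X)

print("overall recall:", round(recall_score(disease, pred), 2))
for value, label in [(0, "men"), (1, "women")]:
    mask = sex == value
    print(f"recall, {label}:", round(recall_score(disease[mask], pred[mask]), 2))
# A single headline number can look fine while one subgroup is quietly missed.
```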
But individual researchers can’t fix a systemic problem. We need:
Mandatory Bias Auditing: Every AI system used in healthcare decision-making should undergo rigorous, ongoing testing for demographic bias, with the results publicly available.
Explainable AI: Patients need to see more than a risk score. They need to know what that score is based on, have the right to say “that doesn’t describe me,” and have someone listen. A minimal sketch of what that kind of transparency could look like follows this list.
Diverse Development Teams: Only 5% of active physicians in 2018 identified as Black, about 6% as Hispanic or Latinx, and representation among AI developers is even lower. We can’t build equitable systems without diverse perspectives in the room.
Representative Data: Training datasets must be intentionally inclusive, with proactive efforts to balance historical exclusions. This means adding data from women with atypical heart attack symptoms, from patients with chronic conditions that have been historically dismissed, from communities systematically underserved.
Patient Power: The right to understand, question, and override AI-driven decisions must be enshrined in healthcare policy. Transparency can’t be optional.
Cultural Change in Medicine: Medical education must teach future clinicians to recognize gender and racial bias not just as “social issues” but as clinical risks. Algorithmic literacy must be integrated into medical training.
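On the explainability point raised above, even a simple model can show what “a risk score with reasons” might look like. The sketch below uses an invented toy risk model and made-up feature names; real clinical systems would need validated inputs and far more rigorous explanation methods. The point is only that a per-feature breakdown gives a patient something concrete to contest.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy risk model on made-up features; nothing here is clinically validated.
feature_names = ["age", "systolic_bp", "prior_anxiety_dx", "chest_pain_reported"]
rng = np.random.default_rng(2)
X = rng.normal(0, 1, (2_000, 4))
y = ((X @ np.array([0.8, 0.6, -0.7, 1.0]) + rng.normal(0, 1, 2_000)) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def explain(patient: np.ndarray) -> None:
    """Show each feature's contribution to one patient's score,
    so the number arrives with reasons that can be questioned."""
    risk = model.predict_proba(patient.reshape(1, -1))[0, 1]
    contributions = model.coef_[0] * patient
    print(f"predicted risk: {risk:.2f}")
    for name, contribution in sorted(zip(feature_names, contributions),
                                     key=lambda pair: -abs(pair[1])):
        print(f"  {name:>22}: {contribution:+.2f}")

explain(X[0])
# If prior_anxiety_dx is what dragged the score down, the patient can see it
# and say: that doesn't describe me.
```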
The Revolution Will Be Human
Technology won’t rebuild trust on its own. People will.
The transformation won’t come from silicon alone. It will come from the courage to change how we define care, from re-centering the people medicine forgot, from demanding that our most powerful tools reflect our highest values rather than our worst assumptions.
The LSE findings don’t just vindicate warnings; they underline the urgency of transforming “AI fairness” from a research problem into a civic right. Women have always been told to trust the system. The system has rarely earned it.
We stand at a crossroads. AI could become the most powerful diagnostic tool medicine has ever seen, capable of spotting patterns humans miss, expanding access to underserved communities, and challenging biased clinical decisions at the point of care.
Or it could become the most efficient perpetuator of medical injustice in history, faster, more confident, and more impersonal in its dismissal than any human doctor could ever be.
No trust without proof. No proof without transparency. No transparency without accountability.
That’s not just a policy stance. It’s the architecture of a future where women’s health isn’t a statistical afterthought; it’s a trust test the system must pass every single day.
The algorithm is listening. The question is: what are we teaching it to hear?
The stakes have never been higher. The technology is already deployed. And the women it’s supposed to serve are still waiting to be believed.
Go back to the beginning:
Series Overview: “Trust, Bias, and the Algorithm: Rethinking AI in Women’s Healthcare”