Health inequity risks from large language models prompt new research and mitigation frameworks

LLMs may amplify healthcare inequities. Researchers propose EquityGuard to reduce bias in clinical AI tasks and report that GPT-4 is fairer than the other models tested, especially for underserved and diverse populations.

A new study published in npj Digital Medicine warns that large language models (LLMs) such as GPT-4 may inadvertently reinforce healthcare inequities when clinical inputs include socio-demographic factors, such as race, sex, and income, that are not decisive for the task.

Researchers introduced EquityGuard, a contrastive learning framework that detects and mitigates bias in medical applications such as Clinical Trial Matching (CTM) and Medical Question Answering (MQA).

Evaluations show GPT-4 demonstrates greater fairness across diverse groups, while other models like Gemini and Claude show notable disparities. EquityGuard improves equity in outputs and is particularly promising for use in low-resource settings where fairness is most critical.
