Evaluating the ability of large language models to predict human social decisions

Two studies compared GPT-3.5, GPT-4, and GPT-4o against human decisions across social scenarios, revealing that the LLMs differ from humans in risk framing and social sensitivity and often misalign with human patterns.

Researchers evaluated GPT-3.5, GPT-4, and GPT-4o on their ability to predict human social decisions across 51 scenarios (9,600 responses) and additional social-group contexts (1,600 responses).

Results showed notable discrepancies: LLMs were less sensitive to kinship and group size and displayed risk preferences that differed from human patterns; for example, GPT-4 was consistently risk-averse and framed decisions in ways humans do not.

These findings highlight both the predictive power and limitations of LLMs in modeling human social behavior.
