Gendered Moral Distortions in LLMs

Large language models can display inconsistent moral priorities tied to gendered framing (for example, judging harassment of women as less permissible than more severe harms like torture), indicating they’re generalizing discourse patterns rather than reasoning about harm. This pattern appears linked to the models’ training on public debates about gender equality, producing systematic but counterintuitive outputs. — If true, these distortions matter for AI deployment in ethics‑sensitive domains (law, policing, content moderation) because models may amplify or invert social justice narratives unpredictably.

Sources

AIs Are Dumb and Sexist

Steve Stewart-Williams 2026.03.31 100% relevant

Fulgu and Capraro (2026) experiments reported that GPT‑4 judged abusing a man to avert apocalypse more acceptable than abusing a woman, and answered inconsistently on torture versus harassment scenarios.