A new Remote Labor Index test from Scale AI and the Center for AI Safety gave hundreds of real, paid freelance tasks to leading AI systems and found that the best model fully completed only about 2.5% of assignments, with roughly half of the outputs either poor in quality or left incomplete. Failures included corrupt outputs, mishandled visuals, missing data, and brittle memory: concrete limits on current automation capacity.
If replicated, these findings should temper near‑term job‑elimination narratives, redirect policy toward augmentation, verification standards, and targeted retraining, and shape who bears liability when AI is deployed on real economic tasks.
EditorDavid
2026.01.10
Source: Remote Labor Index study reported in the Washington Post; the models tested (ChatGPT, Gemini, Claude) succeeded on roughly 2.5% of real freelancing gigs, with failures including corrupt files, missing data, and visual errors.