Big AI Gains Without Big Compute

Updated: 2025.09.18 · 2 sources
A cited analysis claims GPT‑5 achieved major capability gains without the roughly 100× pretraining‑compute jumps seen across GPT‑2→GPT‑3→GPT‑4. If true, scaling laws may be loosening: architecture, data quality, and training techniques are delivering outsized improvements without proportional compute growth. This challenges timeline models and energy/planning assumptions that equate progress with massive compute ramps, implying faster‑than‑expected capability diffusion and a risk of policy miscalibration.
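For a rough sense of what "loosening" would mean, consider a Chinchilla‑style compute scaling law (an illustrative assumption; neither source specifies one), in which loss falls as a power law in training compute C:

$$ L(C) \approx L_\infty + a\,C^{-\alpha}, \qquad \alpha \approx 0.05\text{--}0.15 \text{ depending on the fit} $$

Under such a fit, a 100× compute increase shrinks the reducible loss only by a factor of $100^{\alpha} \approx 1.3\text{--}2$. Comparable capability gains arriving without that 100× ramp would therefore suggest the improvement is coming from the constants (data quality, architecture, post‑training) rather than from raising C.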

Sources

China's DeepSeek Says Its Hit AI Model Cost Just $294,000 To Train
msmash 2025.09.18 75% relevant
DeepSeek’s Nature paper claims its R1 reasoning model was trained for $294k on 512 Nvidia H800s, contrasting with prior >$100M figures cited by OpenAI; this supports the thesis that state‑of‑the‑art capability can emerge without massive compute growth.
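As a back‑of‑the‑envelope check (the $2 per H800 GPU‑hour rental rate is an assumption for illustration, not a figure taken from the paper):

$$ \$294{,}000 \div \$2/\text{GPU·hr} \approx 147{,}000\ \text{GPU·hr}, \qquad 147{,}000 \div 512 \approx 287\ \text{hr} \approx 12\ \text{days} $$

so the claimed cost corresponds to roughly two weeks of wall‑clock time on the 512‑GPU cluster, orders of magnitude below the compute implied by the >$100M figures it is contrasted with.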
Links for 2025-08-11
Alexander Kruel 2025.08.11 100% relevant
The roundup quotes @arithmoquine, who notes that GPT‑5 was likely trained with less than a 'GPT‑4.5'‑level compute scale‑up yet shows 'crazy good' performance, with the big data centers still to come.