A cited analysis claims GPT‑5 achieved major capability gains with a far smaller increase in pretraining compute than the roughly 100× jumps seen from GPT‑2 to GPT‑3 to GPT‑4. If true, scaling laws may be loosening: architecture, data, and training techniques are delivering outsized improvements without proportional compute growth.
This challenges timeline models and energy/planning assumptions that equate progress with massive compute ramps, and it implies faster‑than‑expected capability diffusion and a risk of policy miscalibration.
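For context, a minimal sketch of the assumption being questioned (the symbols and exponent are illustrative, not taken from the cited analysis): classic pretraining scaling laws model loss as a power law in training compute,

$$ L(C) \approx \left(\frac{C_0}{C}\right)^{\alpha}, \qquad 0 < \alpha \ll 1, $$

so halving the loss requires multiplying compute by a factor of $2^{1/\alpha}$, i.e. orders of magnitude rather than a fixed increment. The thesis above is that recent gains are arriving without that multiplicative growth.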
msmash
2025.09.18
75% relevant
DeepSeek’s Nature paper claims its R1 reasoning model was trained for about $294K on 512 Nvidia H800s, a figure covering the reasoning‑training run on top of an existing base model, contrasting with the >$100M training costs OpenAI has cited for its frontier models; this supports the thesis that state‑of‑the‑art capability can emerge without massive compute growth.
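A rough back‑of‑the‑envelope check on what that budget implies (the ~$2 per H800 GPU‑hour rental rate is an assumption for illustration, not a figure from the paper):

$$ \frac{\$294{,}000}{\$2/\text{GPU-hour}} \approx 147{,}000\ \text{GPU-hours}, \qquad \frac{147{,}000\ \text{GPU-hours}}{512\ \text{GPUs}} \approx 287\ \text{hours} \approx 12\ \text{days}. $$

Even under generous rate assumptions, this is a small fraction of the compute implied by nine‑figure training budgets.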
Alexander Kruel
2025.08.11
100% relevant
The roundup quotes @arithmoquine, who notes that GPT‑5 likely used less than a 'GPT‑4.5'‑level compute scale‑up yet delivers 'crazy good' performance, with the big data centers still to come.