30 April 2025 12:10 - 12:35
Scaling paradigms for large language models
In this talk, Jason will describe how scaling has been the engine of progress in AI for the past five years.
In the first scaling paradigm, our field scaled large language models by pre-training on more data with more compute. This scaling led to the success of ChatGPT and other AI chatbots, which proved surprisingly capable and general-purpose.
With the release of OpenAI o1, we are at the beginning of a new paradigm in which we scale not just training-time compute but also test-time compute. These new models are trained via reinforcement learning on chain-of-thought reasoning, and by thinking longer on more challenging tasks they can solve even competition-level math and programming problems.
Jason will conclude with a few remarks on how AI research culture has changed and where the field might go next.