Have you ever wondered how machines “think” or “reason”? AI models, particularly large language models (LLMs), have become increasingly adept at complex tasks, and so-called “thinking models” take this a step further by constructing explicit reasoning pathways that offer deeper insight into their decision-making. In the recent Google for Developers video “How do thinking and reasoning models work?”, Nikita Namjoshi explores how scaling laws and test-time compute improve the reasoning capabilities of these models.

The video opens with scaling laws: more data, more compute, and more parameters reliably improve LLM performance, a relationship that has held throughout the development of transformer models. It then examines strategies like test-time compute and “chain of thought” prompting, which let a model tackle harder problems by spending additional output tokens on intermediate steps. By encouraging the model to show its work, much as a student would when solving a problem, chain-of-thought prompting leads to more accurate answers. Namjoshi demonstrates this with a math problem in which chaining the reasoning steps leads the model to the correct result, showing how generating intermediate tokens improves reasoning. (Illustrative sketches of each of these techniques follow this summary.)

The video then discusses a post-training method based on reinforcement learning, which further strengthens the model’s reasoning ability. Here the model is trained against verifiable rewards: reasoning paths that lead to checkably correct outcomes are reinforced. This process, often combined with supervised fine-tuning, adapts the model to specific tasks beyond general next-token prediction.

Finally, Namjoshi discusses strategies like best-of-n sampling, which generates multiple candidate responses and keeps the best one as a basis for a more accurate answer. Although spending more test-time compute holds clear potential, its usefulness varies by task. The idea resembles making several attempts at a single problem, akin to repeated trial-and-error in baking that sometimes hits the recipe’s sweet spot.

As AI continues to evolve, thinking models combine extended test-time compute with adaptive post-training, shaping them into systems capable of more nuanced reasoning. To explore further, the video provides resources like Gemini’s thinking tutorials and research papers. Namjoshi’s overview in “How do thinking and reasoning models work?” invites insights into a future where AI is not just reactive but proactive, with complexity approaching human reasoning. While the video excels at illustrating these advances, a deeper examination of challenges like resource optimization would enrich our understanding of sustainable AI development.
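To make the scaling-laws point concrete, here is a hedged sketch of the power-law form such laws typically take (after Kaplan et al., 2020): held-out loss L falls smoothly as parameter count N, dataset size D, or compute C grows, with fitted constants and exponents. The video does not give these formulas; this is just the standard statement of the idea.

```latex
% Power-law scaling (Kaplan et al., 2020): loss decreases predictably
% as parameters N, dataset size D, or compute C increase. N_c, D_c,
% C_c and the alpha exponents are empirically fitted constants.
L(N) \approx \left(\tfrac{N_c}{N}\right)^{\alpha_N}, \quad
L(D) \approx \left(\tfrac{D_c}{D}\right)^{\alpha_D}, \quad
L(C) \approx \left(\tfrac{C_c}{C}\right)^{\alpha_C}
```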
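A minimal sketch of chain-of-thought prompting follows, assuming a hypothetical `llm_generate(prompt) -> str` call standing in for any LLM API (the video demonstrates the idea with Gemini). The question and the answer format are illustrative, not the video’s exact example.

```python
# Chain-of-thought vs. direct prompting. `llm_generate` is a
# hypothetical placeholder, not a specific library's API.

def llm_generate(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    raise NotImplementedError

question = "A train travels 60 km in 45 minutes. What is its speed in km/h?"

# Direct prompting: the model must produce the answer in one step.
direct_prompt = f"{question}\nAnswer:"

# Chain-of-thought prompting: ask the model to show its work, spending
# extra output tokens on intermediate reasoning before the final answer.
cot_prompt = (
    f"{question}\n"
    "Let's think step by step, and finish with a line of the form "
    "'Answer: <value>'."
)
```

The extra tokens the model emits before the `Answer:` line are exactly the “showing work” the video describes: each intermediate step conditions the next, making the final answer more likely to be correct.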
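For the reinforcement-learning step, here is a hedged sketch of how a verifiable reward can be assigned to a sampled response. The policy-gradient update itself is elided; `extract_final_answer` is a hypothetical parser for the `Answer:` line used above, not anything named in the video.

```python
# Reinforcement learning with verifiable rewards, reward side only:
# reasoning traces whose final answer checks out get reward 1,
# everything else gets 0.

def extract_final_answer(response: str) -> str:
    """Hypothetical parser: return the text after the last 'Answer:'."""
    return response.rsplit("Answer:", 1)[-1].strip()

def verifiable_reward(response: str, reference_answer: str) -> float:
    # Binary, automatically checkable signal: reasoning paths that end
    # in the correct answer are the ones that get reinforced.
    return 1.0 if extract_final_answer(response) == reference_answer else 0.0
```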
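Finally, a minimal sketch of best-of-n sampling, reusing the hypothetical `llm_generate` above and assuming a hypothetical `score(candidate) -> float`. For math problems the scorer might check an extracted answer against a verifier; more generally it could be a learned reward model.

```python
# Best-of-n sampling: draw n independent candidates, keep the one the
# scoring function prefers. `score` is a placeholder, not a real API.

def score(candidate: str) -> float:
    """Placeholder scoring function; higher is better."""
    raise NotImplementedError

def best_of_n(prompt: str, n: int = 8) -> str:
    candidates = [llm_generate(prompt) for _ in range(n)]
    return max(candidates, key=score)
```

This is the “several attempts at one problem” idea from the video in code form: each sample is one trial, and the scorer decides which attempt hit the sweet spot.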

Channel: Google for Developers
Published: December 13, 2025
Topic: Gemini thinking
Duration: 13:26