PyCon DE & PyData 2025

Machine Reasoning and System 2 Thinking
2025-04-24 , Palladium

Raw large language models struggle with complex reasoning. New techniques have
emerged that allow these models to spend more time thinking before giving an answer.
Direct token sampling can be seen as system-1 thinking and explicit step-by-step
reasoning as system-2. How can this reasoning ability be improved, and what does the future hold?


Basic large language models struggle with complex reasoning. New techniques, broadly referred to as "test time compute", have emerged that allow these models to spend more time processing before giving an answer. Direct token sampling can be seen as analogous to system-1 thinking and explicit step-by-step reasoning as system-2. Many top AI researchers and companies are now working on building system-2 into AI systems to improve general reasoning.
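To make that distinction concrete, here is a minimal Python sketch of the two modes. It assumes a hypothetical llm(prompt) text-completion callable (not any specific library's API): the system-1 variant samples an answer in a single pass, while the system-2 variant spends extra inference-time tokens on explicit step-by-step reasoning before committing to an answer.

def system1_answer(llm, question: str) -> str:
    # "System-1": sample the answer directly in a single pass.
    return llm(f"Question: {question}\nAnswer:")

def system2_answer(llm, question: str) -> str:
    # "System-2": spend extra inference-time tokens on explicit step-by-step
    # reasoning (chain-of-thought style) before giving the final answer.
    reasoning = llm(
        f"Question: {question}\n"
        "Think through the problem step by step.\n"
        "Reasoning:"
    )
    return llm(
        f"Question: {question}\n"
        f"Reasoning: {reasoning}\n"
        "Final answer:"
    )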

We will review the newest open research on test time compute, including promising techniques that have appeared in top entries for François Chollet's ARC-AGI challenge. While OpenAI has shamefully kept the research behind its o1, o3 and o-N models secret, other researchers have worked in public, demonstrating how test time compute, combined with the right fine-tuning and test time procedures, can greatly boost model performance.
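One widely published test-time compute recipe of this kind is self-consistency: sample several independent reasoning traces and take a majority vote over their final answers. The sketch below is an illustration of that idea only, not the procedure of any particular ARC-AGI entry; it reuses the hypothetical llm(prompt) helper from above and a deliberately naive answer parser.

from collections import Counter

def extract_final_answer(trace: str) -> str:
    # Naive parse: treat the last non-empty line of the trace as the answer.
    lines = [line.strip() for line in trace.splitlines() if line.strip()]
    return lines[-1] if lines else ""

def self_consistency(llm, question: str, n_samples: int = 16) -> str:
    # Sample several independent step-by-step solutions and return the
    # most common final answer (majority voting over reasoning traces).
    answers = []
    for _ in range(n_samples):
        trace = llm(
            f"Question: {question}\n"
            "Reason step by step, then state the final answer on the last line."
        )
        answers.append(extract_final_answer(trace))
    answer, _count = Counter(answers).most_common(1)[0]
    return answer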

This talk will explore the latest developments in the fast-moving area of system-2 AI reasoning, the engine behind the only significant recent gains in LLM performance. Giving LLMs system-2-like capabilities improves problem solving and code generation quality and reduces hallucinations. Get up to speed on the research behind these techniques.


Expected audience expertise: Domain:

None

Expected audience expertise: Python:

None

Andy Kitchen is an AI/neuroscience researcher, startup founder, and all-around hacker. He co-founded Cortical Labs, where the team taught live brain cells to play Pong. He's still trying to figure out how to catch the ghost in the machine.