PyData Boston 2025

Surviving the Agentic Hype with Small Language Models
2025-12-10, Thomas Paul

The AI landscape is abuzz with talk of "agentic intelligence" and "autonomous reasoning." But beneath the hype, a quieter revolution is underway: Small Language Models (SLMs) are starting to perform the core reasoning and orchestration tasks once thought to require massive LLMs. In this talk, we’ll demystify the current state of “AI agents,” show how compact models like Phi-2, xLAM 8B, and Nemotron-H 9B can plan, reason, and call tools effectively, and demonstrate how you can deploy them on consumer-grade hardware. Using Python and lightweight frameworks such as LangChain, we’ll show how anyone can quickly build and experiment with their own local agentic systems. Attendees will leave with a grounded understanding of agent architectures, SLM capabilities, and a roadmap for running useful agents without the GPU farm.


“Agentic AI” has quickly become one of the most overused phrases in tech. Every week brings new claims about autonomous systems, but most boil down to the same fundamental idea: an agent is a system that uses a language model to plan and execute a task. It breaks the task into steps, uses tools to accomplish them, and writes code, usually Python, to bridge the gaps.
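To make that definition concrete, here is a schematic, framework-free version of the loop. The `fake_llm` stand-in and the `word_count` tool are illustrative stubs, not any real model or library API:

```python
# Schematic agent loop: plan with a model, act with tools, observe, repeat.
# Everything here is an illustrative stand-in (a fake "llm" and one toy tool).

def fake_llm(prompt: str) -> dict:
    """Stand-in for a language model that returns a structured 'next step'."""
    if "result of word_count" in prompt:
        return {"final_answer": "The text has 3 words."}
    return {"tool": "word_count", "args": {"text": "hello agentic world"}}

TOOLS = {"word_count": lambda text: len(text.split())}

def run_agent(llm, tools, task: str, max_steps: int = 5) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        step = llm("\n".join(history))                       # 1. plan the next step
        if "final_answer" in step:                           # model says it is done
            return step["final_answer"]
        obs = tools[step["tool"]](**step["args"])            # 2. act with a tool
        history.append(f"result of {step['tool']}: {obs}")   # 3. observe the result
    return "stopped after max_steps"

print(run_agent(fake_llm, TOOLS, "Count the words in the text"))
```

Real agents replace the stubs with an actual model and real tools, but the control flow stays this small.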

Since chain-of-thought models were first introduced a few years ago, these abilities have been assumed to require large, general-purpose language models with tens or hundreds of billions of parameters. But recent studies from NVIDIA and Georgia Tech challenge that assumption. Their findings show that compact language models like xLAM 8B, Phi-2 (2.7B), and Nemotron-H 9B can reason, orchestrate tools, and generate code nearly as effectively as their 10x-larger cousins, while being small enough to run on consumer-grade hardware.

Agents aren’t just for huge research labs anymore. Using Python and frameworks like LangChain or LlamaIndex, developers can now build, test, and iterate on agent architectures on their local machines. With just a few lines of Python, you can connect a small model to external tools (APIs, databases, scripts), watch its reasoning steps, and experiment with different prompting strategies.
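As a sense of what those "few lines of Python" look like, here is a minimal sketch of a single tool-calling round with LangChain and a model served locally via Ollama. The model name, the toy `row_count` tool, and the assumption that the chosen model supports tool calling are all illustrative, not a prescription:

```python
# Minimal sketch: one tool-calling round with a locally served SLM.
# Assumes Ollama is running locally and the langchain-ollama package is
# installed; the model name below is illustrative and must be a model
# that supports tool calling.
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from langchain_ollama import ChatOllama


@tool
def row_count(table: str) -> int:
    """Return the number of rows in a (toy, in-memory) table."""
    fake_tables = {"sales": 1204, "users": 87}
    return fake_tables.get(table, 0)


llm = ChatOllama(model="phi3", temperature=0)   # illustrative local SLM
llm_with_tools = llm.bind_tools([row_count])    # expose the tool to the model

messages = [HumanMessage("How many rows are in the sales table?")]
ai_msg = llm_with_tools.invoke(messages)        # model may request a tool call
messages.append(ai_msg)

# Execute each requested tool call, feed the results back, and get an answer.
for call in ai_msg.tool_calls:
    result = row_count.invoke(call["args"])
    messages.append(ToolMessage(str(result), tool_call_id=call["id"]))

print(llm_with_tools.invoke(messages).content)
```

Swapping in a different model or tool is a one-line change, which is what makes local iteration on agent architectures so cheap.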

This talk explores what this shift means for the future of agents. We'll walk through:
- The three pillars of agents: planning, tool use, and code generation.
- How Small Language Models (SLMs), combined with Python orchestration frameworks, enable local experimentation.
- A live demo showing a small model performing real-world agentic reasoning on a standard laptop, using LangChain to chain together reasoning and tool calls.
- Practical trade-offs between accuracy, speed, and hardware resources.

We'll conclude with a look ahead at:
- Opportunities for on-premise AI and privacy-first reasoning.
- How developers can use these accessible tools to stay relevant in the evolving AI landscape.

Attendees will leave with a practical understanding of how to build, evaluate, and deploy small-model agents using Python and LangChain, along with the skepticism needed to navigate the ongoing hype cycle.

===

Bullet-Point Outline

0–5 min: The Agentic Buzz - What’s Real, What’s Marketing
- The explosion of “agentic” frameworks and the confusion it causes
- What an agent really is at its core: planning, acting, and reasoning

5–15 min: Anatomy of an Agent
- The three basic functions: task decomposition, tool use, and code synthesis
- How frameworks like LangChain and Python make it easy to chain these together

15–25 min: Why Small Models Are Catching Up
- Review of research from NVIDIA and Georgia Tech
- Benchmarks showing SLMs matching or exceeding performance of larger LLMs
- Cost, latency, and deployability trade-offs

25–35 min: Hands-On Demo: Building and Running an Agent on a Laptop
- Using LangChain and Python to orchestrate reasoning, tool calls, and code execution
- Example workflow: “Plan a dataset cleanup pipeline” using an SLM (a rough planning sketch follows this outline block)
- Observing resource use, latency, and performance in real time
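
For orientation, here is a rough sketch of the planning step of that demo: asking a locally served SLM for a numbered cleanup plan before any tools or code execution are wired in. The model name and prompt wording are illustrative assumptions:

```python
# Rough sketch of the demo's planning step: ask a local SLM for a numbered
# plan. Assumes Ollama is running locally with langchain-ollama installed;
# the model name is illustrative.
from langchain_ollama import ChatOllama

llm = ChatOllama(model="phi3", temperature=0)

prompt = (
    "You are a data engineer. Write a numbered, step-by-step plan to clean a "
    "CSV of customer orders: handle missing values, normalize date formats, "
    "deduplicate rows, and validate the result. One line per step."
)

plan = llm.invoke(prompt).content   # the plan the agent would then execute
print(plan)
```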

35–40 min: Key Takeaways and Open Research Directions
- Opportunities for local and edge deployments
- The emerging role of SLMs in allowing everyone to experiment with agents
- Future questions: scaling reasoning vs. scaling models


Prior Knowledge Expected: Previous knowledge expected
