2025-12-08 – Abigail Adams
In this 90-minute tutorial we'll get anyone with basic Python and command-line skills up and running with their own 100% laptop-based set of LLMs, and explain some successful patterns for leveraging LLMs in a data analysis environment. We'll also highlight pitfalls waiting to catch you out, and, by demonstrating the limits of LLMs for data analysis tasks, reassure you that your pre-GenAI analytics skills are still relevant today and likely will be for the foreseeable future.
OVERVIEW
Are we stuck in a one-size-fits-all world of LLMs, destined to burn tokens forever via API-key access to third-party LLM services? There are good reasons to keep using third-party LLM services, but many data practitioners want and need a degree of control, autonomy, and data sovereignty. That means diving into the world of self-managed LLMs, and a natural starting point is the Ollama framework for local model management together with the Hugging Face model repository.
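To make that concrete, here is a minimal sketch of the Ollama workflow using the ollama Python library, assuming the Ollama server is installed and running locally; the model name is illustrative, not necessarily one we'll use in the tutorial:

    import ollama  # pip install ollama; talks to the local Ollama server

    # Download a model once; it is cached locally and reused thereafter.
    ollama.pull("llama3.2")  # illustrative model name

    # Chat with the model entirely on your own machine:
    # no API key, no tokens billed, no data leaving your laptop.
    response = ollama.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "In one sentence, what is data sovereignty?"}],
    )
    print(response["message"]["content"])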
ADVANCE SETUP REQUIRED
Please do this BEFORE coming to the conference -- the event WiFi and your fellow conference/tutorial attendees will not thank you for trying to download 10GB of LLMs.
Follow the advance setup process described here:
https://github.com/ijstokes/tutorial-local-llm
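If you'd rather pre-fetch models from Python than from the shell, something like the sketch below works once Ollama itself is installed. The model names here are placeholders; the setup repo above lists the actual models used in the tutorial:

    import ollama

    # Pre-download models over a good connection; each pull is cached
    # under ~/.ollama so the tutorial can run fully offline.
    for model in ["llama3.2", "qwen2.5:3b"]:  # placeholder names
        ollama.pull(model)

    # Verify what is now available locally.
    for m in ollama.list().models:
        print(m.model)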
OUTCOMES
This tutorial will get you up and running with basic local LLMs, and give you experience and insight into the following:
- Mechanics of using Ollama to fetch and run models from the CLI and API
- Understanding trade-offs of self-managed vs SaaS-hosted LLMs
- Picking a fit-for-purpose LLM from the Hugging Face model repository
- Creating customized derivative LLMs that are better targeted to your own purposes
- Basics of configuring LLM interactions for your desired performance criteria
- Understanding mechanisms for multi-modal input and structured output, beyond plain text in and plain text out (see the first sketch after this list)
- Use of several Python libraries for interacting with LLMs, including ollama and openai (see the second sketch after this list)
- Strategies for using LLMs for data analysis tasks
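As a first sketch tying several of these outcomes together, the following combines structured output with basic performance configuration in a small data-analysis setting. This assumes a recent Ollama release with structured-output support plus the pydantic library; the model name, schema, and column stats are all illustrative:

    import ollama
    from pydantic import BaseModel

    # Illustrative schema: ask the model for typed JSON instead of free text,
    # so downstream analysis code can consume the result directly.
    class ColumnIssue(BaseModel):
        column: str
        issue: str

    class QualityReport(BaseModel):
        findings: list[ColumnIssue]

    stats = "age: min=-3 max=122 nulls=14; income: min=0 max=9.9e12 nulls=0"

    response = ollama.chat(
        model="llama3.2",  # placeholder; any model you have pulled locally
        messages=[{
            "role": "user",
            "content": f"Flag data-quality issues in these column stats: {stats}",
        }],
        format=QualityReport.model_json_schema(),  # constrain output to the schema
        options={"temperature": 0},  # favor consistency over creativity
    )
    report = QualityReport.model_validate_json(response["message"]["content"])
    print(report.findings)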
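The second sketch shows the openai library driving the same local models: Ollama exposes an OpenAI-compatible endpoint (http://localhost:11434/v1 by default), so switching between hosted and self-hosted backends is little more than a change of base URL:

    from openai import OpenAI

    # Point the standard OpenAI client at the local Ollama server.
    client = OpenAI(
        base_url="http://localhost:11434/v1",  # Ollama's default endpoint
        api_key="ollama",  # required by the client, ignored by Ollama
    )

    completion = client.chat.completions.create(
        model="llama3.2",  # placeholder: any model you have pulled locally
        messages=[{"role": "user",
                   "content": "Name one pitfall of using LLMs for data analysis."}],
    )
    print(completion.choices[0].message.content)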
PRE-REQUISITES
- Some Python experience
- Some command line experience (ideally Bash/Unix shell)
- Some Git experience
- Some Jupyter experience
- Laptop with at least 20GB of free storage (you may be able to get away with 10GB)
- Laptop with at least 8GB of RAM (this will be tight; 16GB or more is ideal)
AGENDA
- Welcome, concept overview and target tutorial outcomes (5 minutes)
- Local environment setup (5 minutes, with background package downloading)
- Why would you want to run your LLMs locally? Is it even possible? How do AI Appliances fit in? (10 minutes)
- Navigating Hugging Face and finding the LLM of your dreams (5 minutes + 10-minute exercise)
- Using Ollama from the GUI and from the CLI (5 minutes + 10-minute exercise)
- Using the Ollama Python library and leveraging it from code (5 minutes + 10-minute exercise)
- Offloading your LLM to an AI Appliance (5 minutes + 5-minute exercise)
- Next steps for AI Autonomy, Data Sovereignty, and Advanced Analytics in the age of GenAI and LLMs (5 minutes)
- Q&A (10 minutes)
FURTHER DETAILS
This tutorial will get you comfortable navigating the Hugging Face model repository and using the Ollama ecosystem for local model management and hosting; you'll learn how, together, these are the GitHub and Git of the LLM world. While we may not be able to get those LLMs behaving in a repeatable, deterministic way, at least you'll have the tools to choose exactly which LLMs you're using, and the ability to retain data sovereignty by self-hosting LLMs as part of your overall workflow. You'll leave with an expanded set of options for leveraging LLMs running on your laptop, on your own cloud server, or on affordable AI Appliances such as the Nvidia DGX Spark or Nvidia Jetson Orin Nano Super.
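On the repeatability point: pinning the sampling parameters gets you close to deterministic behavior on a given machine, model, and version, though it is not a bit-for-bit guarantee across hardware or releases. A minimal sketch, again with an illustrative model name:

    import ollama

    # A fixed seed plus zero temperature makes repeated runs mostly identical
    # on the same setup; treat it as "repeatable enough", not guaranteed.
    for _ in range(2):
        response = ollama.generate(
            model="llama3.2",  # placeholder model name
            prompt="List three uses of a local LLM in analytics.",
            options={"temperature": 0, "seed": 42},
        )
        print(response["response"])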
Ian is a Computational Scientist and Software Engineer. His current role is "Partner in AI Engineering" with BCG, a global management consulting firm. He works with BCG's clients around the world to identify opportunities to combine data, technology, and analytics to create step-change capabilities in their organizations. What does that translate to? On a day-to-day basis it means leading crack teams of BCG engineers and data scientists in the development of AI-driven and (typically) Python-based bespoke solutions that leverage the best tools, technology, and techniques available.
Prior to BCG, Ian was a Product Manager at Anaconda, and he has been in the Python community for over 20 years. Ian has a PhD from Oxford, where he worked on the CERN LHCb experiment and developed the Python-based distributed computing middleware that managed 10 million queued physics jobs, scheduling them across a quarter million servers in a globally federated compute environment. He also spent several years at Harvard Medical School collaborating with biophysicists on novel techniques for protein structure discovery.
Ian is a member of the Python Software Foundation and the Open Source Initiative. In his free time he enjoys sailing, cycling, XC skiing, and motorbiking. On rainy days he'll pull out a board game to play with his wife & kids: Ark Nova, Ticket to Ride, and Takenoko are current favorites.