Python Conference APAC 2024

Extracting Structured Data from LLMs with LangChain and Pydantic
2024-10-27, CLASS #5
Language: English

This talk dives into the exciting realm of enriching your Large Language Model (LLM) interactions with structured data extraction. We'll explore how LangChain, in conjunction with Pydantic, empowers you to retrieve not just plain text from LLMs but also reusable Python objects like lists, dictionaries, and even pandas DataFrames.


Join me on this journey to understand and implement structured data extraction. With this powerful combination of tools, you'll learn how to craft data models using Pydantic's BaseModel for seamless integration with LangChain's output parser. Next, you'll see how to extract valuable information from LLM responses in structured formats (such as lists and data frames), enabling further analysis and manipulation. Finally, you'll build supercharged LLM applications that require structured data transformations, parsing, or integration with machine learning models.
Whether you're a data scientist, developer, or just curious about the possibilities of LLMs, this talk equips you with the skills to unleash the structured power of LLMs and build innovative applications.
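
As a rough illustration of the idea above, the sketch below defines a Pydantic model and validates a JSON string against it. It is a minimal sketch under stated assumptions: the `Talk` model, its fields, the `parse_talk` helper, and the hard-coded `raw_response` are all hypothetical, and no LLM is called. LangChain's output parser is not used here; the code only shows the validate-the-JSON step with Pydantic alone.

```python
import json
from typing import List

from pydantic import BaseModel, ValidationError


class Talk(BaseModel):
    """Hypothetical data model describing the fields we want back."""

    title: str
    speaker: str
    tags: List[str]


# Hard-coded stand-in for the text an LLM might return (an assumption;
# in a real application this string would come from the model).
raw_response = (
    '{"title": "Structured Data from LLMs", '
    '"speaker": "Example Speaker", '
    '"tags": ["langchain", "pydantic"]}'
)


def parse_talk(text: str) -> Talk:
    """Decode JSON text and validate it against the Talk model."""
    data = json.loads(text)
    return Talk(**data)


talk = parse_talk(raw_response)
print(talk.title, talk.tags)

# A response with missing fields is rejected instead of silently accepted.
try:
    parse_talk('{"title": "Incomplete"}')
except ValidationError:
    print("response did not match the Talk model")
```

The result is a typed Python object rather than free text, so downstream code can rely on `talk.tags` being a list of strings.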

  1. Introduction:
    Highlighting limitations of plain text LLM responses.
    Introducing structured data extraction from LLMs.
    Exposure to LangChain and Pydantic's power.

  2. Building the Data Model:
    Demo of defining data models with Pydantic's BaseModel.
    Exploring lists, dictionaries, and pandas DataFrames.
    Understanding data model interaction with LangChain's parser.

  3. Structured Data Extraction:
    Live examples of querying LLMs for structured data.
    Transforming data for analysis.
    Integrating data with ML models.

  4. Real-World Applications:
    Practical use cases of structured data extraction.
    Benefits and potential applications discussion.

  5. Conclusion:
    Recap of key learnings and future directions.
    Exciting possibilities ahead.
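
Items 2 and 3 of the outline (lists of records, then transformation for analysis) might look roughly like the sketch below. Everything named here is an assumption for illustration: the `Product` and `ProductList` models, the `to_rows` helper, and the hard-coded `raw_response` standing in for an LLM reply. pandas is deliberately not imported; the resulting list of dictionaries is the shape that `pandas.DataFrame(rows)` accepts.

```python
import json
from typing import List

from pydantic import BaseModel


class Product(BaseModel):
    """Hypothetical record type for one extracted item."""

    name: str
    price: float
    in_stock: bool


class ProductList(BaseModel):
    """Hypothetical wrapper model: the response is a list of products."""

    products: List[Product]


# Hard-coded stand-in for an LLM reply (an assumption; no model is called).
raw_response = """
{"products": [
  {"name": "keyboard", "price": 49.5, "in_stock": true},
  {"name": "monitor", "price": 180, "in_stock": false}
]}
"""


def to_rows(text: str) -> List[dict]:
    """Validate the JSON text, then flatten it into plain dict rows."""
    parsed = ProductList(**json.loads(text))
    return [
        {"name": p.name, "price": p.price, "in_stock": p.in_stock}
        for p in parsed.products
    ]


rows = to_rows(raw_response)

# A simple analysis step on the validated rows: total price of in-stock items.
total_in_stock = sum(row["price"] for row in rows if row["in_stock"])

print(rows)
print(total_in_stock)
```

Because each row has already passed validation, the types are consistent across records (for example, every `price` is a float), which is what makes the rows safe to hand to analysis or ML code.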

Hello, this is Kalyan from India. I started my career as a newspaper delivery boy, and through hard work and determination, I evolved into a self-taught data scientist and analytics manager. I used to lead a talented data science and analytics team at my previous company. Currently, I am a freelance Data & AI scientist!
I'm deeply passionate about open-source communities and actively contribute to them. Over time, I've established myself as a respected global speaker and influential community leader, delivering talks at prestigious conferences and educational institutions such as PyData Global, Data Science Global Summit 2022, JupyterCon, PyCon JP, PyCon India, Devfest Hyderabad, PyCon APAC, PyCon Hong Kong, PyCon ZA, Pyjamas, Conf42, Developer Conference Telangana 2021, BelPy & KLS Gogte Institute of Technology, Belagavi, Karnataka, India.
I have also worked as a reviewer and mentor for reputed conferences and hackathons including EuroPython, SciPy, PyData, PyData Seattle, JupyterCon, PyCon US, PyCon India, PyConfHyderabad, and many others. At the moment, I am assisting the EuroPython 2024 Proposal Mentorship program.
I also contribute to various open-source communities, and I enjoy being involved with them and helping them grow. Currently, I am associated with the following organizations:
NumFOCUS - Small Development Grants Review Committee
PyCon India - Conference Co-chair
PyConf Hyderabad - Conference Co-chair
KaggleX BIPOC Mentorship - Mentor
PyData Global Impact Mentoring Program - Mentor
Hyderabad Python Users Group - Core Member / Meetups Organizer
Humans for AI - Program Manager for AI Learning Community
