2025-03-02 –, F-AVR
This session introduces PandasAI, a Python library that uses large language models to streamline data tasks, from processing and cleaning to visualization and feature creation, through a conversational interface.
You will learn how PandasAI simplifies workflows by allowing you to query your data and generate analyses without diving deep into complicated code. We will explore real-world examples, discuss best practices, and address potential challenges. By the end of this session, you will have a clear understanding of how to apply conversational data analysis to your projects, making your data work more intuitive and efficient.
Are you ready to experience the paradigm shift that generative AI brings to data analysis? Instead of writing complex analytical code, imagine interacting with your data in plain natural language.
Who Should Attend?
- Individuals using Python or SQL for data analysis who want to reduce their coding workload by leveraging generative AI
- Analysts or developers curious about querying data via natural language instead of complex SQL or Python scripts
- Beginner to intermediate practitioners looking to automate and streamline their data analysis tasks, even if they are not coding experts
Goals:
- Understand the paradigm shift from writing complex analytical code to conducting data analysis through natural language interactions
- Explore the capabilities of PandasAI in simplifying data processing, cleaning, visualization, and feature engineering
- Learn best practices for integrating PandasAI in real-world analysis tasks, ensuring efficient and reliable results
Talk Outline:
- Introduction & Context (3 min)
  - The potential of generative AI in data analysis
  - A brief overview of PandasAI
  - Setting expectations: understand the interactive workflow with data
- PandasAI Overview & Key Features (10 min)
  - Natural language querying, data visualization, and cleaning functionalities
  - Core components: SmartDataframe, SmartDatalake, and Agent
  - How PandasAI integrates LLMs into data analysis tasks
- Live Demo (9 min)
  - Demonstrating natural language queries on sample datasets and interpreting results
  - Explaining the code-generation process behind the scenes
  - A real-world use case: analyzing sales data, comparing stores, and identifying growth trends
  - Visualizing results and discussing underlying code examples
- Tips, Challenges & Best Practices (5 min)
  - Managing logs and privacy considerations
  - Ensuring reproducibility (temperature settings, seed fixing)
  - Safe usage: implementing whitelists and controlling what code can be generated
- Wrap-up & Q&A (3 min)
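To give a flavor of the "safe usage" item in the outline, here is a minimal sketch of an import whitelist check applied to LLM-generated code before it is executed. It uses only Python's standard `ast` module and is not PandasAI's actual implementation; the function name `disallowed_imports` and the contents of `ALLOWED_MODULES` are assumptions made for illustration.

```python
import ast

# Hypothetical allow-list for illustration; not PandasAI's actual default list.
ALLOWED_MODULES = frozenset({"pandas", "numpy", "matplotlib"})


def disallowed_imports(code, allowed=ALLOWED_MODULES):
    """Return the sorted top-level modules imported by `code` that are not allowed."""
    tree = ast.parse(code)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import a.b.c` -> record the top-level package "a"
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0 or node.module is None:
                # Relative imports are never on the allow-list in this sketch.
                found.add("<relative import>")
            else:
                found.add(node.module.split(".")[0])
    return sorted(found - set(allowed))


# Example: code that an LLM might generate for a data question.
generated = "import pandas as pd\nimport os\nresult = pd.DataFrame()"
print(disallowed_imports(generated))  # ['os']
```

A check like this only inspects `import` statements; it does not catch dynamic imports such as `__import__("os")` or calls to `eval`, so it would be one layer of a safer setup rather than a complete sandbox.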
Beginner
Category: Machine Learning and Artificial Intelligence
He is a Research Engineer at a telecommunications company in Japan, specializing in developing web applications for data analysis. He is actively involved in the development of “Node-AI” (https://nodeai.io/), a no-code AI development tool designed for time-series data analysis.
Data scientist