Introduction to pandas
09-02, 14:00–15:30 (UTC), Track 2 (Baroja)

This tutorial is an introduction to pandas for people new to it. We will cover how to open datasets, perform some analysis, apply some transformations, and visualize the data.


pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way toward this goal.

This tutorial will use a couple of example datasets to show what pandas can do and to give an idea of how to work with data using pandas.
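
As a rough preview of the workflow described above (open a dataset, analyze it, transform it, visualize it), here is a minimal sketch. It is not taken from the tutorial materials: the inline data, the column names (`store`, `month`, `sales`), and the derived `sales_share` column are all made up for illustration.

```python
import io

import pandas as pd

# Hypothetical inline data standing in for a CSV file on disk.
csv_text = """store,month,sales
A,1,100
A,2,140
B,1,80
B,2,120
"""

# Open a dataset: read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Analyze: average sales per store.
mean_sales = df.groupby("store")["sales"].mean()

# Transform: add a derived column with each row's share of total sales.
df["sales_share"] = df["sales"] / df["sales"].sum()

print(mean_sales)
print(df)

# Visualize (needs matplotlib installed, so it is left commented out here):
# mean_sales.plot(kind="bar")
```

In a Jupyter notebook the `print` calls are optional, since a DataFrame or Series on the last line of a cell is rendered automatically.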

Please bring your own laptop with the latest versions of Anaconda, pandas, and Jupyter installed, and with the tutorial repository cloned. See the exact instructions here: https://github.com/datapythonista/pandas-tutorials


Abstract as a tweet

Introduction to pandas tutorial by Marc Garcia, pandas maintainer

Python Skill Level

basic

Domain Expertise

none

Domains

Data Visualisation, Statistics

Marc Garcia is a pandas core developer and Python fellow.

He has been working in Python for more than 12 years, and has worked as a data scientist and data engineer for companies such as Bank of America, Tesco and Badoo.

He is a regular speaker at PyData and PyCon conferences, and a regular organizer of sprints.