2019-09-02, Track 2 (Baroja)
This tutorial is an introduction to pandas for newcomers. We will cover how to open datasets, perform some analysis, apply some transformations, and visualize the data.
pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way toward this goal.
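As a rough illustration of what "labeled" data means here, the following minimal sketch builds the two core pandas structures, a Series and a DataFrame. The city names and figures are made-up illustration values, not data from the tutorial.

```python
import pandas as pd

# A Series is a one-dimensional array whose values carry labels (the index).
# The labels and numbers below are hypothetical illustration data.
population = pd.Series(
    [3.2, 1.6, 0.8],
    index=["Madrid", "Barcelona", "Valencia"],
    name="population_millions",
)

# Values are looked up by label rather than by position.
madrid = population["Madrid"]

# A DataFrame is a table of labeled columns that share one row index.
cities = pd.DataFrame(
    {"population_millions": population, "coastal": [False, True, True]}
)

# Boolean filtering keeps the row labels, so the result is still self-describing.
coastal_names = cities[cities["coastal"]].index.tolist()

print(madrid)
print(coastal_names)
```

Because every row and column has a name, selections and filters can be written in terms of the data rather than in terms of array positions.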
This tutorial will use a couple of example datasets to show what pandas can do and to give an idea of how to work with data using pandas.
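The steps named in the summary (opening a dataset, analysing it, transforming it, and visualizing it) might look roughly like the sketch below. The CSV content, column names, and values are hypothetical stand-ins, not the datasets used in the tutorial; an in-memory string replaces a real file so that the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """city,year,temperature_c
Bilbao,2018,14.5
Bilbao,2019,15.1
Sevilla,2018,19.2
Sevilla,2019,19.8
"""

# Open the dataset: read_csv accepts a path, a URL, or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Analysis: mean temperature per city.
mean_by_city = df.groupby("city")["temperature_c"].mean()

# Transformation: derive a new column from an existing one.
df["temperature_f"] = df["temperature_c"] * 9 / 5 + 32

# Visualization needs matplotlib installed, so it is left commented out here:
# mean_by_city.plot(kind="bar")

print(mean_by_city)
print(df.head())
```

In a Jupyter notebook, the plotting line can be uncommented to render a bar chart inline, assuming matplotlib is available in the environment.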
It is recommended to bring your own laptop with the latest versions of Anaconda, pandas, and Jupyter installed, and with the tutorial repository cloned. See the exact instructions here: https://github.com/datapythonista/pandas-tutorials
Introduction to pandas tutorial by Marc Garcia, pandas maintainer
Python Skill Level – basic
Domain Expertise – none
Domains – Data Visualisation, Statistics
Marc Garcia is a pandas core developer and Python fellow.
He has been working with Python for more than 12 years and has worked as a data scientist and data engineer for companies such as Bank of America, Tesco, and Badoo.
He is a regular speaker at PyData and PyCon conferences, and a regular organizer of sprints.