Dynamicum [Ground Floor]
You’ve likely used a tool like black, flake8, or ruff to lint or format your code, or a tool like sphinx to document it, but you probably do not know how they accomplish their tasks. These tools and many more use Abstract Syntax Trees (ASTs) to analyze and extract information from Python code. An AST is a representation of your code's structure that enables you to access and manipulate its different components, which is what makes it possible to automate tasks like code migrations, linting, and docstring extraction.
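As a rough illustration of what "a representation of your code's structure" means, the following minimal sketch parses a one-line snippet with the standard library's ast module and inspects the resulting tree. The snippet being parsed and the variable names are made up for this example, and the indent argument to ast.dump() assumes Python 3.9 or later.

```python
import ast

# A made-up one-line program to parse (illustrative only)
source = "total = price * 2"

# ast.parse() turns source text into a tree rooted at an ast.Module node
tree = ast.parse(source)

# The module body is a list of statements; here it holds a single assignment
assign = tree.body[0]
print(type(assign).__name__)        # name of the statement's node class
print(type(assign.value).__name__)  # name of the right-hand side's node class

# ast.dump() renders the whole tree as text, which is handy for exploration
print(ast.dump(tree, indent=4))
```

Each component of the statement (the assignment target, the multiplication, the constant) shows up as its own node, which is what lets tools reason about code without resorting to string matching.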
In this workshop, you’ll learn how to use the Python standard library’s ast module to parse and analyze code. Using just the standard library, we will implement a couple of common checks from scratch, which will give you an idea of how these tools work and help you build the skills and confidence to use ASTs in your own projects.
This tutorial will be a roughly 50/50 split of lecture and exercises. Attendees will get hands-on experience working with ASTs in Python, using only the standard library. By recreating common code-quality checks from scratch, attendees will learn both how common tools work under the hood and how to work with the AST in an easy-to-understand fashion.
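To give a flavor of what such a check might look like, here is a minimal sketch of a missing-docstring finder built only on the standard library. It is not the workshop's own solution: the function name find_missing_docstrings and the sample SOURCE string are hypothetical, chosen for illustration.

```python
import ast

# Hypothetical sample code to check (not taken from the workshop materials)
SOURCE = '''
def documented():
    """Say hello."""
    return "hello"

def undocumented():
    return "hi"

class Widget:
    def method(self):
        pass
'''

def find_missing_docstrings(source):
    """Return sorted (line number, name) pairs for definitions lacking a docstring."""
    tree = ast.parse(source)
    missing = []
    # ast.walk() yields every node in the tree, in no guaranteed order
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # ast.get_docstring() returns None when no docstring is present
            if ast.get_docstring(node) is None:
                missing.append((node.lineno, node.name))
    return sorted(missing)

missing = find_missing_docstrings(SOURCE)
for lineno, name in missing:
    print(f"line {lineno}: {name} is missing a docstring")
```

Sorting the results keeps the report in source order regardless of how ast.walk() happens to traverse the tree.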
Topics covered:
- Introduction to the term and concept of Abstract Syntax Trees (ASTs)
- Some of the ways ASTs are used by Python itself and by popular tools
- Parsing code into an AST and inspecting it
- Walking the tree: ast.iter_fields(), ast.iter_child_nodes(), ast.walk()
- Modifying the code before running it
- Converting an AST into source code again with ast.unparse() and its caveats
- Finding missing docstrings
- ast.NodeVisitor and ast.NodeTransformer
- The generic_visit() method: what it does and why we need it, illustrated with animation
- Four exercise breaks spread throughout, accounting for ~45 minutes in total
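Several of the topics above (ast.NodeVisitor, ast.NodeTransformer, generic_visit(), and ast.unparse()) can be sketched together in a few lines. This is an illustrative example under stated assumptions, not the workshop's material: the class names NameCollector and AddToSubtract and the sample source are hypothetical, and ast.unparse() requires Python 3.9 or later.

```python
import ast

# Hypothetical sample code (illustrative only)
source = """
class Outer:
    def inner(self):
        return 1 + 2
"""

class NameCollector(ast.NodeVisitor):
    """Record the names of classes and functions, including nested ones."""

    def __init__(self):
        self.names = []

    def visit_ClassDef(self, node):
        self.names.append(node.name)
        # Without generic_visit(), traversal would stop at this node
        # and the nested function below it would never be visited.
        self.generic_visit(node)

    def visit_FunctionDef(self, node):
        self.names.append(node.name)
        self.generic_visit(node)

class AddToSubtract(ast.NodeTransformer):
    """Toy transformer that rewrites every a + b into a - b."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node  # the returned node replaces the original in the tree

tree = ast.parse(source)

collector = NameCollector()
collector.visit(tree)
names = collector.names

# Modify the tree, then turn it back into source code
new_tree = ast.fix_missing_locations(AddToSubtract().visit(tree))
rewritten = ast.unparse(new_tree)
print(rewritten)

# One caveat of ast.unparse(): comments and original formatting are not
# part of the AST, so they do not survive the round trip.
round_trip = ast.unparse(ast.parse("x = 1  # a comment"))
```

Calling generic_visit() inside each visit_* method is what keeps the traversal going into child nodes; omitting it in visit_ClassDef would leave inner out of the collected names.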
Stefanie Molin is a software engineer at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also a core developer of numpydoc and the author of “Hands-On Data Analysis with Pandas: A Python data science handbook for data collection, wrangling, analysis, and visualization,” which is currently in its second edition and has been translated into Korean and Chinese. She holds a bachelor’s of science degree in operations research from Columbia University's Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.