2025-03-01, Main Hall (LH 111)
Transparency and accountability initiatives, both local and global, often rely on critical data trapped in messy, inconsistent spreadsheets, which hinders collaboration and scalability for civic actors. At the Civic Literacy Initiative (https://civicliteraci.es), we set out to tackle this challenge by helping the Extractive Industries Transparency Initiative (https://eiti.org/) unlock the potential of its summary data files, with a particular focus on data covering state-owned enterprises, such as their payments to government and other disclosures related to oil, gas and mining activities.
In this talk, we will explore how we transformed EITI’s data on state-owned enterprises from multiple spreadsheets into a fully accessible data portal and API at https://soe-database.eiti.org. Using Python as our cornerstone, we’ll walk through the end-to-end process, which includes:
- cleaning and standardizing datasets with libraries such as pandas,
- creating reproducible workflows with Jupyter notebooks,
- building reusable tools on Streamlit,
- publishing structured data with Datasette, and
- improving the pipeline, through both current and planned work.
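To give a flavour of the cleaning and publishing steps listed above, here is a minimal sketch. It is not the project's actual code: it uses only Python's standard library (`csv` and `sqlite3`) as a stand-in for pandas, and the sample rows, column names, table name `soe_payments`, and helper functions are all invented for illustration.

```python
import csv
import io
import sqlite3

# Made-up sample of a messy export: padded headers, inconsistent casing,
# thousands separators, and a row with no reported amount.
RAW_CSV = """Country , Company Name,Payment (USD)
 atlantis,Example Oil Co ,"1,200,000"
Ruritania,  Example Mining Co,350000
ruritania,Example Mining Co,
"""


def clean_rows(text):
    """Yield standardized dict rows from messy CSV text (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    # Normalize headers, e.g. "Payment (USD)" becomes "payment_usd".
    header = [
        h.strip().lower().replace(" ", "_").replace("(", "").replace(")", "")
        for h in next(reader)
    ]
    for values in reader:
        if not values:
            continue  # ignore blank lines
        row = dict(zip(header, (v.strip() for v in values)))
        amount = row["payment_usd"].replace(",", "")
        if not amount:
            continue  # skip rows with no reported payment
        yield {
            "country": row["country"].title(),
            "company": row["company_name"],
            "payment_usd": float(amount),
        }


def write_sqlite(rows, path=":memory:"):
    """Load cleaned rows into a SQLite table (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS soe_payments "
        "(country TEXT, company TEXT, payment_usd REAL)"
    )
    conn.executemany(
        "INSERT INTO soe_payments VALUES (:country, :company, :payment_usd)",
        rows,
    )
    conn.commit()
    return conn


conn = write_sqlite(clean_rows(RAW_CSV))
totals = conn.execute(
    "SELECT country, SUM(payment_usd) FROM soe_payments "
    "GROUP BY country ORDER BY country"
).fetchall()
print(totals)  # [('Atlantis', 1200000.0), ('Ruritania', 350000.0)]
```

Passing a file path such as `soe.db` instead of `:memory:` would produce a SQLite file that Datasette can serve (for example with `datasette soe.db`), which is roughly how the cleaning and publishing steps connect.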
Along the way, we will discuss the unique role in the open data ecosystem of organizations such as EITI that curate the data but are not necessarily the data owner/creator, the constraints and challenges that come with that role, and what we learned about creating user-friendly, maintainable civic data tools that can serve as the foundation for others.
Whether you’re a beginner or an experienced Pythonista, a data enthusiast, or an open data advocate, this talk offers insights that can, we hope, inspire you to build impactful solutions that unlock the power of data for the public good.
Beginner
Category: Open Source
Ben is a problem-solver who has a wealth of experience as an advocate, educator, and leader in the open data and open geospatial spaces—equally adept at developing technical solutions, leading capacity building initiatives, or developing communities of practice. He is a big believer in digital privacy and right-to-repair.
He is the proprietor of BNHR and the co-founder of the Civic Literacy Initiative and SmartCT. Find him at:
- https://bnhr.xyz
- https://fb.com/bnhr.xyz
- https://fosstodon.org/@bnhrdotxyz
- https://civicliteraci.es