2025-11-02, One and Only
Wikidata has the potential to be a global record of political leadership, but keeping this information current, verifiable, and globally inclusive is a challenge. PoliLoom offers a pipeline that pairs AI-assisted extraction with human oversight to address this need.
PoliLoom is an experiment in “verification-first AI” for Wikidata. The project extracts candidate statements about politicians from Wikipedia using large language models, reconciles them to Wikidata items with similarity search, and presents each claim with archived source text and highlighted proof lines. Through a review interface tied to MediaWiki OAuth, contributors approve or reject claims with clear evidence.
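The reconciliation step can be illustrated with a minimal sketch. It assumes nothing about PoliLoom's actual models or schema: `trigram_vector` is a toy stand-in for a real embedding model, `reconcile` is a hypothetical helper, and the item IDs are placeholders rather than real Wikidata QIDs. It only shows the general idea of matching an extracted mention to the closest candidate label by cosine similarity.

```python
from collections import Counter
from math import sqrt


def trigram_vector(text: str) -> Counter:
    """Toy stand-in for an embedding model: counts of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def reconcile(mention: str, candidates: dict[str, str]) -> tuple[str, str, float]:
    """Return (item_id, label, score) for the candidate label closest to the mention."""
    query = trigram_vector(mention)
    scored = [
        (cosine(query, trigram_vector(label)), item_id, label)
        for item_id, label in candidates.items()
    ]
    score, item_id, label = max(scored)
    return item_id, label, score


# Placeholder IDs, not real Wikidata QIDs.
positions = {
    "ITEM_A": "Member of the European Parliament",
    "ITEM_B": "Mayor of Amsterdam",
    "ITEM_C": "Member of the House of Representatives",
}

item_id, label, score = reconcile("member of European Parliament", positions)
print(item_id, label, round(score, 2))
```

In a setup like this, a low best score would be a natural signal to leave the mention unreconciled and let a human reviewer decide, rather than forcing a match.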
This session will show how PoliLoom combines automation with human verification to maintain accurate, time-bound records of political office. It demonstrates a scalable pipeline that processes full Wikidata dumps, handles semantic reconciliation, and generates structured statements ready for integration. The talk will highlight how this work supports developers, GLAM professionals, researchers, language advocates, and community organizers in building a verifiable and inclusive democratic memory on Wikidata.
What the session will cover:
Pipeline overview: three-pass Wikidata dump import, LLM-based extraction from Wikipedia, and embedding-based reconciliation of positions, birthplaces, and dates.
Verification workflow: review interface with archived source text, highlighted “proof lines,” and MediaWiki OAuth login for accountability. Planned improvements include requiring reasons for discarding statements.
Multi-source expansion: while initial work is on English Wikipedia, we intend to add other languages and government portals as sources.
Data validation: each review action creates a dataset of confirmed or rejected claims, generating a “golden set” that can be used to evaluate and improve model performance.
Community engagement: ideas under discussion include lightweight gamification (leaderboards, contributor levels, rewards) and prioritization of higher-importance data (e.g. birth dates, current offices). Approvals could be weighted by reviewer experience to balance quality with participation.
Demo: a working test version and a short demonstration video will be available, showing how PoliLoom processes statements, routes them to contributors, and pushes results back into Wikidata.
Video example: https://streamable.com/66ylh3
GitHub: https://github.com/opensanctions/poliloom/
Forum Discussion: https://discuss.opensanctions.org/t/poliloom-loom-for-weaving-politicians-data/121/18
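The "golden set" idea from the data validation point above can be sketched as follows. This is an illustration under stated assumptions, not PoliLoom's actual data model: the `Review` record, its field names, and `precision_by_kind` are hypothetical. The sketch treats each review action as a label and reports, per claim type, the share of extracted claims that reviewers confirmed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """One review action (hypothetical shape)."""
    claim_id: str
    kind: str       # e.g. "position", "birth_date"
    approved: bool  # True if the reviewer confirmed the claim


def precision_by_kind(reviews: list[Review]) -> dict[str, float]:
    """Share of reviewed claims that were confirmed, grouped by claim type."""
    totals: dict[str, int] = defaultdict(int)
    confirmed: dict[str, int] = defaultdict(int)
    for review in reviews:
        totals[review.kind] += 1
        if review.approved:
            confirmed[review.kind] += 1
    return {kind: confirmed[kind] / totals[kind] for kind in totals}


# Made-up review log, for illustration only.
log = [
    Review("c1", "position", True),
    Review("c2", "position", True),
    Review("c3", "position", False),
    Review("c4", "birth_date", True),
    Review("c5", "birth_date", False),
]

for kind, precision in precision_by_kind(log).items():
    print(f"{kind}: {precision:.2f}")
```

A log like this measures precision only. Claims the model never proposed do not appear in it, so estimating recall would need a separately annotated sample.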
Head of Customer Success & Community Management @ OpenSanctions.org
I love turning complex data challenges into elegant software solutions. Currently hooked on mastering emerging technologies to solve real-world problems.
