2022-10-15 –, pyconjp_2
Language: English
Mono-repository or not? That is a vexing question for many medium-to-large development teams. As a growing company, we had to onboard new hires quickly while coping with a flood of customer requests and increasing codebase complexity. We merged 7 repositories into a single one and migrated to Pantsbuild, a modern, Python-friendly build system. Here is our story!
- Discussions about mono-repository vs. multi-repository
- Each has its own pros and cons. First, let’s review existing discussions about the choice.
- How to define the criteria to merge a repository?
- Problems with the prior art in my team
- Making a single release took several hours.
- We had to create multiple PRs in different repositories for a single conceptual feature or bugfix. Both code authors and reviewers had difficulty with context switching.
- We had to establish custom practices like synchronizing branch names between different repositories for CI.
- Linking multiple PRs with a single issue on GitHub did not work as we expected.
- We often forgot to switch branches in multiple repository clones of related components while working on a single issue.
- The development-setup script became too complex.
- A short intro about Pantsbuild
- http://pantsbuild.org/
- Why did we choose this? (compared to Bazel, etc.)
- A first look at basic usage
- Migration process with Pantsbuild
- About the new mono-repo directory structure and importing the per-package repositories
- Customization for our codebase
- I wrote a few custom Pants plugins for setup.py generation, towncrier tooling, and a dependency injector for platform-specific prebuilt binaries.
- I wrote a custom package entrypoint scanner as no package metadata is available in Pants-based execution environments.
- Experience after migration
- Now making a release takes less than 10 minutes.
- The on-site engineering team is confident in the version compatibility of all components as they now share a single unified version number.
- A single issue now has a single unified PR, making GitHub Projects more useful.
- Writing and reviewing a PR across multiple components is now a breeze. We can see all relevant changes, including documentation, in a single place.
- Recap
- It was a long and difficult journey, which took more than one month.
- But it was worth it, and I hope that my experience and customizations can help others attempting a mono-repo migration with complex Python projects.
- Great community support was a tremendous help during the whole migration process.
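To make the entrypoint scanner idea from the customization section more concrete, here is a minimal sketch, not the actual implementation from the talk. It assumes that entry points are declared as literal dictionaries in the `entry_points` field of `python_distribution()` targets in `BUILD` files, and it reads them by parsing the files with the standard `ast` module instead of relying on installed package metadata. The names `EntryPoint`, `scan_build_source`, `scan_build_files`, and the `myapp.plugins` group are hypothetical.

```python
import ast
from pathlib import Path
from typing import Iterator, NamedTuple


class EntryPoint(NamedTuple):
    group: str
    name: str
    target: str  # e.g. "package.module:attribute"


def scan_build_source(source: str, group: str) -> Iterator[EntryPoint]:
    """Yield entry points of `group` declared in one BUILD file's source text."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Only look at calls of the form python_distribution(...).
        if not (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "python_distribution"
        ):
            continue
        for kw in node.keywords:
            if kw.arg != "entry_points":
                continue
            try:
                # Assumption: the field is a plain literal dict of dicts.
                declared = ast.literal_eval(kw.value)
            except (ValueError, TypeError):
                continue  # skip non-literal expressions
            if not isinstance(declared, dict):
                continue
            for name, target in declared.get(group, {}).items():
                yield EntryPoint(group, name, target)


def scan_build_files(root: Path, group: str) -> Iterator[EntryPoint]:
    """Walk a source tree and scan every file named BUILD."""
    for build_file in root.rglob("BUILD"):
        yield from scan_build_source(build_file.read_text(), group)


# Hypothetical BUILD file content used only for illustration.
SAMPLE_BUILD = '''
python_distribution(
    name="dist",
    entry_points={
        "myapp.plugins": {"hello": "myapp.hello:Plugin"},
    },
)
'''

found = list(scan_build_source(SAMPLE_BUILD, "myapp.plugins"))
```

Parsing the build files statically keeps the scanner independent of how the code is launched, at the cost of ignoring any entry points that are computed dynamically rather than written as literals.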
Joongi is the creator of Backend.AI and the CTO of Lablup, where he oversees the development of MLOps pipelines and GPU-accelerated AI services. He earned his Ph.D. in Computer Science from KAIST by creating a GPU-accelerated packet processing framework with world-leading speed of 80 Gbps. His major areas of interest include scalable and automated backend systems, as well as their analysis and design. He's also a big fan of open source, having contributed to projects like Python, iPuTTY, Textcube, aiodocker, aiohttp, pyzmq, DPDK, and others.