2025-07-25 – 15:30-16:00 (Africa/Abidjan), Main Room 6
FlexiJoins.jl offers unparalleled flexibility in data joining, both within the Julia ecosystem and beyond. It supports a wide variety of join conditions and options through efficient algorithms, and can operate on both in-memory collections and SQL tables. In this talk, I'll demonstrate the uniform user-facing interface and discuss the underlying design of the package, which leverages Julia's dispatch capabilities.
FlexiJoins.jl is crafted to be the most flexible and generic package for dataset/table joining – and not just among Julia libraries. Thanks to Julia's multiple dispatch, it achieves this while remaining user-friendly and efficient.
In particular, FlexiJoins provides the following through a uniform interface and with good performance, without falling back to nested-loop joins:
- A wide range of join conditions: from simple equality to intervals, ranges, and arbitrary distance measures, including combinations of these;
- Various join options: all matches or the closest match, left or right joins, flat results or grouped by one side;
- Support for a variety of dataset types: Julia collections such as Vectors or Dictionaries, specialized table types like DataFrames, and experimental support for SQL databases through SQLCollections.jl.
In the talk, I'll explore the overall design of FlexiJoins that allows for such flexibility. The package uses asymptotically optimal algorithms (hash-, sort-, and tree-based), so that no join operation falls back to the naive O(n^2) path by default. Looking ahead, incorporating heuristics for specific cases could help it stay competitive with heavily optimized, specialized join implementations.
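To make the "no nested loops" point concrete, here is a minimal, language-agnostic sketch of a sort-based join, written in Python and matching points to the closed intervals that contain them. It is not FlexiJoins code and does not reflect the package's actual implementation; the function name `interval_join` and the sample data are hypothetical. Sorting one side once and binary-searching per interval gives roughly O((n + m) log n + k) work for k matches, instead of comparing all n·m pairs.

```python
from bisect import bisect_left, bisect_right


def interval_join(points, intervals):
    """Return (point_index, interval_index) pairs with lo <= point <= hi.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Sort the point indices by value once: O(n log n).
    order = sorted(range(len(points)), key=lambda i: points[i])
    sorted_vals = [points[i] for i in order]

    matches = []
    for j, (lo, hi) in enumerate(intervals):
        # Binary search finds the contiguous run of sorted points in [lo, hi].
        start = bisect_left(sorted_vals, lo)
        stop = bisect_right(sorted_vals, hi)
        # Map positions in the sorted array back to original point indices.
        matches.extend((order[k], j) for k in range(start, stop))
    return matches


# Made-up sample data for the sketch.
points = [5.0, 1.0, 9.0, 3.0]
intervals = [(0.0, 3.0), (4.0, 6.0), (10.0, 12.0)]
pairs = interval_join(points, intervals)
```

The same idea, picking an index structure that fits the join condition (a hash table for equality, a sorted array for ranges, a tree for distances), is what lets a join avoid the quadratic path.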
FlexiJoins is applicable in a wide range of scenarios already, and I invite feedback on its interface and potential extension points to support additional use cases.
Astrophysicist – Postdoctoral Fellow at Harvard University.