2026-08-13, Room 4
Although Julia is a high-performance language, its I/O functionality was designed for convenience and is neither robust nor efficient. I present an alternative I/O interface in BufferIO.jl, which has better-defined semantics and permits low-level, high-performance I/O operations.
Bioinformatics is rife with file formats, so I/O and parsing are a bottleneck for bioinformatics workflows and therefore a major concern for BioJulia. Unfortunately, Base Julia provides few functions for efficient I/O, and what functionality exists is severely underspecified. Historically, BioJulia packages have worked around this by reading data into a Vector{UInt8} in bulk, and then re-implementing various Base functionality by operating on the buffer using packages like TranscodingStreams, BufferedStreams and Automa. This ad hoc approach improved I/O performance over Base, but it did not develop into a coherently designed buffered I/O API, and many inefficiencies remained.
Taking inspiration from Rust's BufRead APIs, BufferIO.jl provides a new, reimagined I/O interface. It uses a simple, low-level core set of methods, concrete types, and well-defined semantics to allow reliably high performance. Combined with StringViews.jl and MemoryViews.jl, performance in simple I/O benchmarks almost matches Rust's. The package has been used in a couple of BioJulia packages, and has proved to be a good foundation on which to build abstractions.
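For readers unfamiliar with the Rust interface mentioned above, the following is a minimal sketch of the two core BufRead methods in Rust's standard library, fill_buf and consume. It illustrates only the Rust side of the inspiration and does not show BufferIO.jl's own API. The function name count_lines and the FASTA-like sample input are hypothetical, chosen for illustration.

```rust
use std::io::{self, BufRead, Cursor};

/// Count newline bytes by scanning the reader's internal buffer in place.
/// `count_lines` is a hypothetical helper, not part of any library.
fn count_lines<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut lines = 0;
    loop {
        // fill_buf exposes the currently buffered bytes without copying them.
        let buf = reader.fill_buf()?;
        if buf.is_empty() {
            // An empty buffer signals end of input.
            return Ok(lines);
        }
        lines += buf.iter().filter(|&&b| b == b'\n').count();
        let consumed = buf.len();
        // consume tells the reader how many bytes we are finished with.
        reader.consume(consumed);
    }
}

fn main() {
    // Hypothetical FASTA-like input held in memory for the example.
    let data = Cursor::new(b">seq1\nACGT\n>seq2\nTTGA\n".to_vec());
    let n = count_lines(data).expect("reading from memory should not fail");
    println!("{n} lines");
}
```

The point of this design is that the caller works directly on the reader's buffer and explicitly states how much of it has been used, so no intermediate copies or per-line allocations are needed.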
Some problems remain. Because BufferIO.jl wraps Base I/O objects rather than re-implementing basic functionality such as IOStream or Base Julia's file system operations, it does not truly shield users from the inefficiency and unreliability of Base Julia.
In this talk, I will motivate and demonstrate the new BufferIO.jl package, and showcase examples of how a new interface can improve the experience of writing high performance I/O code in Julia.
I am a research software engineer from Copenhagen, Denmark.
I currently work for the Danish health authorities, writing software for pathogen surveillance. I am trained as a molecular biologist and previously worked as an academic researcher in bioinformatics.
I program in Python, Rust and Julia, and am an active developer in the BioJulia ecosystem. I write Julia packages for efficient I/O and parsing, as well as foundational bioinformatics packages such as BioSequences.jl and Kmers.jl.