JuliaCon 2024

Structural Bioinformatics with BiochemicalAlgorithms.jl
07-10, 11:30–12:00 (Europe/Amsterdam), If (1.1)

BiochemicalAlgorithms.jl is a redesign of the popular Biochemical Algorithms Library (BALL), the largest open-source C++ framework of its kind. We focused on three main design goals: efficiency, ease of use, and rapid application development (RAD). Our library provides functionality for file I/O, molecular modelling, molecular mechanics methods, and molecular visualization, and hence can serve as a foundation for developing applications within the Julia ecosystem.


Development times play a crucial role in the process of implementing structural bioinformatics applications. Ideally, frameworks for molecular modelling and simulation are easy and intuitive to use without sacrificing too much computational efficiency or numerical stability. BiochemicalAlgorithms.jl is our attempt to provide such a framework. It is a complete Julia-centric redesign of the popular Biochemical Algorithms Library (BALL), the largest open-source C++ framework for structural bioinformatics, created in 1996 [1,2].

Switching our development from C++ to Julia has made it much easier to meet our design goals. While setting up a development environment for BALL is a highly nontrivial task these days, installing BiochemicalAlgorithms.jl is trivial. The code in BiochemicalAlgorithms.jl is, in our opinion, often much more readable, with greatly reduced boilerplate, and much closer to the abstract algorithmic formulations. The availability of high-quality numerical code for many problems we face in structural bioinformatics also greatly improves code quality and accelerates development.

While the port to Julia is not yet feature-complete, BiochemicalAlgorithms.jl already provides well-documented core data structures for typical molecular entities such as atoms, bonds, and molecules. These core data structures are accompanied by a selection of molecular modelling and mechanics techniques. Software packages in structural bioinformatics often focus on one or two specific tasks, such as the implementation of a force field. In contrast, BiochemicalAlgorithms.jl, accompanied by its visualization counterpart BiochemicalVisualization.jl, attempts to cover the full molecular modelling pipeline. Its building blocks include import and export of several molecular file formats, methods for inferring missing atoms, bond building algorithms, molecular force fields, structural optimization, and docking algorithms.

BiochemicalAlgorithms.jl's structured interface enables the rapid prototyping of ideas, such as the implementation of new energy functions or docking algorithms, as well as the development of fully featured applications, and is suited for interoperability with other Julia packages for molecular modelling such as Molly.jl [3]. Together with Julia's excellent ecosystem for scientific machine learning, it can form the basis for novel applications of deep learning techniques (using, e.g., Flux.jl [4]) in the context of molecular modelling and simulations, or of Bayesian optimization techniques (using Turing.jl [5]) for molecular docking.

In our talk, we will show how to perform common tasks in structural bioinformatics, such as reading molecular structures from files, preprocessing them to add missing atoms and bonds, inferring atom types, performing structural optimization, and running a docking algorithm. Further, we will demonstrate how our framework can be used to easily implement new methods and algorithms.

References

[1] Hildebrandt, A. et al. BALL - biochemical algorithms library 1.3. BMC Bioinformatics 11, 531 (2010).
[2] Kohlbacher, O. & Lenhof, H.-P. BALL—rapid software prototyping in computational molecular biology. Bioinformatics 16, 815–824 (2000).
[3] Greener, J. G. Differentiable simulation to develop molecular dynamics force fields for disordered proteins. Preprint at https://doi.org/10.1101/2023.08.29.555352 (2023).
[4] Innes, M. et al. Fashionable Modelling with Flux. Preprint at https://doi.org/10.48550/arXiv.1811.01457 (2018).
[5] Ge, H., Xu, K. & Ghahramani, Z. Turing: A Language for Flexible Probabilistic Inference. in Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics 1682–1690 (PMLR, 2018).


09/2012 - B.Sc. Molecular Biology, Johannes Gutenberg University Mainz
03/2015 - M.Sc. Applied Bioinformatics, Johannes Gutenberg University Mainz

03/2016 - present: Research associate in computer science, Johannes Gutenberg University Mainz

04/2015 - present: PhD candidate in computer science, Johannes Gutenberg University Mainz

I am interested in structural bioinformatics and development of software for applications in this and related fields.
