GigaSOM.jl: Huge-scale, high-performance flow cytometry clustering in Julia
2019-07-25, 17:05–17:15, Elm A

Flow cytometry clustering for several hundred million cells has long been hampered by software implementations. Julia allows us to go beyond these limits. Through the high-performance GigaSOM.jl package, we gear up for huge-scale flow cytometry analysis.


Recent advances in single-cell technologies offer an unprecedented opportunity to comprehensively characterize the immune system, revealing a previously unparalleled complexity in the phenotype and function of immune cells. Mass cytometry, also known as CyTOF, was recently implemented to measure up to 40 different markers in several million single cells. A typical clinical study with hundreds of patients can therefore include billions of single cells (rows) and up to 40 markers (features).
Different dimension reduction methods have been implemented in commercial and open-source software, mainly written in R. The machine learning algorithm FlowSOM [1] is based on the famous Kohonen Self Organising Feature Maps (SOM) [2] and has shown various advantages over other methods.
However, all current implementations have a critical limitation on the total number of cells to be analyzed . This limitation often blocks the analysis of large-scale clinical studies with several hundred million cells.
Here, we present the open-source, high-level, and high-performance package GigaSOM.jl (https://github.com/LCSB-BioCore/GigaSOM.jl), which is HPC-ready and is written to handle very large datasets without limits. Julia is the natural language of choice when it comes to performing huge-scale cytometric analyses. With the GigaSOM.jl package, the possibilities for flow cytometry analysis are further broadened. The quality of the software package is assured using ARTENOLIS (https://artenolis.lcsb.uni.lu) [3]. Biological validation of the results will be performed on downsampled datasets by comparison to conventional implementations of the FlowSOM package and manual hierarchical analysis.

References

[1] Sofie Van Gassen, Britt Callebaut, Mary J. Van Helden, Bart N. Lambrecht, Piet Demeester, Tom Dhaene and Yvan Saeys. FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data. Cytometry A 2015, volume 87.7 (p. 636-645)
[2] Kohonen T. The self-organizing map. Proc IEEE 1990;78:1464–1480
[3] Heirendt, Laurent; Arreckx, Sylvain; Trefois, Christophe; Yarosz, Yohan; Vyas, Maharshi; Satagopam, Venkata P.; Schneider, Reinhard; Thiele, Ines; Fleming, Ronan M. T., ARTENOLIS: Automated Reproducibility and Testing Environment for Licensed Software, arXiv:1712.05236.


Co-authors – oliver.hunewald@lih.lu, antonio.cosma@lih.lu, christophe.trefois@uni.lu, fanny.hedin@lih.lu, vasco.verissimo@uni.lu, reinhard.schneider@uni.lu, markus.ollert@lih.lu, feng.he@lih.lu