Processing medical images at scale on the cloud
2024-09-26, Louis Armand 2 - Ouest

The MedTech industry is undergoing a major transformation, with continuous innovations promising greater precision, efficiency, and accessibility. Oncology, the branch of medicine that focuses on cancer, stands to benefit in particular: these new technologies may enable clinicians to detect cancer earlier and increase patients' chances of survival. Detecting cancerous cells in microscope images of cells (Whole Slide Images, or WSIs) is usually done with segmentation algorithms, a task at which neural networks (NNs) excel. While using ML and NNs for image segmentation is a fairly standard task with established solutions, doing it on WSIs is a different kettle of fish. Most training pipelines and systems have been designed for analytics, meaning huge columns of small individual records. A single WSI, by contrast, is so large that its file can reach dozens of gigabytes. To allow innovation in medical imaging with AI, we need efficient and affordable ways to store and process these WSIs at scale.
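
To make the size problem concrete, here is a minimal sketch of one common way to handle an image that is too large to load whole: cover it with fixed-size tiles and process them one at a time. This is an illustration only, not the pipeline presented in the talk. The helper names (`iter_tiles`, `uncompressed_gib`), the 512-pixel tile size, and the 100,000 x 80,000 pixel slide dimensions are all hypothetical.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of one tile


def iter_tiles(width: int, height: int, tile: int = 512) -> Iterator[Box]:
    """Yield boxes that cover a width x height image, row by row.

    Tiles on the right and bottom edges are clipped so that no box
    extends past the image boundary.
    """
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height and tile must be positive")
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y, min(tile, width - x), min(tile, height - y)


def uncompressed_gib(width: int, height: int, channels: int = 3,
                     bytes_per_channel: int = 1) -> float:
    """Size in GiB of the raw pixel data, ignoring compression and pyramids."""
    return width * height * channels * bytes_per_channel / 2**30


if __name__ == "__main__":
    # Hypothetical slide dimensions, chosen only for illustration.
    w, h = 100_000, 80_000
    n_tiles = sum(1 for _ in iter_tiles(w, h))
    print(f"raw size: {uncompressed_gib(w, h):.1f} GiB, tiles: {n_tiles}")
```

Because each tile is described by coordinates alone, a worker can read just the region it needs from storage, which is what makes this pattern a natural fit when storage and compute are separated.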


In this presentation, we look at the challenges of processing large-scale medical images, particularly in oncology, where a single image can range from several to dozens of gigabytes. Most companies have limited access to supercomputing resources, which makes cloud-based solutions a necessity. On the cloud, however, the separation of storage and compute, coupled with the complexities of distributed computing, presents significant hurdles. We explore methods and tools that address these challenges and enable data scientists to iterate and experiment efficiently. Drawing on our experience at Modus Create (formerly Tweag) working with Kaiko.ai, we share insights into building an AI research platform tailored to the fight against cancer.

  • Introduction, Whole Slide Images and detecting cancer (5 min)
  • The challenge of training Neural Networks on Whole Slide Images on the cloud (5 min)
  • Building blocks for an AI Research Platform for oncology (10 min)
  • Future developments (5 min)
  • Q&A (5 min)

This talk assumes minimal Python and cloud knowledge. No specific medical knowledge is required.

Senior Software Engineer at Modus Create with an academic background in mathematics, statistics and AI, and professional experience in data engineering at scale. Leads the Data Engineering Technical Group of Modus Create's OSPO.