JuliaCon 2026

juDock: An Open-Source, ML-Driven Platform for Virtual Screening of Phytocompounds in Drug Discovery
2026-08-12 , Room 4

Virtual screening of phytocompounds for drug discovery has grown steadily in recent years. We present juDock, an ML-driven, Dockerized Linux application built in Julia. juDock automates the pipeline from ligand preparation to the prediction of promising compounds for a specific protein target, integrating AutoDock Vina, RDKit, and scikit-learn through PythonCall.jl, with a browser interface built on Genie.jl. juDock is also an open-source project that invites researchers to contribute models for additional proteins, trained with the established ML pipeline.


The talk is structured as follows:

Background and Problem: (2 min; clinical challenges and limitations of existing tools)
Interest in screening phytocompounds is growing. However, High Throughput Virtual Screening (HTVS) demands computational expertise at every step, from downloading ligands through format conversion to molecular docking. It often requires ad hoc scripting as well, which makes results hard to reproduce. In addition, several tools such as RDKit, OpenBabel, AutoDock Vina, and Discovery Studio Visualizer must be used in sequence.

The Julia Solution: (3 min; methodologies implemented)
We present juDock, a containerized, ML-driven, browser-based Linux application that addresses this fragmentation. Julia serves as the orchestrator of a streamlined pipeline optimized specifically for the virtual screening of phytocompounds. While traditional docking provides physics-based binding affinities, our integrated model adds a complementary ML-based probability score, termed dockscore. This lets researchers run high-throughput screens in which the software assesses spatial docking feasibility and chemical inhibitory potential together, processing thousands of phytochemicals in seconds.
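One way to picture how the two scores complement each other is a simple shortlist filter. The sketch below is illustrative Python, not juDock's code: the `Prediction` fields, the example values, and both cutoffs are assumptions made for the example. It relies only on two points, that docking affinities in kcal/mol are better when more negative, and that dockscore is described as a probability-like score where higher is better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-compound result; field names are assumptions."""
    name: str
    affinity_kcal_mol: float  # docking-style affinity, more negative is better
    dockscore: float          # ML probability-like score, higher is better


def shortlist(predictions, max_affinity=-8.0, min_dockscore=0.7):
    """Keep compounds that pass BOTH cutoffs, best dockscore first.

    The default cutoffs are placeholders for illustration only.
    """
    kept = [
        p for p in predictions
        if p.affinity_kcal_mol <= max_affinity and p.dockscore >= min_dockscore
    ]
    # Sort by descending dockscore, then by most negative affinity.
    return sorted(kept, key=lambda p: (-p.dockscore, p.affinity_kcal_mol))


# Made-up values, not results from juDock.
EXAMPLE = [
    Prediction("compound_A", -9.1, 0.82),
    Prediction("compound_B", -9.4, 0.35),  # strong affinity, low dockscore
    Prediction("compound_C", -6.2, 0.90),  # high dockscore, weak affinity
    Prediction("compound_D", -8.3, 0.91),
]

if __name__ == "__main__":
    for p in shortlist(EXAMPLE):
        print(p.name, p.affinity_kcal_mol, p.dockscore)
```

In this toy data, compound_B and compound_C each look good on one score alone and are dropped because they fail the other, which is the point of requiring both.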

Key Highlights: (5 min; application architecture)

  • Phytocompound-Based Training: We trained a multi-output Random Forest regressor on a curated dataset of phytocompounds docked against the 17β-HSD1 target. The model learned to predict both the binding affinity and an overall dockscore from a combination of binding affinities, molecular interaction profiles (hydrogen bonds and non-bonded interactions such as hydrophobic contacts), and physicochemical descriptors of the compounds.

  • Seamless Interoperability: We use industry-standard libraries such as RDKit and scikit-learn entirely from within Julia, through Conda and PythonCall.jl.

  • Robust Multi-Processing: We have implemented a RAM-aware parallel processing system on top of Julia’s Distributed standard library, which allows large natural product libraries to be screened in parallel without segmentation faults.

  • Full-Stack Browser-Based Interface: We have built a responsive GUI dashboard using Genie.jl. During installation, the application creates directories such as juDock_input and juDock_output in the user’s home directory. Researchers only need to place their .sdf files in the input directory; the results are then available both in the output directory and in the browser interface. The interface also provides real-time progress tracking with status bars.

  • Open-Source Project: What distinguishes juDock is that its models are specific to the target protein. The interface lets the user select a protein target, and the corresponding model is then loaded. For this reason, we have made juDock open source, so that researchers can contribute models trained with our established pipeline. juDock currently supports two protein targets, 17β-HSD1 (trained by us) and Aromatase (contributed). The complete code is available at https://github.com/drbenedictpaul/judock
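The abstract does not spell out how the RAM-aware scheduling decides on a degree of parallelism. As a minimal sketch of the general idea, assuming a fixed memory estimate per worker and a safety reserve, the worker count can be capped by both CPU cores and available memory. The code is illustrative Python using only the standard library; juDock itself uses Julia's Distributed library, and every function name, parameter, and number below is hypothetical.

```python
import os


def parse_mem_available(lines):
    """Return MemAvailable in bytes from Linux /proc/meminfo-style lines."""
    for line in lines:
        if line.startswith("MemAvailable:"):
            # The value is reported in kB, e.g. "MemAvailable:  2048 kB".
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable entry not found")


def choose_worker_count(available_bytes, per_worker_bytes, cpu_count,
                        reserve_fraction=0.2):
    """Cap the number of workers by CPU cores and by usable memory.

    per_worker_bytes and reserve_fraction are assumed estimates, not
    values taken from juDock.
    """
    if per_worker_bytes <= 0:
        raise ValueError("per_worker_bytes must be positive")
    usable = available_bytes * (1.0 - reserve_fraction)
    by_memory = int(usable // per_worker_bytes)
    # Always keep at least one worker so the screen can still run serially.
    return max(1, min(cpu_count, by_memory))


def split_into_batches(items, n_batches):
    """Deal items round-robin into n_batches lists of near-equal size."""
    return [items[i::n_batches] for i in range(n_batches)]


if __name__ == "__main__":
    gib = 1024 ** 3
    workers = choose_worker_count(8 * gib, 2 * gib,
                                  cpu_count=os.cpu_count() or 1)
    ligands = [f"ligand_{i}.sdf" for i in range(10)]  # placeholder names
    for batch in split_into_batches(ligands, workers):
        print(batch)
```

On a Linux host, the available-memory figure could be obtained by passing the open `/proc/meminfo` file to `parse_mem_available`. Capping by memory as well as by cores is what keeps a large library from being spread over more workers than the machine can hold.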

Validation and Conclusion: (2 min; case studies for validation)
The application was validated against traditional computer-aided drug design methods using various types of proteins (human, bacterial, viral, and fungal) and compounds. juDock empowers scientists who want to screen phytocompounds for their therapeutic potential. By combining machine learning with the speed of Julia, juDock provides a powerful interface for High Throughput Virtual Screening of phytocompounds in modern natural product drug discovery.

Ms. Sekaran is a PhD Scholar in the Department of Biotechnology. Her research is focused on the discovery and validation of novel therapeutic agents for breast cancer. She is currently working on the identification and evaluation of potential inhibitors for the 17-beta-hydroxysteroid dehydrogenase (17β-HSD) enzyme, a critical target in cancer treatment. Her work integrates both computational (in silico) and experimental (in vitro) methods, combining computational screening and inhibitor design with laboratory-based assays for validation. She is particularly interested in applying modern programming languages and high-performance computing to accelerate the drug discovery pipeline.