Cliff Kerr
Dr. Cliff Kerr is a Senior Software Engineer at the Institute for Disease Modeling, part of the Gates Foundation, where he works on HIV, STIs, tuberculosis, and family planning. Previously, he completed a B.Sc. in neuroscience and a Ph.D. in physics, was a lecturer in scientific computing at the University of Sydney, co-founded two startups (in data analytics and health economics), worked on a DARPA project teaching robots to pick up balls, and developed an algorithm that composes music in real time based on brain activity recordings. He lives in New York.
Session
Scientists apply rigorous methods to their research, but rarely to the AI tools they use to write code. We tested several LLMs in combination with domain-specific tools (including MCP servers and skills) to find the best combination for writing complex domain-specific code. To do this, we created a quantitative proficiency test for Starsim, a disease modeling framework, and scored each combination of model and tools against it. Although Claude Opus outperformed the other models, giving a model access to tools improved performance more than choosing the best model did. To improve LLM performance on domain-specific problems, we therefore recommend developing a set of domain-specific tools and using quantitative evaluation to guide their development.