PyData Boston 2025

Building Production RAG Systems for Health Care Domains: Clinical Decision
2025-12-10, Thomas Paul

Building on, but moving far beyond, the single-specialty focus of HandRAG, this session examines how Retrieval-Augmented Generation (RAG) can be engineered to support clinical reasoning across multiple high-stakes surgical areas, including orthopedic, cardiovascular, neurosurgical, and plastic surgery. Using a corpus of more than 7,800 clinical publications and cross-specialty validation studies, the talk highlights practical methods for structuring heterogeneous medical data, optimizing vector retrieval with latency gains of up to 35%, and designing prompts that preserve terminology accuracy across diverse subspecialties. Attendees will also learn a three-tier evaluation framework that improved critical-error detection by 2.4×, along with deployment strategies, such as automated literature-refresh pipelines and cost-efficient architectures that reduced inference spending by 60%, that enable RAG systems to operate reliably in real production healthcare settings.


Outline
Minutes 0–4: Introduction – The Challenge of Specialized RAG
Why general-purpose RAG falls short in high-stakes surgical domains
The need for multi-specialty clinical reasoning support
Overview of the 7,800+ publication corpus
Key challenges in healthcare AI: terminology, context, and heterogeneity

Minutes 4–10: Data Pipeline & Preprocessing
Ingesting and processing thousands of clinical publications
Handling diverse formats: PDFs, XML, plain text
Chunking strategies for preserving context: tables, figures, citations
Data cleaning for medical terminology and abbreviations
Python tools: PyPDF2, Beautiful Soup, custom parsers

Minutes 10–16: Embeddings & Vector Database Architecture
Comparing embedding models for medical text (OpenAI, sentence-transformers, domain-specific)
Vector database selection and indexing strategies (Pinecone, Weaviate, ChromaDB)
Query optimization for precise medical terminology
Handling synonyms, abbreviations, and heterogeneous data
Cost vs. performance tradeoffs and latency optimizations (up to 35% gains)
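
One way to picture the synonym and abbreviation handling listed above is query expansion before retrieval. The sketch below is a minimal, standard-library-only illustration: the `ABBREVIATIONS` map is a hypothetical two-entry stand-in for a curated clinical vocabulary, and a bag-of-words cosine similarity stands in for a real embedding model and vector database. None of these names come from the talk.

```python
import math
import re
from collections import Counter

# Hypothetical abbreviation map; a real system would use a curated vocabulary.
ABBREVIATIONS = {
    "acl": "anterior cruciate ligament",
    "cabg": "coronary artery bypass graft",
}


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(query: str) -> str:
    """Append the long form after each known abbreviation in the query."""
    expanded: list[str] = []
    for token in tokenize(query):
        expanded.append(token)
        if token in ABBREVIATIONS:
            expanded.extend(ABBREVIATIONS[token].split())
    return " ".join(expanded)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the expanded query."""
    query_vec = Counter(tokenize(expand_query(query)))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]
```

Without expansion, the query "ACL rehab" shares no terms with a document titled "Outcomes after anterior cruciate ligament reconstruction"; with expansion, that document ranks first. Dense embeddings reduce this problem but do not remove it for rare subspecialty abbreviations, which is why an explicit expansion step can still be worth considering.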

Minutes 16–22: Prompt Engineering & LLM Integration
Designing prompts to maintain subspecialty terminology accuracy
Few-shot examples for clinical context
Citation and source attribution strategies
Resolving conflicting information across papers
Temperature and parameter tuning for factual responses
Cost optimization: caching, batching, and inference reduction (60% savings)
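
The citation and caching items above can be sketched together. In the example below, `build_prompt` numbers each retrieved passage so the model can cite it as [n], and `CachedLLM` wraps any completion function with an exact-match cache keyed on a hash of the prompt. Both names, the passage dictionary layout (`source`, `text`), and the prompt wording are assumptions for illustration; the `call_llm` callable is a placeholder for whichever model client is in use, and this is not the speakers' implementation.

```python
import hashlib
from typing import Callable


def build_prompt(question: str, passages: list[dict]) -> str:
    """Build a prompt that lists numbered sources and asks for [n] citations."""
    sources = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}"
        for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the numbered sources below. "
        "Cite sources as [n] after each claim. "
        "If the sources conflict or do not answer the question, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


class CachedLLM:
    """Wrap a completion function with an in-memory exact-match cache."""

    def __init__(self, call_llm: Callable[[str], str]) -> None:
        self._call_llm = call_llm  # placeholder for a real model client
        self._cache: dict[str, str] = {}
        self.misses = 0  # number of prompts that reached the model

    def complete(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._call_llm(prompt)
        return self._cache[key]
```

An exact-match cache only helps when the same prompt recurs verbatim, so it pairs best with deterministic retrieval and a low sampling temperature. In production the in-memory dictionary would typically be replaced by a shared store with an expiry policy, so that cached answers do not outlive a literature refresh.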

Minutes 22–27: Evaluation & Production Deployment
Three-tier evaluation framework for critical-error detection (2.4× improvement)
Clinical validation with subject matter experts
Accuracy vs. recall tradeoffs in healthcare
Error analysis and monitoring in production
Automated literature-refresh pipelines and continuous improvement
Ethical considerations and bias detection
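
To make the metric tradeoffs above concrete, the sketch below scores a critical-error detector against expert labels. It is a generic illustration of accuracy, precision, and recall, not the evaluation framework from the talk; the function name `detection_metrics` and the boolean label format are assumptions.

```python
def detection_metrics(expert_labels: list[bool], flagged: list[bool]) -> dict[str, float]:
    """Compare detector flags with expert labels (True = critical error)."""
    if len(expert_labels) != len(flagged):
        raise ValueError("expert_labels and flagged must have the same length")
    pairs = list(zip(expert_labels, flagged))
    tp = sum(1 for truth, flag in pairs if truth and flag)        # caught errors
    fp = sum(1 for truth, flag in pairs if not truth and flag)    # false alarms
    fn = sum(1 for truth, flag in pairs if truth and not flag)    # missed errors
    tn = sum(1 for truth, flag in pairs if not truth and not flag)
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

When critical errors are rare, accuracy can look strong while recall is poor: a detector that catches one of two critical errors among ten answers, with no false alarms, scores 90% accuracy but only 50% recall. In a clinical setting the missed error is usually the costlier outcome, which is why recall on critical errors tends to deserve its own threshold.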

Minutes 27–30: Q&A and Key Takeaways
Summary of actionable methods for building robust, domain-specific RAG applications
Key technologies: Python, LangChain, Hugging Face Transformers, Vector DBs, OpenAI API, AWS Lambda


Prior Knowledge Expected: Previous knowledge expected

I am Nikunj Doshi, a Cloud, Data & AI Consultant, entrepreneur, and startup founder passionate about empowering tomorrow’s leaders. I hold a Master’s in Information Systems from Northeastern University and a Bachelor’s in Information Technology from Thadomal Shahani Engineering College, which have equipped me with a blend of technical expertise and management skills to drive innovation in cloud computing, DevOps, data analytics, and automation.

As the Founder of Achievers Astra, I have guided over 1,500 international students through career development workshops, personalized mentorship, and strategic planning for professional success. As the Director & Regional Head for North America at Abroad Aashaye, I have helped 5,000+ students navigate the U.S. academic journey and built global partnerships to enhance opportunities for international education.

In my corporate career, I worked with Red Hat as a Cloud Solutions Architect and Cloud Site Reliability Engineer, gaining hands-on experience with AWS, OpenShift, distributed cloud architectures, and large-scale automation. My technical toolkit includes Java, Python, C/C++, R, AWS Cloud services, Microsoft Azure, MongoDB, MySQL, Selenium, Ansible, and Git.

I am passionate about fostering engineering excellence, mentoring future leaders, and contributing to discussions on technology, innovation, and global youth leadership. I have been recognized as a LinkedIn Top Voice and invited as a guest speaker at leading universities, sharing insights on career growth, cloud technologies, and developer culture.

Shikhar Patel is an AI Data Engineer at Mass General Brigham. He brings extensive hands-on experience in building LLM and RAG systems for healthcare, particularly in clinical decision support.