Sanjit Paliwal PyData Boston 2025

Sanjit Paliwal

As a Principal Data Scientist at Verizon, I deliver innovative and impactful data solutions for various business units and functions. I have over seven years of experience in data science, with a focus on machine learning, artificial intelligence, NLP, generative AI, time series analysis, visualization, geospatial analysis, and statistical analysis (A/B testing).

My mission is to leverage data and analytics to solve complex and challenging problems, optimize processes and performance, and generate actionable insights and recommendations. I use Python, SQL, GCP, Tableau, and Git as my main tools to develop, deploy, and monitor data models and pipelines. I also collaborate with cross-functional teams and stakeholders to understand their needs, communicate results, and provide data-driven guidance. I am passionate about learning new skills and technologies, and sharing my knowledge and expertise with others.

Session

12-10, 15:30, 40min

No Cloud? No Problem. Local RAG with Embedding Gemma

Sanjit Paliwal

Running Retrieval-Augmented Generation (RAG) pipelines often feels tied to expensive cloud APIs or large GPU clusters—but it doesn’t have to be. This session explores how Embedding Gemma, Google’s lightweight open embedding model, enables powerful RAG and text classification workflows entirely on a local machine. Using the Sentence Transformers framework with Hugging Face, high-quality embeddings can be generated efficiently for retrieval and classification tasks. Real-world examples involving call transcripts and agent remark classification illustrate how robust results can be achieved without the cloud—or the budget.

Thomas Paul
