Vaibhav Srivastav

I am a Data Scientist and a Masters Candidate - Computational Linguistics at Universität Stuttgart. I am currently researching on Speech, Language and Vision methods for extracting value out of unstructured data.

In my previous stint with Deloitte Consulting LLP, I worked with Fortune Technology 10 clients to help them make data-driven (profitable) decisions. In my surplus time, I served as a Subject Matter Expert on Google Cloud Platform to help build scalable, resilient and fault-tolerant cloud workflows.

Before this, I have worked with startups across India to build Social Media Analytics Dashboards, Chat-bots, Recommendation Engines, and Forecasting Models.

My core interests lie in Natural Language Processing, Machine Learning/ Statistics and Cloud based Product development.

Apart from work and studies, I love travelling and delivering Workshops/ Talks at conferences and events across APAC and EU, DevConf.CZ, Berlin Buzzwords, DeveloperDays Poland, PyCon APAC (Philippines), Korea, Malaysia, Singapore, India, WWCode Asia Connect, Google DevFest, and Google Cloud Summit.

The speaker's profile picture


Twitter handle


Institute / Company

University of Stuttgart


Introduction to Audio & Speech Recognition
Vaibhav Srivastav

The audio (& speech) domain is going through a massive shift in terms of end-user performances. It is at the same tipping point as NLP was in 2017 before the Transformers revolution took over. We’ve gone from needing a copious amount of data to create Spoken Language Understanding systems to just needing a 10-minute snippet.

This tutorial will help you create strong code-first & scientific foundations in dealing with Audio data and build real-world applications like Automatic Speech Recognition (ASR) Audio Classification, and Speaker Verification using backbone models like Wav2Vec2.0, HuBERT, etc.

HS 120