How many times have you developed a model or a data application, tested it locally or in a staging environment, only to find that it breaks in production? This is a common issue faced by thousands of data scientists around the globe. As the work of data scientists increasingly ends up in production systems, reliably delivering data applications becomes as important as building them.
I will provide a number of machine learning examples and use cases focusing on logging, debugging, diagnosis, automated testing, and continuous integration and delivery. In brief, this talk will lead you step by step through using Azure DevOps, together with Kubernetes and Docker containers, to automate the deployment of your data applications to a production environment.
Attendees will gain an understanding of DataOps and how its practices can improve data science workflows. Through the examples, you will learn to identify the many challenges faced during the productionization of data applications and how these can be mitigated through DataOps best practices. By the end of the talk, attendees will have the knowledge required to automate the delivery of their data products, increasing both their productivity and the quality of their work.
Outline:
Introduction: what is DataOps and why should data scientists care?
Introduction to the technologies used (Docker, Kubernetes, CI/CD, Helm, etc.): demystifying the terms, what each technology does, and what the fuss is about
Preparing your repository for continuous delivery (see the repository layout sketch after this outline)
Provisioning your resources in the cloud efficiently using Helm and Kubernetes (see the values.yaml sketch after this outline)
Setting up a basic deployment pipeline (see the pipeline sketch after this outline)
Putting it all together
Adding extra features (e.g. intermediate checks, sandboxing) to your pipeline so that it is tailored to your needs (see the test-stage sketch after this outline)
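Below is a minimal sketch of how such a repository could be laid out; every file and directory name here is hypothetical, chosen only to illustrate the pieces the pipeline examples below expect:

    my-data-app/                 # hypothetical project name
    ├── app/                     # model / application code
    ├── tests/                   # automated tests run by the pipeline
    ├── requirements.txt         # Python dependencies
    ├── Dockerfile               # builds the container image
    ├── chart/                   # Helm chart describing the Kubernetes deployment
    └── azure-pipelines.yml      # Azure DevOps pipeline definition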
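For the provisioning step, a Helm chart's values.yaml gathers the deployment knobs in one place. The sketch below assumes a hypothetical image in Azure Container Registry; the registry, application name, and resource sizes are placeholders, not a definitive configuration:

    # chart/values.yaml -- a minimal sketch; all names and sizes are assumptions
    replicaCount: 2
    image:
      repository: myregistry.azurecr.io/my-data-app   # hypothetical registry/image
      tag: latest                                     # overridden per build by the pipeline
      pullPolicy: IfNotPresent
    service:
      type: ClusterIP
      port: 8080
    resources:
      requests:
        cpu: 250m
        memory: 512Mi

A release can then be installed or upgraded in a single command, e.g. helm upgrade --install my-data-app ./chart --namespace models --set image.tag=42 (the release and namespace names are again placeholders).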
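A basic two-stage Azure DevOps pipeline can build and push the image and then roll it out with Helm. This is a sketch under the assumptions above, not a production-ready pipeline; the service connection, subscription, resource group, and cluster names (myRegistryConnection, my-subscription, my-rg, my-aks) are all hypothetical:

    # azure-pipelines.yml -- a minimal sketch of a build-and-deploy pipeline
    trigger:
      branches:
        include:
          - main

    stages:
      - stage: Build
        jobs:
          - job: BuildAndPush
            pool:
              vmImage: 'ubuntu-latest'
            steps:
              - task: Docker@2          # build the image and push it to the registry
                inputs:
                  containerRegistry: 'myRegistryConnection'  # assumed service connection
                  repository: 'my-data-app'
                  command: 'buildAndPush'
                  Dockerfile: 'Dockerfile'
                  tags: '$(Build.BuildId)'

      - stage: Deploy
        dependsOn: Build
        jobs:
          - deployment: DeployToCluster
            pool:
              vmImage: 'ubuntu-latest'
            environment: 'production'   # assumed Azure DevOps environment
            strategy:
              runOnce:
                deploy:
                  steps:
                    - task: HelmDeploy@0   # runs helm upgrade --install against the cluster
                      inputs:
                        connectionType: 'Azure Resource Manager'
                        azureSubscription: 'my-subscription'   # assumed
                        azureResourceGroup: 'my-rg'            # assumed
                        kubernetesCluster: 'my-aks'            # assumed
                        namespace: 'models'
                        command: 'upgrade'
                        chartType: 'FilePath'
                        chartPath: 'chart'
                        releaseName: 'my-data-app'
                        overrideValues: 'image.tag=$(Build.BuildId)'
                        install: true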
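As an example of the intermediate checks mentioned in the last outline item, a Test stage can be wedged between Build and Deploy so that a broken model never reaches the cluster (Deploy's dependsOn would then point at Test). The requirements file and test directory match the hypothetical repository layout sketched above:

    # an extra stage for azure-pipelines.yml -- a sketch of a pre-deployment gate
      - stage: Test
        dependsOn: Build
        jobs:
          - job: RunChecks
            pool:
              vmImage: 'ubuntu-latest'
            steps:
              - script: |
                  pip install -r requirements.txt   # assumed dependency file
                  pytest tests/                     # run the automated test suite
                displayName: 'Run automated checks before promoting the image'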
Algorithms, Big Data, Data Science, DevOps, Machine Learning
Domain Expertise: some
Python Skill Level: basic
Abstract as a tweet: DevOps for the busy data scientist: learn how to leverage these practices to improve your workflows