
Data Science Experimental Methods
Acquire the skills to build effective real-world ML systems: the production ML model lifecycle, data and label quality, experiment tracking, model evaluation, and model deployment and monitoring. This course will help you bridge the gap between state-of-the-art ML modeling and building real-world ML systems.
Over the last decade, machine learning has become ubiquitous, and yet we are likely only at the beginning of the ML adoption journey. The advent of machine learning has created a profound shift in how software systems are developed and deployed: instead of developing new algorithms by hand, we increasingly learn them from large datasets. Moreover, ML systems are dynamic and require monitoring and frequent retraining to prevent performance degradation.
In this course, you will learn the skills and best practices for building effective real-world ML systems. Topics include the machine learning model lifecycle, data and label quality, experiment tracking, model evaluation, and model deployment and monitoring. The goal is to help bridge the gap between state-of-the-art ML modeling and building real-world ML systems.
- Archetypes of real-world ML applications
- The production ML lifecycle
- Why data quality and quantity are critical for real-world ML success
- Exploratory data analysis
- Model training & hyperparameter optimization
- Fine-tuning state-of-the-art pretrained transformer models for NLP tasks
- Designing good model evaluation metrics
- Model underfitting and overfitting: what are they, and how to address them
- Behavioral testing for ML models
- Establishing bounds on model performance with a human annotation baseline
- Testing for statistical properties of datasets
- Options for deploying models online: common scenarios & tradeoffs
- Feature Stores
- Good practices to ensure production stability: gated rollouts, shadow mode deployment, online experimentation, and easy rollbacks
- Wrap the model and data pipeline in a Python FastAPI web service
- Containerize the service using Docker
- Basic integration testing for containerized service
- Deploy the service and test it online
- How MLOps practices evolve as a function of team and company maturity
- Logging and monitoring infrastructure for ML applications
- Data and concept drift in Machine Learning
- CI/CD for ML models
- Statistical data and concept drift measures
- Model performance measurement
- Outlier detection
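As a concrete illustration of the "Statistical data and concept drift measures" topic above, here is a minimal sketch of one commonly used drift statistic, the Population Stability Index (PSI). The outline does not say which measures the course covers, so the choice of PSI, the function name `population_stability_index`, the equal-width binning, and the `eps` smoothing value are all assumptions made for illustration.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between two samples of a single numeric feature (hypothetical helper)."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    # Equal-width bins are derived from the reference sample only.
    width = (hi - lo) / n_bins

    def bin_fractions(values: Sequence[float]) -> list[float]:
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    # Every term is non-negative, so a larger PSI means a larger shift.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Example: training-time feature values vs. a shifted production sample.
reference_sample = [i / 100 for i in range(100)]
shifted_sample = [0.5 + i / 200 for i in range(100)]
psi_same = population_stability_index(reference_sample, reference_sample)
psi_shifted = population_stability_index(reference_sample, shifted_sample)
```

A score of zero means the binned distributions match exactly; how large a score should trigger an alert or a retraining run is application-specific and is not something this sketch decides.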
Nihit has a rare set of skills and experiences - building large-scale ML production systems at top companies, along with a solid and rigorous research background. Along with that, he is great at distilling and passing on his hard-won insights and knowledge. I've learned a lot from his newsletter and the talks he's given to large audiences at Upstart - so I know first-hand how valuable and practical this class will be, and can't think of a better instructor!
Nihit has extensive experience building ML systems for recommendations, ranking and integrity problems at Facebook and LinkedIn. His expertise lies not only in developing and improving deep learning techniques but also in working with large-scale systems that scale to billions of users. It’s a combination of both these skill sets that makes him a great fit to teach an MLOps course that requires an in-depth understanding of ML fundamentals and the ability to build out scalable systems that deal with constantly growing and ever-changing datasets in the real world.
Nihit combines a deep theoretical understanding of ML with hands-on practical knowledge from having built large-scale search, recommender, and decisioning ML systems at the most impactful Internet companies. If I had to learn how to go from an idea to a working, scalable ML system, there would be no better instructor than Nihit!
Software engineers who want to build production systems that integrate ML
Data scientists who want to learn about the production ML lifecycle (aka ‘what comes after model training?’)
Students/recent college grads who want hands-on experience building and shipping ML applications
Knowledge of basic machine learning concepts.
Familiarity with software development in Python.
Recommended: Familiarity with Docker, cloud ecosystems such as AWS.