Interpreting Machine Learning Models
This course provides an introduction to interpreting machine learning models, specifically deep learning models pre-trained on large amounts of data using self-supervision. It is designed to cover the breadth of techniques for explaining the predictions of classifiers trained on language, vision, and tabular data. You will get hands-on experience with different interpretability algorithms and benchmark the explanations of widely used ML models for faithfulness and plausibility. The course also covers how explanations can help uncover dataset artifacts and how to use those observations to build more robust ML models.
Course taught by expert instructors

Nazneen Rajani
Robustness Research Lead at Hugging Face
Nazneen is a Robustness Research Lead at Hugging Face. She got her Ph.D. in Computer Science from UT Austin, where she was advised by Prof. Ray Mooney. Several of her works (15+) have been published in top-tier AI conferences, including ACL, EMNLP, NAACL, NeurIPS, and ICLR. Nazneen was a finalist for the VentureBeat Transform 2020 Women in AI Research award. Her work has been covered by various media outlets, including Quanta Magazine, VentureBeat, SiliconANGLE, ZDNet, and Datanami. More details about her work can be found on her website: https://www.nazneenrajani.com/
The course
Learn and apply skills with real-world projects.
Who is this course for?
- Data scientists and ML practitioners with a background in machine learning who want to better understand deep learning models and interpret their predictions.
- Domain experts in ethics, law, policy, and other regulators who would like a deeper understanding of how ML models work.
Prerequisites
- Ability to write Python and work with documented libraries (scikit-learn, Captum, NumPy, Transformers)
- Completed a course in foundational machine learning (CoRise Deep Learning Essentials or similar)
- Foundational knowledge of statistics
Learn:
- What is interpretable ML?
- Why explanations?
- Types of explanations
- Methods for interpreting ML models
- Saliency-based methods
- Integrated Gradients, SHAP, Grad-CAM, LIME
Project: Generate and compare explanations for predictions of an image classifier
- Train an image classifier of your choice (e.g., a food classifier)
- Implement saliency-based methods for explaining model predictions (a minimal sketch follows this list)
- Compare the methods – which one do you agree with most?
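To make the project concrete, here is a minimal sketch of one saliency-based method, Integrated Gradients, using Captum. The pretrained torchvision ResNet and the random input tensor are placeholders for illustration; in the project you would use your own fine-tuned classifier and a real preprocessed image.

```python
import torch
from torchvision.models import resnet18, ResNet18_Weights
from captum.attr import IntegratedGradients

# Placeholder classifier; swap in your own fine-tuned model (e.g., a food classifier).
model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()

# Placeholder for one preprocessed image of shape (1, 3, 224, 224).
image = torch.rand(1, 3, 224, 224)

ig = IntegratedGradients(model)
pred_class = model(image).argmax(dim=1)

# Attribute the predicted class score back to the input pixels.
attributions = ig.attribute(image, target=pred_class, n_steps=50)
print(attributions.shape)  # same shape as the input: (1, 3, 224, 224)
```

The same pattern extends to the other methods in the list: Captum also ships GradientShap and LayerGradCam, and LIME has its own library, so you can compute attributions for the same image under each method and compare them side by side.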
Learn:
- Text explanations – leave-one-out, rationales
- Open-ended explanations using pre-trained language models
- Influence functions
- Use cases for different methods
Project: Interpret the predictions of a pre-trained large language model
- Fine-tune a language model on a task (e.g., sentiment)
- Study model behavior using leave-one-out and rationales (a minimal sketch follows this list)
- Generate explanations via prompts using BLOOM/GPT-3
- Implement influence functions for interpreting predictions
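As one illustration of the leave-one-out analysis in this project, the sketch below deletes each word in turn and measures the drop in the predicted class's probability; the words whose removal hurts confidence most are the ones the model leaned on. The SST-2 DistilBERT checkpoint is a stand-in for whatever model you fine-tune.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Stand-in sentiment model; substitute the checkpoint you fine-tuned.
name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

def prob_of_label(text: str, label: int) -> float:
    """Probability the model assigns to `label` for `text`."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).logits.softmax(dim=-1)[0, label].item()

sentence = "The film was a delightful surprise."
words = sentence.split()

# Predicted label and its probability on the full sentence.
with torch.no_grad():
    label = model(**tokenizer(sentence, return_tensors="pt")).logits.argmax(-1).item()
base = prob_of_label(sentence, label)

# Word importance = probability drop when that word is left out.
for i, word in enumerate(words):
    ablated = " ".join(words[:i] + words[i + 1:])
    print(f"{word:>12}  {base - prob_of_label(ablated, label):+.4f}")
```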
Learn:
- Hard vs. soft rationales
- Faithfulness vs. plausibility
- Comprehensiveness and sufficiency metrics
Project: Comprehensive evaluation of explanations (a metric sketch follows this list)
- Generate rationales for a text classification model
- Evaluate and compare using the ERASER or Ferret benchmarks
- Analyze performance on different metrics
- What can we infer about the methods based on these evaluations? Are some methods better suited to certain use cases?
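For reference, here is a minimal sketch of the two metrics named above, following their definitions in the ERASER benchmark: comprehensiveness is the drop in predicted-class probability when the rationale tokens are removed, and sufficiency is the drop when only the rationale tokens are kept. The predict_prob callable and word-level rationale indices are assumptions for illustration; in the project, ERASER or Ferret computes these for you.

```python
from typing import Callable, Sequence, Set

# predict_prob(text, label) -> probability of `label`, e.g. the prob_of_label
# helper from the earlier leave-one-out sketch.

def comprehensiveness(predict_prob: Callable[[str, int], float],
                      words: Sequence[str], rationale: Set[int], label: int) -> float:
    """Probability drop once rationale tokens are removed.
    A faithful rationale makes this large: the prediction depended on it."""
    without = " ".join(w for i, w in enumerate(words) if i not in rationale)
    return predict_prob(" ".join(words), label) - predict_prob(without, label)

def sufficiency(predict_prob: Callable[[str, int], float],
                words: Sequence[str], rationale: Set[int], label: int) -> float:
    """Probability drop when only rationale tokens are kept.
    A faithful rationale makes this small: it alone supports the prediction."""
    only = " ".join(w for i, w in enumerate(words) if i in rationale)
    return predict_prob(" ".join(words), label) - predict_prob(only, label)
```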
Learn:
- Using explanations to 1) examine plausible spurious patterns and 2) uncover dataset artifacts
- Data curation via counterfactuals for model training
- Evaluating models for robustness
Project: Applying lessons learned from interpreting models to make them robust
- Generate counterfactual explanations for model predictions on tabular data (a minimal sketch follows this list), OR
- In-depth robustness evaluation using counterfactuals, subpopulations, and adversarial attacks, OR
- Robustify the model via weak supervision instead of counterfactuals
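For the tabular option, a counterfactual explanation is a minimally changed input that flips the model's prediction. The sketch below runs a naive greedy search against an assumed scikit-learn classifier on a bundled dataset; purpose-built counterfactual libraries and the methods covered in class handle constraints and plausibility far more carefully.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

# Stand-in tabular model and data; substitute your own.
X, y = load_breast_cancer(return_X_y=True)
clf = LogisticRegression(max_iter=5000).fit(X, y)

def greedy_counterfactual(x, model, max_steps=200, step=0.1):
    """Greedily nudge one feature per step to reduce confidence in the
    original class, stopping once the predicted class flips."""
    x_cf = x.astype(float).copy()
    orig = model.predict([x_cf])[0]
    scale = X.std(axis=0)  # per-feature step size
    for _ in range(max_steps):
        if model.predict([x_cf])[0] != orig:
            return x_cf  # prediction flipped: counterfactual found
        # Try a small move on every feature; keep the one that most
        # reduces confidence in the original class.
        candidates = []
        for j in range(x_cf.size):
            for sign in (-1.0, 1.0):
                trial = x_cf.copy()
                trial[j] += sign * step * scale[j]
                candidates.append((model.predict_proba([trial])[0][orig], trial))
        x_cf = min(candidates, key=lambda c: c[0])[1]
    return None  # no flip within the step budget

cf = greedy_counterfactual(X[0], clf)
if cf is not None:
    print("features changed:", np.nonzero(~np.isclose(cf, X[0]))[0])
```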
Real-world projects
Work on projects that bring your learning to life.
Made to be directly applicable in your work.
Live access to experts
Sessions and Q&As with our expert instructors, along with real-world projects.
Network & community
Code reviews and study groups. Share experiences and learn alongside a global network of professionals.
Support & accountability
We have a system in place to make sure you complete the course, and to help nudge you along the way.
Get reimbursed by your company
More than half of learners get their Courses and Memberships reimbursed by their company.
Hundreds of companies have dedicated L&D and education budgets that have covered the costs.