Causal Inference for Data Science
This course teaches you how to answer the most fundamental question in data science: Does X cause Y? We’ve designed the course to cover the essential causal inference techniques while applying them to relevant, real-world examples. By the end of this course, you will know how to identify causal inference problems, pick the right tool for the job, and understand how these tools can break. You will be well-equipped to get maximum value from observational data sets and have the foundation to expand your causal inference toolkit with cutting-edge methods.
Course taught by expert instructors
Head of Data Science at Solana Foundation
Dan leads Data Science at the Solana Foundation, where he guides the development of everything from data foundations to causal inference and predictive analytics. Before Solana, Dan worked at Coursera for eight years, building the Growth and Decision Science teams. Both helped guide company strategy while assisting stakeholders in making data-informed decisions. Dan holds degrees in Math and Economics from U.C.L.A. and began his career in finance, where he analyzed the debt securities of risky companies while wishing he’d chosen to do something less soul-crushing.
In addition to his professional experience, Dan is an avid writer. His original data journalism has appeared in The Economist, and he publishes a blog and Substack on technical and management topics.
Learn and apply skills with real-world projects.
- Project: Throughout this course, we will work with a synthetic data set of a retailer that sells goods online and in physical stores. This week, we will get familiar with the data set and begin to answer our central causal question: Does convincing a customer to shop in person increase their lifetime value?
Learn:
- How to use linear regression for causal inference
- Ways that linear regression can go wrong
- The importance of simple models and data visualizations when investigating causal relationships
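As a rough illustration of how linear regression can go wrong for causal questions, the sketch below simulates a confounder. Everything in it is an assumption made for illustration: the variable names (`engagement`, `in_store`, `ltv`), the effect sizes, and the data itself are invented and are not the course's data set.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
TRUE_EFFECT = 50.0  # hypothetical causal effect of in-store shopping on lifetime value

# Hypothetical confounder: engaged customers shop in person more AND spend more.
engagement = rng.normal(size=n)
in_store = (engagement + rng.normal(size=n) > 0).astype(float)
ltv = 200 + TRUE_EFFECT * in_store + 80 * engagement + rng.normal(scale=20, size=n)


def ols(y, *regressors):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Regressing on the treatment alone absorbs the confounder's effect.
naive_effect = ols(ltv, in_store)[1]
# Adding the confounder as a control recovers something close to TRUE_EFFECT.
adjusted_effect = ols(ltv, in_store, engagement)[1]

print(f"naive: {naive_effect:.1f}, adjusted: {adjusted_effect:.1f}")
```

The adjustment only works here because the confounder is observed, which is rarely guaranteed with real observational data.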
- Project: We will apply IV analysis to our causal question of interest to see how it enables richer inference than regression alone. Along the way, we will become more comfortable thinking through the assumptions of both IV and RDD.
Learn:
- How to use instrumental variable analysis and regression discontinuity designs for causal inference
- The assumptions underlying each method and how to interrogate them
- How IV and RDD interrelate and enrich your causal inference toolkit
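To make the instrumental variable idea concrete, here is a minimal sketch on simulated data. It assumes a hypothetical randomly assigned `coupon` that nudges customers into stores and affects lifetime value only through in-store shopping; the names, numbers, and data are invented for illustration and are not taken from the course.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
TRUE_EFFECT = 50.0  # hypothetical constant causal effect

# Unobserved confounder: we pretend this column is NOT available to the analyst.
engagement = rng.normal(size=n)
# Hypothetical instrument: a coupon assigned at random, independent of engagement.
coupon = rng.integers(0, 2, size=n).astype(float)
# The coupon shifts the chance of shopping in store (relevance assumption).
in_store = (2.0 * coupon + engagement + rng.normal(size=n) > 1.0).astype(float)
# The coupon enters the outcome only through in_store (exclusion assumption).
ltv = 200 + TRUE_EFFECT * in_store + 80 * engagement + rng.normal(scale=20, size=n)

# Naive regression slope of ltv on in_store, biased by the hidden confounder.
naive_effect = np.cov(in_store, ltv)[0, 1] / np.var(in_store, ddof=1)

# IV (Wald) estimate with a single instrument: cov(z, y) / cov(z, x).
# This equals two-stage least squares in the one-instrument, one-treatment case.
iv_effect = np.cov(coupon, ltv)[0, 1] / np.cov(coupon, in_store)[0, 1]

print(f"naive: {naive_effect:.1f}, IV: {iv_effect:.1f}")
```

The IV estimate is noisier than a regression estimate, and it is only trustworthy to the extent that the relevance and exclusion assumptions hold.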
- Project: We will enrich our analysis further by re-conceiving our data set as a panel and applying the techniques we learned this week. We will then put our analyses together in a final, executive-friendly report or presentation summarizing our findings.
Learn:
- How to use difference-in-differences to leverage time for causal inference
- How to think of panel data analysis as a generalization of DD
- The importance of adjusting your standard errors when time is a variable
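The basic two-group, two-period difference-in-differences calculation can be sketched as follows. The data are simulated under an assumed parallel-trends setup; the group labels, `TIME_TREND`, and the effect size are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
TRUE_EFFECT = 50.0  # hypothetical treatment effect
TIME_TREND = 15.0   # hypothetical change over time shared by both groups

treated = rng.integers(0, 2, size=n).astype(bool)
# Groups start at different spending levels, so a simple post-period
# comparison of treated vs. control would be misleading.
baseline = 200 + 40 * treated + rng.normal(scale=30, size=n)

spend_pre = baseline + rng.normal(scale=10, size=n)
spend_post = baseline + TIME_TREND + TRUE_EFFECT * treated + rng.normal(scale=10, size=n)

# Post-period gap mixes the treatment effect with the pre-existing level gap.
naive_post_gap = spend_post[treated].mean() - spend_post[~treated].mean()

# DD: change in the treated group minus change in the control group.
change = spend_post - spend_pre
did_effect = change[treated].mean() - change[~treated].mean()

print(f"post-period gap: {naive_post_gap:.1f}, DD: {did_effect:.1f}")
```

This sketch reports only a point estimate. With repeated observations per customer, the standard errors would also need adjusting (for example, by clustering), which is not shown here.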
- Project: You now have the complete causal inference toolkit! There is no limit to the questions you can answer (well, depending on available data). We will spend this week wrapping up our report. For those of us who finished early, we have the optional task of finding a data set in the wild and applying our favorite technique to answering a question we’re passionate about.
Learn:
- How machine learning is changing the field of causal inference
- High-level summaries of advanced causal inference techniques and when they might be helpful
- Where to go from here
Work on projects that bring your learning to life.
Made to be directly applicable in your work.
Live access to experts
Sessions and Q&As with our expert instructors, along with real-world projects.
Network & community
Code reviews and study groups. Share experiences and learn alongside a global network of professionals.
Support & accountability
We have a system in place to make sure you complete the course, and to help nudge you along the way.
This course is for...
Data practitioners who want to increase their impact by getting more value from observational data and rigorously analyzing complex business questions.
Data analysts who want to expand their analytics toolkit and transition into more methodologically demanding roles.
Prerequisites
Ability to use Python for data munging, visualization, and basic statistical inference
Knowledge of statistical inference at the 101 level (e.g., at the level of CoRise Applied Statistics for Data Science)
Some familiarity with A/B testing and linear regression