Quantitative social sciences: from correlation to causality

Offer semester
2nd semester

Lecture time
Wednesday 9:30 - 12:20

Lecture venue

Course description

Many, if not most, social research questions concern causality: what are the causes of good and bad outcomes in society? Only if we understand the causes can we hope to change those outcomes. Much social research, however, is observational, i.e. correlational: we can observe and measure things and ask people questions, but it is rarely easy to run experiments. Often, therefore, we have only correlational data with which to evaluate and test our causal research questions. Taken together, these two conditions present a problem because, as is well known, correlation does not equal causation.

Recently developed theories of causation challenge these limitations. We will use the theory of Directed Acyclic Graphs (DAGs) to understand how causality translates into correlations among variables. We will use this knowledge to help us specify statistical models that may help us evaluate our causal theories.
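As a rough illustration of how a causal structure shows up as correlation, the sketch below simulates a simple DAG in which a confounder Z causes both X and Y, while X has no effect on Y. The variable names, effect sizes, sample size and seed are all assumptions chosen for illustration; they are not taken from the course materials.

```r
# Simulated DAG: Z -> X and Z -> Y, with no arrow from X to Y.
# All numbers below are arbitrary, illustrative choices.
set.seed(1)                  # fixed seed so the simulation is reproducible
n <- 5000                    # hypothetical sample size

z <- rnorm(n)                # confounder
x <- 0.8 * z + rnorm(n)      # X depends on Z only
y <- 0.8 * z + rnorm(n)      # Y depends on Z only, not on X

# X and Y are correlated even though neither causes the other.
r_xy <- cor(x, y)

# Slope on x without and with adjustment for the confounder z.
b_unadjusted <- coef(lm(y ~ x))[["x"]]
b_adjusted   <- coef(lm(y ~ x + z))[["x"]]

print(round(c(correlation = r_xy,
              unadjusted  = b_unadjusted,
              adjusted    = b_adjusted), 2))
```

In this simulated setting the unadjusted slope is clearly positive, whereas the slope adjusted for Z should be close to zero, which matches the true (absent) effect of X on Y. Which variables to adjust for in real data depends on the assumed DAG.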

The statistical models we will use are varieties of Generalized Linear Models (GLMs), specifically Linear Regression and Logistic Regression. We will also look at simple extensions of these models that allow us to deal with so-called multilevel data, which have a nested structure, e.g. pupils nested in schools. We will use the R statistical software to estimate these models from data. We will evaluate some existing social research studies using our knowledge of DAGs and GLMs.
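To give a flavour of what estimating such a model in R can look like, here is a minimal sketch of a logistic regression fitted with base R's glm() on simulated data. The variables hours and passed, the true coefficients, the sample size and the seed are hypothetical and serve only as an example.

```r
# Simulated binary outcome whose log-odds depend linearly on one predictor.
# All names and numbers are hypothetical, for illustration only.
set.seed(2)
n <- 2000

hours  <- runif(n, min = 0, max = 10)          # hypothetical predictor
p_pass <- plogis(-2 + 0.5 * hours)             # true probability (inverse logit)
passed <- rbinom(n, size = 1, prob = p_pass)   # binary outcome, 0 or 1

# Logistic regression: a GLM with a binomial family (logit link by default).
fit <- glm(passed ~ hours, family = binomial)

b_hours    <- coef(fit)[["hours"]]   # estimated change in log-odds per unit of hours
odds_ratio <- exp(b_hours)           # the same effect expressed as an odds ratio

print(summary(fit)$coefficients)
print(odds_ratio)
```

Replacing family = binomial with family = gaussian (or calling lm()) gives ordinary linear regression for a continuous outcome.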

No prior knowledge about statistical modelling is needed for this course.

Course learning outcomes

By the end of the course, students will be able to:

  1. Develop a statistical model to answer a social science research question using observational data.
  2. Assess common issues with making causal inferences from observational data.
  3. Critique some social science research that uses correlations to infer causality.
  4. Analyse a dataset using R and integrate the results into a report.
  5. Demonstrate how to interpret the results of their quantitative data analysis.


Assessment

Participation in Quiz: 5%
Critical Appraisal of Research Study: 20%
Student presentation: 20%
Participation in student feedback on presentations: 5%
Data analysis report: 50%

Required reading

Statistical Methods for the Social Sciences. Fourth Edition. Alan Agresti and Barbara Finlay. University of Florida. (2008) (key reading)

Basic statistics for the behavioral and social sciences using R. Wendy Zeitlin and Charles Auerbach (2019): available online

Thinking Clearly About Correlations and Causation: Graphical Causal Models for Observational Data. Julia M. Rohrer. Advances in Methods and Practices in Psychological Science (2018), Volume 1, Issue 1, pp. 27-42. DOI: 10.1177/2515245917745629 available online

Graphical Causal Models. Felix Elwert, in Handbook of Causal Analysis for Social Research (2013). DOI: 10.1007/978-94-007-6094-3_13 available online

Understanding regression analysis: an introductory guide. Larry D. Schroeder et al. (2017) available online

Logistic regression: a primer. Fred C. Pampel. (2021) available online

Course co-ordinator and teachers