Module manager: TBC
Email: TBC
Taught: Semester 2 (Jan to Jun)
Year running 2026/27
| MATH3701 | Statistical Modelling |
| MATH3702 | Multivariate Analysis and Classification |
| MATH5705M | Multivariate Data Analysis |
This module is not approved as an Elective
This module investigates the notions of independence, conditional independence and causality in multivariate datasets. This is motivated by considering ways of handling missing values or unusual observations in real data, and how (unconscious) bias may play a role. Gaussian and log-linear graphical models are studied in detail.
This module will equip students with the understanding and skills to employ a range of modern statistical techniques that go beyond complete data sets satisfying the simple assumption of independent, identically distributed observations. Graphical models will be used to understand causal structures and inform data analysis. Sources of bias and missing data will be investigated, as will techniques that can sometimes be used to overcome these problems. Throughout, the material will be studied both through theoretical development and through application to practical examples.
On successful completion of the module students will have demonstrated the following learning outcomes relevant to the subject:
1. Describe various ways in which data can be obtained, including advantages and ethical considerations;
2. Identify possible sources of bias present in data, and how these can be mitigated;
3. Know about missingness mechanisms in data and ways of handling these, including being able to describe and use MICE and EM approaches;
4. Understand the difference between independence and conditional independence in multivariate normal distributions;
5. Know properties of Gaussian and log-linear graphical models;
6. Understand directed and mixed graphs and causality;
7. Use a statistical package with real data to fit these models and write a report presenting and interpreting the results.
Skills learning outcomes:
On successful completion of the module students will have demonstrated the following skills:
a) Make a critical assessment of varied data sets
b) Follow a logical approach for selecting and performing an analysis
c) Use IT skills and appropriate digital technology in work and studies
d) Reflect on statistical findings and draw relevant conclusions in academic and practical contexts
e) Communicate statistical processes and findings effectively
1. Independence. Review of multivariate normal, multiple linear regression, logistic regression, maximum likelihood;
2. Acquiring data. Surveys, samples, design of experiments, observational studies, web scraping. Ethics;
3. Causes, identification and mitigation of bias in datasets;
4. Types of missingness, methods for handling missing data. MICE and EM algorithm for missing data;
5. Gaussian graphical models;
6. Log-linear graphical models;
7. Directed and mixed graphs. Causality.
Additional topics that build on these may be covered as time allows. Further details of possible topics will be delivered closer to the time that the module runs.
| Delivery type | Number | Length hours | Student hours |
|---|---|---|---|
| Lectures | 33 | 1 | 33 |
| Practicals | 1 | 2 | 2 |
| Private study hours | | | 115 |
| Total Contact hours | | | 35 |
| Total hours (100hr per 10 credits) | | | 150 |
Formative feedback will be provided on regular example sets or other similar learning activity.
Check the module area in Minerva for your reading list
Last updated: 30/04/2026