2026/27 Taught Postgraduate Module Catalogue

MATH5706M Further Models for Data Analysis

15 Credits Class Size: 80

Module manager: TBC
Email: TBC

Taught: Semester 2 (Jan to Jun)

Year running 2026/27

Pre-requisites

MATH3701 Statistical Modelling
MATH3702 Multivariate Analysis and Classification
MATH5705M Multivariate Data Analysis

This module is not approved as an Elective

Module summary

This module investigates the notions of independence, conditional independence and causality in multivariate datasets. This is motivated by considering ways of handling missing values or unusual observations in real data, and how (unconscious) bias may play a role. Gaussian and log-linear graphical models are studied in detail.

Objectives

This module will equip students with the understanding and skills to employ a range of modern statistical techniques that go beyond complete data sets satisfying the simple assumption of independent, identically distributed data. Graphical models will be used to understand causal structures and inform data analysis. Sources of bias and missing data will be investigated, as will techniques that can sometimes be used to overcome these problems. Throughout, the material will be studied both through theoretical development and through application to practical examples.

Learning outcomes

On successful completion of the module students will have demonstrated the following learning outcomes relevant to the subject:
1. Describe various ways in which data can be obtained, including advantages and ethical considerations;
2. Identify possible sources of bias present in data, and how these can be mitigated;
3. Know about missingness mechanisms in data and ways of handling these, including being able to describe and use MICE (multiple imputation by chained equations) and EM (expectation-maximisation) approaches;
4. Understand the difference between independence and conditional independence in multivariate normal distributions;
5. Know properties of Gaussian and log-linear graphical models;
6. Understand directed and mixed graphs and causality;
7. Use a statistical package to fit these models to real data, and write a report presenting and interpreting the results.

Skills learning outcomes:
On successful completion of the module students will have demonstrated the following skills:

a) Make a critical assessment of varied data sets

b) Follow a logical approach for selecting and performing an analysis

c) Use IT skills and appropriate digital technology in work and studies

d) Reflect on statistical findings and draw relevant conclusions in academic and practical contexts

e) Communicate statistical processes and findings effectively

Syllabus

1. Independence. Review of multivariate normal, multiple linear regression, logistic regression, maximum likelihood;
2. Acquiring data. Surveys, samples, design of experiments, observational studies, web scraping. Ethics;
3. Causes, identification and mitigation of bias in datasets;
4. Types of missingness, methods for handling missing data. MICE and EM algorithm for missing data;
5. Gaussian graphical models;
6. Log-linear graphical models;
7. Directed and mixed graphs. Causality.

Additional topics that build on these may be covered as time allows. Further details of possible topics will be provided closer to the time that the module runs.

Teaching Methods

Delivery type                        Number   Length (hours)   Student hours
Lectures                             33       1                33
Practicals                           1        2                2
Private study hours                                            115
Total contact hours                                            35
Total hours (100hr per 10 credits)                             150

Opportunities for Formative Feedback

Formative feedback will be provided on regular example sets or other similar learning activities.

Reading List

Check the module area in Minerva for your reading list.

Last updated: 30/04/2026
