Module manager: Dr Abdulrahman Altahhan
Email: A.Altahhan@leeds.ac.uk
Taught: 1 Jan to 28 Feb, 1 Jan to 28 Feb (adv year), 1 Jul to 31 Aug
Year running 2026/27
None
N/A
This module is not approved as an Elective
This module introduces the principles and methods of reinforcement learning, focusing on how intelligent agents learn to make sequences of decisions through interaction with their environment. It examines how experience, feedback, and exploration guide learning and adaptation over time. Students develop an understanding of how reinforcement learning supports autonomous behaviour and gain practical experience in designing and training agents capable of acting, improving, and generalising across a range of dynamic tasks.
This module aims to develop both conceptual understanding and practical skills in reinforcement learning as a framework for sequential decision making. It explores how agents learn from experience, balance exploration and exploitation, and adapt their behaviour based on feedback from the environment. Students examine how reinforcement learning drives advances in areas such as game-playing artificial intelligence, robotics, and the alignment of large language models, where feedback-based learning shapes intelligent behaviour. The module equips students with the understanding and practical ability to design, train, and analyse adaptive systems that learn from interaction, while offering a broader perspective on how reinforcement learning provides insight into the mechanisms underlying intelligent behaviour. Learning activities combine explanatory material, visual demonstrations, guided exercises, and hands-on experimentation using reinforcement learning frameworks to build intuition and technical fluency.
On successful completion of the module students will have demonstrated the following learning outcomes relevant to the subject:
1. Apply the principles of reinforcement learning and explain how agents learn through interaction and feedback.
2. Apply reinforcement learning algorithms to sequential decision-making problems across simulated or real-world environments.
3. Design and implement learning agents that balance exploration and exploitation to improve performance over time.
4. Assess how reward structures, feedback signals, and environmental dynamics influence learning outcomes and agent behaviour.
5. Discuss how reinforcement learning contributes to a broader understanding of adaptive and intelligent systems in both artificial and natural contexts.
On successful completion of the module students will have demonstrated the following skills learning outcomes:
1. Apply critical thinking and structured problem-solving to design, implement, and evaluate adaptive learning algorithms in dynamic environments.
2. Demonstrate adaptability and self-directed learning by exploring and integrating new methodologies, tools, or paradigms independently.
3. Communicate complex technical concepts and experimental results effectively to both technical and non-technical audiences using clear documentation and visualisation.
4. Apply integrated problem-solving and systems thinking to design and evaluate adaptive learning systems.
5. Exercise reflective practice and iterative improvement, evaluating approaches, interpreting outcomes, and refining strategies over time.
Indicative content for this module includes:
- Foundations of reinforcement learning and sequential decision-making
- Core theoretical constructs: agents, environments, states, actions, rewards, and returns
- Markov decision processes, value functions, and the Bellman equation
- Model-free prediction and control through Monte Carlo and temporal-difference methods
- Multi-step learning and eligibility traces for improving sample efficiency
- Function approximation and generalisation using parametric and neural models
- Deep reinforcement learning methods for continuous and high-dimensional tasks
- Exploration-exploitation trade-offs, stability, and performance evaluation in learning agents
| Delivery type | Number | Length (hours) | Student hours |
|---|---|---|---|
| Discussion forum | 6 | 1 | 6 |
| Webinar | 6 | 1 | 6 |
| Independent online learning hours | | | 42 |
| Private study hours | | | 96 |
| Total Contact hours | | | 12 |
| Total hours (100hr per 10 credits) | | | 150 |
1. Webinar-Based Discussion and Q&A
2. Weekly Practical Exercises
| Assessment type | Notes | % of formal assessment |
|---|---|---|
| Online Assessment | ~20 questions about different scenarios | 20 |
| Coursework | Coursework Project - Technical Report | 80 |
| Total percentage (Assessment Coursework) | | 100 |
This module will be reassessed through a 100% individual assessment in the same format as Assessment 2 (coursework project). The reassessment will involve a practical project that requires students to apply and integrate the knowledge and skills developed across all learning outcomes.
Check the module area in Minerva for your reading list
Last updated: 30/04/2026