Module manager: Dr Arash Rabbani
Email: A.Rabbani@leeds.ac.uk
Taught: 1 Mar to 30 Apr, 1 Mar to 30 Apr (2mth)(adv yr), 1 Sep to 31 Oct, 1 Sep to 31 Oct (adv yr)
Year running 2026/27
None
N/A
This module is not approved as an Elective
This module explores deep learning techniques for computer vision, focusing on how neural networks interpret, represent, and generate visual information. It examines architectures and algorithms that enable tasks such as image classification, object detection, segmentation, and visual synthesis. Students learn how advances in convolutional and transformer-based models underpin modern vision systems and gain practical experience in developing, training, and evaluating models that extract meaningful structure and semantics from visual data.
This module aims to develop a deep understanding of how modern deep learning methods enable machines to perceive and reason about visual information. It builds on core principles of representation learning to explore architectures and techniques that power computer vision systems, including convolutional, transformer-based, generative, and multimodal models that integrate visual data with other information sources. Learning activities combine conceptual explanation, visual demonstrations, guided experiments, and practical implementation exercises to bridge theory and practice. Through progressive exploration of image classification, detection, segmentation, and synthesis, students gain the knowledge and skills to design, train, and critically evaluate vision models for diverse applications, while developing an appreciation of how architectural choices influence visual representation and performance.
On successful completion of the module students will have demonstrated the following learning outcomes relevant to the subject:
1. Explain the principles of deep learning architectures used in computer vision, including convolutional, transformer-based, and generative models.
2. Apply deep learning methods to core computer vision tasks such as image classification, object detection, segmentation, and visual synthesis.
3. Design and implement vision models, selecting suitable architectures, training strategies, and evaluation metrics for different visual applications.
4. Assess how data characteristics, inductive biases, and architectural choices influence the performance and generalisation of vision models.
5. Experiment with multimodal approaches that combine visual data with other modalities to enhance representation and understanding in complex tasks.
On successful completion of the module students will have demonstrated the following skills learning outcomes:
1. Apply analytical and structured problem-solving skills to design, implement, and evaluate computer vision solutions for complex datasets.
2. Demonstrate adaptability and self-directed learning by integrating new tools, techniques, or frameworks to address evolving challenges in computer vision.
3. Communicate technical concepts, workflows, and results effectively to both technical and non-technical audiences using clear documentation and visualisation.
4. Apply integrated problem-solving and systems thinking to design and optimise computer vision solutions.
5. Exercise reflective practice and critical evaluation to assess methods, optimise processes, and continuously improve project outcomes.
Indicative content for this module includes:
- Fundamentals of deep learning for computer vision and visual representation learning
- Convolutional neural networks and architectural variants for image classification and feature extraction
- Object detection, localisation, and semantic/instance segmentation methods
- Transformer-based models and attention mechanisms for visual understanding
- Generative and reconstruction-based approaches, including autoencoders, GANs, and diffusion models for image synthesis
- Multimodal vision-language models such as CLIP and related architectures for cross-modal representation learning
| Delivery type | Number | Length hours | Student hours |
|---|---|---|---|
| Discussion forum | 6 | 1 | 6 |
| Webinar | 6 | 1 | 6 |
| Independent online learning hours | | | 42 |
| Private study hours | | | 96 |
| Total Contact hours | | | 12 |
| Total hours (100hr per 10 credits) | | | 150 |
1. Webinar-Based Discussion and Q&A
2. Weekly Practical Exercises
| Assessment type | Notes | % of formal assessment |
|---|---|---|
| Online Assessment | ~20 questions about different scenarios | 20 |
| Coursework | Coursework Project - Technical Report | 80 |
| Total percentage (Assessment Coursework) | | 100 |
This module will be reassessed through a 100% individual assessment in the same format as Assessment 2 (coursework project). The reassessment will involve a practical project that requires students to apply and integrate the knowledge and skills developed across all learning outcomes.
Check the module area in Minerva for your reading list
Last updated: 30/04/2026