2024/25 Undergraduate Module Catalogue

COMP2121 Data Mining

10 Credits
Class Size: 500

Module manager: Prof Eric Atwell
Email: e.s.atwell@leeds.ac.uk

Taught: Semester 2 (Jan to Jun)

Year running 2024/25

Pre-requisite qualifications

COMP1012 Introduction to Programming or COMP1121 Databases and Programming Experience

This module is not approved as a discovery module

Module summary

This module explores the data mining process and its application in different domains such as text and web mining. You will learn the principles of data mining; compare a range of different techniques, algorithms and tools and learn how to evaluate their performance.

Objectives

On completion of this module, students should be able to:
- identify all of the data, information, and knowledge elements for a computational science application;
- understand the components of the knowledge discovery process;
- understand and use algorithms, resources and techniques for implementing data mining systems;
- understand techniques for evaluating different methodologies;
- demonstrate familiarity with some of the main application areas;
- demonstrate familiarity with data mining and text analytics tools.

Learning outcomes

On completion of this module, students should be able to:
- understand the data mining process and its application in different domains such as text and web mining;
- understand the principles of data mining;
- compare a range of different techniques, algorithms and tools and evaluate their performance;
- demonstrate familiarity with some of the main application areas;
- demonstrate familiarity with data mining and text analytics tools.

Syllabus

Introduction to data mining terminology and components of the data mining process, text analytics, and SketchEngine; tools and techniques for data collection and data cleansing, use of machine learning classifiers for data classification, open-source and commercial data mining and text analytics resources and toolkits, CRISP-DM and WEKA; word meanings, text tagging, and scaling to big data; use of clustering and association tools for data mining, chatbots for university education; Machine Translation, Information Extraction, and Python tools for text analytics; web-based text analytics; case studies of current research and commercial applications in data mining and text analytics, BERT.

Teaching Methods

Delivery type                         Number   Length (hours)   Student hours
Class tests, exams and assessment     2        2                4
Lecture                               8        1                8
Private study hours                                             76
Total contact hours                                             12
Total hours (100hr per 10 credits)                              88

Opportunities for Formative Feedback

Coursework and labs.

Methods of Assessment

Coursework
Assessment type                            Notes    % of formal assessment
In-course Assessment                       Report   60
In-course Assessment                       Test 1   20
In-course Assessment                       Test 2   20
Total percentage (Assessment Coursework)            100

Resits will be assessed by coursework.

Reading List

The reading list is available from the Library website

Last updated: 9/25/2024
