2024/25 Undergraduate Module Catalogue

CHEM3212 Big Data, Big Science

10 Credits Class Size: 70

Module manager: Dr Stuart Warriner
Email: s.l.warriner@leeds.ac.uk

Taught: Semester 1 (Sep to Jan)

Year running 2024/25

This module is not approved as a discovery module

Module summary

The explosion of information means that many jobs require people to handle large datasets quickly and efficiently, yet graduates often lack these core skills. In science, new insights often come from bringing large amounts of data together in a way that illuminates the problem. In this course you will develop the core skills needed to handle large datasets efficiently. Using examples from across Chemistry, you will see how to extract data using simple programming in Python and reach meaningful conclusions. Online tools will help you acquire key skills, while weekly seminars will let you explore real examples, enabling you to use these skills to answer scientific questions.

Objectives

To enable students to explore how to handle large datasets to extract key scientific information.

Learning outcomes

Understand how large datasets can be useful within and outside science.
Use simple Python programming to extract data from large datasets.
Present data efficiently and concisely.

Syllabus

Aggregating multisheet data using indirect functions
Fundamental Python programming concepts
Pattern matching and data mining using Python
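
As an illustration only of the kind of task the last syllabus item describes, the following is a minimal sketch and is not taken from the module materials. It assumes a hypothetical record format (`sample=… compound=… yield=…%`) and hypothetical names (`records`, `PATTERN`, `extract_yields`), and uses Python's standard `re` module to pull structured values out of text lines.

```python
import re

# Hypothetical records, invented for illustration; not module data.
records = [
    "sample=A17 compound=C6H6 yield=82.5%",
    "sample=B02 compound=C2H6O yield=47.0%",
    "sample=C33 compound=CH4 yield=n/a",
]

# Capture the sample ID, the compound formula, and a numeric percentage yield.
PATTERN = re.compile(r"sample=(\w+) compound=(\w+) yield=(\d+(?:\.\d+)?)%")


def extract_yields(lines):
    """Return {sample: (compound, yield)} for lines that match PATTERN."""
    results = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:  # lines without a numeric yield are skipped
            sample, compound, value = match.groups()
            results[sample] = (compound, float(value))
    return results


yields = extract_yields(records)
```

Here the third record is skipped because its yield is not numeric, so only the first two samples appear in `yields`.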

Teaching Methods

Delivery type                         Number   Length hours   Student hours
On-line Learning                      18       1              18
Workshop                              6        2              12
Independent online learning hours                             30
Private study hours                                           70
Total Contact hours                                           30
Total hours (100hr per 10 credits)                            130

Private study

Lectures are used only to introduce the course and the assessment.
Online courses and examples enable development of the technical skills for data analysis, e.g. the basics of Python programming.
Self-study with self-taught examples and tests. These tools will then support the exercises in the workshops.

Opportunities for Formative Feedback

In the workshop sessions, students will work through guided solutions to the project with a member of staff, who will give feedback on the approach being taken and on any technical issues.

The online learning will include self-help exercises so that students can monitor their own progress.

Methods of Assessment

Coursework
Assessment type                            Notes                 % of formal assessment
Project                                    Programming Project   100
Total percentage (Assessment Coursework)                         100

The project will include a real data analysis exercise framed around a scientific question. Students will have to understand the data provided, determine what data is relevant, extract it using the skills they have acquired, and then present it as a short report. Students required to resit the module will be given a further attempt to complete the project over the summer.

Reading List

There is no reading list for this module

Last updated: 4/29/2024
