2024/25 Taught Postgraduate Module Catalogue

MODL5007M Corpus Linguistics for Translators

15 Credits
Class Size: 30

Module manager: Professor Serge Sharoff
Email: S.Sharoff@leeds.ac.uk

Taught: Semester 1 (Sep to Jan)

Year running 2024/25

This module is not approved as an Elective

Module summary

This module studies how language is used from the perspective of data science. The basis for the study is provided by corpora, i.e. large databanks of texts in natural languages. Corpora are also commonly used to train Generative AI and Machine Translation systems. The aim of the module is to familiarise students with statistical methods for corpus exploration and to equip them with AI literacy skills.

Objectives

The overall goal of the module is to introduce data science for language with a specific focus on translation. The more specific aims are:
- to introduce the basic concepts and methods of data science and how they can be applied to translation;
- to provide practical skills and tools for querying corpora and statistical interpretation of the results;
- to equip students with the background required to interact with Generative AI and Machine Translation tools.

Learning outcomes

On successful completion of the module students will have demonstrated the following learning outcomes relevant to the subject:
LO1 describe basic types of corpora
LO2 understand principles of corpus querying
LO3 know relevant statistical methods
LO4 compare word usage in the source and target languages using parallel and comparable corpora
LO5 design Python scripts to collect and process their own specialised corpora

Skills Learning Outcomes

On successful completion of the module students will have demonstrated the following skills learning outcomes:
SO1. Academic: reflection and critical thinking to understand the nature of language use from a statistical point of view.
SO2. Digital: proficiency in using software to make queries; ability to produce and interpret statistics; understanding the principles of AI.
SO3. Work-Ready: using software; creativity in problem solving; programming and coding ability.
SO4. Technical: understanding the principles of corpus databanks and the principles of IT tools more generally.
SO5. Sustainability: searching for information, skills in identifying potential challenges.
SO6. Enterprise: ability to identify and assess opportunities, applying digital and inter-disciplinary literacies.

Syllabus

Details of the syllabus will be provided on the Minerva organisation (or equivalent) for the module.

Teaching Methods

Delivery type    Number    Length (hours)    Student hours
Lecture          8         1                 8
Practical        4         1                 4
Seminar          8         1                 8

Private study hours: 130
Total contact hours: 20
Total hours (100 hours per 10 credits): 150

Opportunities for Formative Feedback

Regular weekly feedback on progress with the case study.

Methods of Assessment

Coursework
Assessment type    Notes         % of formal assessment
Assignment         Case study    100

Total percentage (Assessment Coursework): 100

Resits will normally be assessed by the same methodology as the first attempt, unless otherwise stated.

Reading List

The reading list is available from the Library website

Last updated: 5/24/2024
