Algorithmic Data Science

Module code: 969G5
Level 7 (Masters)
15 credits in autumn semester
Teaching method: Lecture, Laboratory
Assessment modes: Coursework, Unseen examination

This module teaches the computer science aspects of data science. It focuses on how data is represented and manipulated to achieve good performance on large data sets (more than 10 GB), where standard techniques may no longer apply.

In lectures, you'll learn about data structures, algorithms, and systems, including distributed computing, databases (relational and non-relational), parallel computing, and cloud computing.

In laboratory sessions, you'll develop your Python programming skills, work with a variety of data sets, including large data sets from real-world applications, and investigate how your algorithmic choices affect run time.
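
As a rough illustration of the kind of run-time comparison a laboratory session might involve, the minimal sketch below counts how many query values occur in a collection, first with a Python list and then with a set. The function names, data sizes, and query pattern are hypothetical and chosen only for illustration; they are not taken from the module materials.

```python
import timeit


def count_hits_list(items, queries):
    """Count queries found in a list; each lookup scans the list (O(n))."""
    return sum(1 for q in queries if q in items)


def count_hits_set(items, queries):
    """Count queries found in a set; each lookup is a hash probe (O(1) on average)."""
    lookup = set(items)  # one-off O(n) cost to build the set
    return sum(1 for q in queries if q in lookup)


if __name__ == "__main__":
    n = 5_000  # hypothetical data size, kept small so the script runs quickly
    items = list(range(n))
    queries = list(range(0, 2 * n, 20))  # half of these values are in items

    # Both approaches must agree on the answer before timing means anything.
    assert count_hits_list(items, queries) == count_hits_set(items, queries)

    t_list = timeit.timeit(lambda: count_hits_list(items, queries), number=3)
    t_set = timeit.timeit(lambda: count_hits_set(items, queries), number=3)
    print(f"list: {t_list:.4f} s, set: {t_set:.4f} s")
```

Both functions return the same count, but the list version does work proportional to the data size for every query, so the gap between the two timings widens as `n` grows.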

Module learning outcomes

  • Apply knowledge of standard data structures to the formulation and decomposition of big data.
  • Understand the fundamental issues and challenges of developing parallel distributed algorithms for big data.
  • Evaluate choice of computing model and data representation based on estimation and measurement of impact on space and time complexity and communication performance.
  • Apply appropriate methods to store and retrieve structured big data.