Computing

Algorithmic Data Science

Module code: 969G5
Level 7 (Masters)
15 credits in autumn teaching
Teaching method: Lecture, Laboratory
Assessment modes: Unseen examination, Coursework

In this module, you will learn the computer science aspects of data science.

You will focus in particular on how data are represented and manipulated to achieve good performance on large data sets (larger than 10 GB), where standard algorithmic techniques no longer apply.

In lectures, you will learn about data structures, algorithms and systems, including distributed computing, databases (relational and non-relational), parallel computing, and cloud computing.

In laboratory sessions, you will work with large data sets from real-world applications. This will help you to understand how your algorithmic choices, and different computing models (GPU vs CPU), affect run time.

Module learning outcomes

  • Apply knowledge of standard data structures to the formulation and decomposition of big data.
  • Understand the fundamental issues and challenges of developing parallel distributed algorithms for big data.
  • Evaluate the choice of computing model and data representation by estimating and measuring its impact on space and time complexity and on communication performance.
  • Apply appropriate methods to store and retrieve structured big data.