Data Science

Welcome to the Data Science Research Group

The Data Science Research Group addresses the challenges of deriving knowledge from large heterogeneous datasets, with a particular focus on machine learning and natural language processing.

The group (based in the Department of Informatics) studies methods for gathering large datasets, extracting information from the data, analysing the data in order to find patterns, and visualising the results of this analysis.

In the area of machine learning, we are developing learning methods that address the dynamic evolution of data over long time periods, inconsistencies of information between or within data sources, and the integration of domain-specific knowledge with data. We are applying these methods to image analytics, video analytics and time series analysis.

In the area of natural language processing, we are developing techniques for understanding people’s social media activity, investigating healthcare issues by analysing electronic patient records, acquiring information about word meaning from raw text, and searching and classifying large collections of documents. Also in this area, we are devising computational models to study the ways in which people, particularly people with autism, use language to communicate.

For further details about our activities, please browse the research and projects pages.

Below is a sample of recent research highlights:

Dr Viktoriia Sharmanska (Data Science associate) has been awarded a prestigious Imperial College Research Fellowship. This scheme gives outstanding early-career researchers the opportunity to establish and develop their own research path. Each year only 20 Fellows are selected across all subjects. Dr Sharmanska's 3-year project is on deep understanding of human behaviour from video data.

Dr Novi Quadrianto (Data Science lecturer) is a co-founder of the International Laboratory of Deep Learning and Bayesian Methods at the National Research University Higher School of Economics (HSE), Moscow. An 'international laboratory' is a special structure within HSE in which leading foreign scholars co-direct a lab with Russian scholars. This new 3-year international lab will combine the strengths of Dr Quadrianto’s group and Prof Dmitry Vetrov’s group in innovative basic research into composing Bayesian methods and deep learning models, driven by real-world applications.

Thomas Kober (Data Science PhD student) is the lead author of a paper on data-intensive semantics that has been nominated for the best paper award at a workshop at EACL 2017: