Data Science Research Group

Research

The Data Science Research Group studies methods for gathering large datasets from multiple sources (e.g. sales numbers of products, or web forum posts about politics), extracting information from the data (e.g. forecasting sales numbers, or recognising human intentions), analysing the data in order to find patterns (e.g. establishing people's voting intentions in different areas of the country), and visualising the results of this analysis (e.g. displaying results on a map).

In the area of machine learning, our research aims to develop flexible yet efficient probabilistic learning methods that can take into account diverse statistical features of real-world data. This research is carried out in the SMiLe CLiNiC (Sussex Machine Learning for Computational Linguistics, Network analysis and Computer vision).

In the area of natural language processing, our research focuses on devising accurate, efficient and scalable approaches for computer-based analysis of language, driven by newly emerging application areas which demand language processing that deals with meaning. Areas of current work include:

  • Distributional semantics
  • Social media analysis
  • Clinical text mining
  • Language and communication in autism

For information about our work in these areas, see the research projects pages. We apply this work through consultancy and system development contracts, and we collaborate in many of these areas.