Applications

Event (& Eyewitness) Detection in Online Social Media

This work centres on the automatic, real-time detection of interesting events in short text posts on social media, focussing primarily on Twitter. Events are a very broad concept; this research focusses specifically on detecting impactful events likely to have a significant effect on communities. Examples include both human-caused events, such as terrorist attacks, stabbings, and riots, and natural disasters, such as earthquakes and wildfires. Research aims include: 1) establishing and improving state-of-the-art techniques for automatically detecting an event's occurrence; and 2) identifying, within the set of users discussing an event, those few who are eyewitnesses of the event and who are co-located with its geographic location. Such capabilities are of great value, particularly to crisis responders during emergency events. They make it possible to filter out the vast amount of noise in social media data and to directly access first-hand reporters likely to possess the greatest insight and the most timely information. This in turn can inform emergency response and facilitate more efficient and targeted direction and utilisation of support services. Additionally, the work is of notable benefit to journalists and policy makers, both of whom gain a great competitive advantage from low-latency awareness of new events and from being directed to primary witnesses, whom they can then contact directly.

People: Justin Crow, David Weir and Jeremy Reffin

Exploiting Enriched Representations for Arabic NLP by Embedding More and Ignoring Less

This research concerns enriching representations of Arabic by embedding linguistic characteristics such as diacritics. The underlying strategy is to exploit the linguistic features of Arabic as much as possible across a variety of Arabic NLP tasks and across different NLP architectures.

People: Ahmed Younes, Julie Weeds and David Weir

Cross-lingual Information Retrieval

The Cross-Lingual Information Retrieval research topic lies at the intersection of the fields of Machine Translation and Information Retrieval. The task involves retrieving documents in some target language, given a query in a different (source) language. Our approach involves contextualising source-language documents and using lightweight computational resources.

People: Justina Li, Julie Weeds and David Weir

Accurate and Practical Demographic Inference

This research is concerned with the development of NLP methods applied to noisy, user-generated text, such as that found on social media and online forums. This encompasses a variety of tasks covering the prediction of document and author attributes. So far, this has included inferring the age, location, and interests of the author, as well as the geographical focus and topic relevance of the document as a whole.

People: Chris Inskip, David Weir and Jeremy Reffin