Below is a list of downloadable language processing systems and data that have been produced wholly or in part by members of the group.
- Efficiency in large-scale parsing systems - evaluation methodology and data
- English parser evaluation corpus
- LKB system
- Morphological and orthographic tools for English
- PolyLex lexicons
- RASP system
- Simple Good-Turing probability estimation
- SUSANNE, CHRISTINE and LUCY corpora
- Zagibalov and Carroll sentiment classification corpora
  - Chinese reviews of mobile phones - it168test
    See Zagibalov, T. and J. Carroll (2008) Unsupervised classification of sentiment and objectivity in Chinese text. In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP), Hyderabad, India.
  - Chinese reviews for 10 product types (incorporates it168test) - dataZH
    See Zagibalov, T. and J. Carroll (2008) Automatic seed word selection for unsupervised sentiment classification of Chinese text. In Proceedings of the 22nd International Conference on Computational Linguistics (COLING), Manchester, UK.
  - English and Russian comparable corpora - Russian book reviews, English book reviews
    See Zagibalov, T., K. Belyatskaya and J. Carroll (2010) Comparable English-Russian book review corpora for sentiment analysis. In Proceedings of the 1st Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA), Lisbon, Portugal.