I am a Royal Academy of Engineering Enterprise Fellow and a Borysiewicz Fellow at the Language Technology Lab (LTL), University of Cambridge. My current research interests include natural language processing, information extraction, and text mining applications.
PhD in Computer Science
University of Cambridge
MPhil (Distinction) in Advanced Computer Science
University of Cambridge
MSc (First Class Honours) in Computer Science
University of Auckland
BE (First Class Honours) in Software Engineering
University of Auckland
A corpus of 3661 PubMed abstracts, manually annotated by experts according to a taxonomy describing how toxic chemicals enter the body, and how they can be monitored.
A corpus of 1852 PubMed abstracts, manually annotated by experts according to a taxonomy describing how cancer starts and spreads in the body.
A dataset of 10 large graphs representing co-occurrence of concepts in PubMed abstract sentences. It can be used to evaluate the performance of literature-based discovery (LBD) systems on real-world scientific discoveries by applying ‘time travelling’.
A large dataset of semantic similarity scores for 1888 word pairs in 13 languages, as well as derived cross-lingual scores.
143 pairs of verbs annotated for semantic similarity by 10 annotators.
A corpus of 7,803 sentences annotated with 33,524 relations assigning types to variables appearing in mathematical text.
I currently supervise an introductory undergraduate course in Computational Linguistics.
In the past I co-lectured a course on Biomedical Information Processing at the Department of Computer Science and Technology, and supervised undergraduate students in several courses, including Object-Oriented Programming (year 1), Further Java (year 2), and Software Engineering (year 2).
I also supervise postgraduate research projects (mainly MPhil projects). If you are looking for a research project and are based (or about to start) in Cambridge, feel free to contact me.