Enhancing biomedical word embeddings by retrofitting to verb clusters

Abstract

Verbs play a fundamental role in many biomedical tasks and applications such as relation and event extraction. We hypothesize that performance on many downstream tasks can be improved by aligning the input pretrained embeddings according to semantic verb classes. In this work, we show that retrofitting common pretrained embeddings to semantic verb clusters, drawn from a large lexicon of verb classes derived from biomedical literature, improves their performance in downstream tasks. We present a simple and computationally efficient approach that uses a widely available off-the-shelf retrofitting algorithm to align pretrained embeddings according to these verb clusters. We achieve state-of-the-art results on text classification and relation extraction tasks.
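
To make the idea of retrofitting embeddings to verb clusters concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the graph-based retrofitting update of Faruqui et al. (2015) with uniform weights, and treats every pair of verbs in the same cluster as neighbours. The function name `retrofit`, the toy vectors, and the example clusters are all hypothetical and are not taken from the lexicon described above.

```python
import numpy as np

def retrofit(embeddings, clusters, iterations=10):
    """Pull verbs in the same cluster toward each other.

    embeddings: dict mapping word -> 1-D numpy array (left unmodified).
    clusters:   list of lists of verbs (hypothetical semantic classes).
    Assumed update (uniform weights): each vector becomes the average of
    its original vector and the mean of its neighbours' current vectors.
    """
    # Verbs sharing a cluster are neighbours; out-of-vocabulary verbs are skipped.
    neighbours = {}
    for cluster in clusters:
        members = [w for w in cluster if w in embeddings]
        for w in members:
            neighbours.setdefault(w, set()).update(m for m in members if m != w)

    # Work on float copies so the input embeddings stay untouched.
    new = {w: np.asarray(v, dtype=float).copy() for w, v in embeddings.items()}
    for _ in range(iterations):
        for w, nbrs in neighbours.items():
            if not nbrs:
                continue  # singleton cluster: nothing to align with
            n = len(nbrs)
            nbr_sum = sum(new[m] for m in nbrs)
            new[w] = (n * np.asarray(embeddings[w], dtype=float) + nbr_sum) / (2 * n)
    return new

# Toy data (hypothetical): two verbs in one cluster, one singleton, one non-verb.
toy_embeddings = {
    "activate": np.array([1.0, 0.0]),
    "stimulate": np.array([0.0, 1.0]),
    "inhibit": np.array([-1.0, 0.0]),
    "cell": np.array([0.5, 0.5]),
}
toy_clusters = [["activate", "stimulate"], ["inhibit"]]
retrofitted = retrofit(toy_embeddings, toy_clusters)
```

In this toy setting the vectors for "activate" and "stimulate" move closer together, while words outside any multi-verb cluster ("inhibit", "cell") keep their original vectors. The retrofitted dictionary can then be used in place of the original embeddings as input to a downstream model.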

Publication
Proceedings of the 18th BioNLP Workshop and Shared Task