Carnegie Mellon University


Entity Linking

By Graham Neubig

The Problem

We would like to start tackling the problem of entity linking in low-resource languages. This is a very hard problem: annotated data is scarce, and pre-trained models tend to perform poorly on long-tail phenomena such as named entities.

The Idea

First, leverage multimodal information, such as the surrounding image context, to generate candidate entities and to disambiguate among them. This is useful because, while language is specific, images are (relatively) universal, so we can process them reasonably well even across different languages.
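As a rough illustration of this idea, image context can be scored against candidate entities by embedding both into a shared space (e.g., with a CLIP-style encoder) and ranking candidates by cosine similarity. The sketch below uses small hand-made vectors standing in for real encoder outputs; the entity names and embedding values are hypothetical.

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_candidates(image_emb, entity_embs):
    # Rank candidate entities by similarity between the image embedding
    # and each entity embedding in the (assumed) shared space.
    scores = {name: cosine(image_emb, emb) for name, emb in entity_embs.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy embeddings standing in for outputs of an image/text encoder.
image_emb = [0.9, 0.1, 0.0]
entity_embs = {
    "Pittsburgh (city)": [0.8, 0.2, 0.1],
    "Pittsburgh Penguins": [0.1, 0.9, 0.3],
}
ranking = rank_candidates(image_emb, entity_embs)
```

In a real system the embeddings would come from a multimodal encoder trained on image–text pairs, which is what makes the approach (relatively) language-independent.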

Second, treat entity identification and disambiguation as (visual) question answering problems, making it possible to leverage large language models that can be pre-trained on large amounts of unsupervised data in the target language or from the culture's visual context.
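One simple way to cast disambiguation as question answering is to present the mention, its context, and the candidate entities as a multiple-choice question for a (possibly visual) language model to answer. The helper below is a hypothetical sketch of such prompt construction; the mention and candidates are illustrative.

```python
def disambiguation_question(mention, context, candidates):
    # Frame entity disambiguation as a multiple-choice QA prompt:
    # the model is asked which candidate the mention refers to,
    # given its surrounding context.
    options = "\n".join(
        f"({chr(ord('a') + i)}) {c}" for i, c in enumerate(candidates)
    )
    return (
        f"Context: {context}\n"
        f'Question: In the context above, which entity does "{mention}" refer to?\n'
        f"{options}\n"
        "Answer:"
    )

prompt = disambiguation_question(
    mention="Jaguar",
    context="The Jaguar accelerated down the highway.",
    candidates=["Jaguar (animal)", "Jaguar Cars"],
)
```

The resulting prompt would then be scored or completed by a pre-trained model; for the visual variant, the image would be supplied alongside the text.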

Third, possibly leverage additional context to provide knowledge about the relevant entities. This could include knowledge graphs, which may also be multi-lingual and/or multi-modal, as well as retrieval of relevant context from knowledge sources such as Wikipedia or Google Images. In particular, these methods can be expected to improve robustness to long-tail entities: knowledge graphs have relatively good coverage of such entities, and retrieval-based models tend to perform better on the long tail.
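To make the retrieval step concrete, here is a minimal bag-of-words sketch that fetches the most relevant entry from a tiny hypothetical knowledge source (standing in for, e.g., Wikipedia abstracts). Real systems would use learned dense retrievers, but the interface is the same: a query in, relevant context out.

```python
from collections import Counter
import math

def bow(text):
    # Bag-of-words term counts for a piece of text.
    return Counter(text.lower().split())

def cosine(c1, c2):
    # Cosine similarity between two term-count vectors.
    dot = sum(c1[t] * c2[t] for t in set(c1) & set(c2))
    norm = math.sqrt(sum(v * v for v in c1.values())) * \
           math.sqrt(sum(v * v for v in c2.values()))
    return dot / norm if norm else 0.0

def retrieve(query, knowledge_source, k=1):
    # Return the titles of the k entries most similar to the query.
    q = bow(query)
    scored = sorted(knowledge_source.items(),
                    key=lambda kv: -cosine(q, bow(kv[1])))
    return [title for title, _ in scored[:k]]

# Hypothetical mini knowledge source standing in for Wikipedia abstracts.
kb = {
    "Pittsburgh": "Pittsburgh is a city in Pennsylvania known for its "
                  "bridges and steel industry.",
    "Carnegie Mellon University": "Carnegie Mellon University is a private "
                                  "research university in Pittsburgh.",
}
top = retrieve("a research university in Pittsburgh", kb)
```

The retrieved text can then be fed to the QA-style disambiguation model as additional context, which is where the expected gains on long-tail entities come from.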