ACEtoWiki was created by adding a manual annotation layer that connects the English ACE 2005 corpus to Wikipedia. The non-pronominal mentions in the corpus, i.e. the named (NAM) and nominal (NOM) mentions, have been manually annotated with links to the appropriate Wikipedia articles.
Each mention of type NAM is annotated with a link to the Wikipedia page describing the entity it refers to. For instance, “George Bush” is annotated with a link to the Wikipedia page George_W._Bush. Each NOM mention is annotated with a link to the Wikipedia page that describes its appropriate sense. Note that the object of linking is the textual description of an entity, not the entity itself. Mentions of type NOM can often be linked to more than one Wikipedia page. In such cases, the links are sorted in order of relevance, with the first link corresponding to the most specific sense of the term in its context. For instance, for the NOM mention “President”, which in context identifies United States President George Bush, the following links are selected as appropriate: President_of_the_United_States and President.
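The linking scheme described above can be pictured as a small data structure. The following Python sketch is only an illustration: the class name, field names, and helper method are hypothetical and do not reflect the actual ACEtoWiki distribution format, which is not described here. It only shows that a mention carries a type (NAM or NOM) and a relevance-ordered list of Wikipedia page titles, with the most specific sense first.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mention:
    """Hypothetical in-memory view of one annotated mention (not the real file format)."""
    text: str                 # surface string of the mention in the corpus
    mention_type: str         # "NAM" (named) or "NOM" (nominal)
    links: List[str] = field(default_factory=list)  # page titles, most specific first

    def primary_link(self) -> Optional[str]:
        """Return the most specific (first-ranked) link, or None if there are no links."""
        return self.links[0] if self.links else None


# The two examples given in the text above.
bush = Mention("George Bush", "NAM", ["George_W._Bush"])
president = Mention(
    "President",
    "NOM",
    ["President_of_the_United_States", "President"],
)

print(bush.primary_link())
print(president.primary_link())
```

Keeping the links in an ordered list rather than a set preserves the relevance ranking, so a consumer that needs a single target can simply take the first entry.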
ACEtoWIKI is freely distributed for both academic and commercial purposes, and is licensed under a Creative Commons Attribution 4.0 International License.
The ACE 2005 corpus is distributed by LDC, while the ACEtoWIKI annotation is available for download separately.
Publications or presentations containing results obtained through the use of ACEtoWIKI should cite the following reference:
Luisa Bentivogli, Pamela Forner, Claudio Giuliano, Alessandro Marchetti, Emanuele Pianta, Kateryna Tymoshenko. Extending English ACE 2005 Corpus Annotation with Ground-truth Links to Wikipedia. Proceedings of COLING 2010 Workshop on “The People's Web Meets NLP: Collaboratively Constructed Semantic Resources”, Beijing, China, August 28, 2010.
Contact: Claudio Giuliano