Contrast-Ita Bank

Contrast-Ita Bank is an Italian corpus annotated with discourse contrast relations. We annotate both explicit and implicit contrast relations (CONTRAST and CONCESSION), following the schema proposed in the Penn Discourse Treebank.

Contrast-Ita Bank consists of 169 news stories (65,455 tokens in total). The documents are the same as those of the Fact-Ita Bank corpus, which is annotated with factuality information and, partially, with negation (Fact-Ita Bank-Negation). They were originally selected from the larger Ita-TimeBank corpus, a language resource manually annotated with temporal and event information. The same documents are also part of the I-CAB corpus, a corpus of Italian news annotated with temporal expressions and different types of entities (i.e., persons, organizations, locations, and geo-political entities).

Although these documents already carried other layers of annotation, the annotation of contrast was carried out on raw text. We annotated a total of 372 relations.

Contrast-Ita Bank is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Contributors: Anna Feltracco, Bernardo Magnini, Elisabetta Jezek, Anne-Lyse Minard, Manuela Speranza

Publications or presentations containing research results obtained through the use of Contrast-Ita Bank should cite the following reference:

  • Anna Feltracco, Bernardo Magnini, and Elisabetta Jezek. Contrast-Ita Bank: A corpus for Italian Annotated with Discourse Contrast Relations. To appear in Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017).

To obtain the data, please fill in the request form with your details (they will be stored in a database at FBK): Download Contrast-Ita Bank