The RTE3-ITA dataset is the Italian translation of the English Textual Entailment dataset used in the RTE-3 Challenge.

Like its English counterpart, the Italian RTE-3 dataset is composed of a development set and a test set, each containing 800 T/H pairs. RTE3-ITA has the following characteristics:

  • all T/H pairs were translated into Italian by a professional translator

  • all information related to the English T/H pairs (e.g. length of T, task) was imported into the Italian dataset

  • all the Italian T/H pairs were judged for entailment. In 15 cases the Italian judgment disagrees with the English one. The following T/H pairs have a different entailment judgment in RTE3-ITA than the corresponding English pairs:

* DEV SET: IDs 17, 51, 177, 340, 351, 388, 490, 549, 604, 658, 663

* TEST SET: IDs 43, 495, 496, 663
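
When comparing results across the two languages, it may help to flag the pairs listed above. The following is a minimal sketch in Python; the IDs are copied from the lists above, while the split keys ("dev", "test") and the helper name differs_from_english are illustrative assumptions and not part of the dataset distribution.

```python
# IDs copied from the lists above. The split keys "dev" and "test"
# are an assumption made for this example, not names used by the dataset.
DIVERGENT_IDS = {
    "dev": frozenset({17, 51, 177, 340, 351, 388, 490, 549, 604, 658, 663}),
    "test": frozenset({43, 495, 496, 663}),
}


def differs_from_english(split: str, pair_id: int) -> bool:
    """Return True if the Italian entailment judgment for this pair
    differs from the judgment of the corresponding English pair.

    Hypothetical helper: raises KeyError for an unknown split name.
    """
    return pair_id in DIVERGENT_IDS[split]


# Sanity check: the two lists together should cover the 15 reported cases.
total_divergent = sum(len(ids) for ids in DIVERGENT_IDS.values())
```

Note that ID 663 appears in both lists, so the split must be checked together with the ID.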

The RTE3-ITA dataset is licensed under a Creative Commons Attribution 3.0 Unported License.


Contact: Bernardo Magnini