KRAUTS (Korpus of newspapeR Articles with Underlined Temporal expressionS) is a German temporally annotated news corpus accompanied by TimeML annotation guidelines for German. It was developed at Fondazione Bruno Kessler, Trento, Italy, and at the Max Planck Institute for Informatics, Saarbrücken, Germany. Our goal is to boost temporal tagging research [1] for German.

The corpus is publicly available and is described in:

Jannik Strötgen, Anne-Lyse Minard, Lukas Lange, Manuela Speranza, and Bernardo Magnini. KRAUTS: A German Temporally Annotated News Corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 7-12, 2018.

KRAUTS contains articles from the daily newspaper Dolomiten and from the weekly newspaper Die Zeit.

The annotation guidelines are closely based on the guidelines defined for Italian, i.e., the It-TimeML guidelines [2]. Our Annex to the It-TimeML guidelines contains annotated examples in German and the extensions needed to adapt the It-TimeML guidelines to the specific morpho-syntactic features of German. It is available on the It-TimeML website.

[1] Jannik Strötgen and Michael Gertz: Domain-Sensitive Temporal Tagging, Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers, 2016.

[2] Tommaso Caselli and Rachele Sprugnoli: It-TimeML, TimeML Annotation Guidelines for Italian, version 1.4, 2015.

KRAUTS download website