The ClearTK-TimeML module provides annotators for extracting events, times and temporal relations. This module includes code and pre-built models for the top-ranking temporal relation identification system in TempEval 2013. Roughly, the system identifies events and times with around 80% precision and 75% recall, and identifies temporal relations with around 85% accuracy.
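As a rough single-number summary of the event and time figures above, you can combine precision and recall with the usual balanced F-measure. This is a back-of-the-envelope calculation from the approximate numbers quoted here, not a reported result:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.80 \times 0.75}{0.80 + 0.75}
    = \frac{1.20}{1.55}
    \approx 0.77
```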
The ClearTK-TimeML module also includes some code for other related tasks, such as TempEval 2007, TempEval 2010 and the annotated data of (Bethard et al. 2007), though pre-built models are not included for these tasks, and this code is less actively maintained.
If you just want to run the pre-built models on a piece of text, then the TimeMlAnnotate class is the place to start. This class can be run at the command line, and takes an input file (or directory of input files) and an output directory to which the events, times and temporal relations will be output in TimeML format.
However, most users will probably want to add the annotators to their own pipeline and then work directly with the CAS rather than writing out TimeML documents. The key elements of the TimeML pipeline are listed in the code:
    ...
    TimeAnnotator.FACTORY.getAnnotatorDescription(),
    TimeTypeAnnotator.FACTORY.getAnnotatorDescription(),
    EventAnnotator.FACTORY.getAnnotatorDescription(),
    EventTenseAnnotator.FACTORY.getAnnotatorDescription(),
    EventAspectAnnotator.FACTORY.getAnnotatorDescription(),
    EventClassAnnotator.FACTORY.getAnnotatorDescription(),
    EventPolarityAnnotator.FACTORY.getAnnotatorDescription(),
    EventModalityAnnotator.FACTORY.getAnnotatorDescription(),
    ...
    TemporalLinkEventToDocumentCreationTimeAnnotator.FACTORY.getAnnotatorDescription(),
    TemporalLinkEventToSameSentenceTimeAnnotator.FACTORY.getAnnotatorDescription(),
    TemporalLinkEventToSubordinatedEventAnnotator.FACTORY.getAnnotatorDescription(),
    ...
These annotators will generate Time, Event and TemporalLink annotations in the CAS that you can then use as you like within your pipeline. When creating your own pipelines including these annotators, be sure to check the full pipeline in TimeMlAnnotate, as the models require a variety of pre-processing annotators to provide sentences, tokens, part-of-speech tags, stems, syntactic parses, etc.
As shown in the previous section, the ClearTK-TimeML module provides annotators for various tasks related to identifying and categorizing events, time expressions and temporal relations. All of these are machine learning models trained with a small number of linguistic features. Here is a brief description of the main task performed by each annotator:
TimeAnnotator - Identifies time expressions in the text and adds Time annotations to the CAS.
TimeTypeAnnotator - Sets the type of each Time annotation.
EventAnnotator - Identifies event expressions in the text and adds Event annotations to the CAS.
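EventTenseAnnotator - Sets the tense of each Event annotation.
EventAspectAnnotator - Sets the aspect of each Event annotation.
EventClassAnnotator - Sets the class of each Event annotation.
EventPolarityAnnotator - Sets the polarity of each Event annotation.
EventModalityAnnotator - Sets the modality of each Event annotation.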
TemporalLinkEventToDocumentCreationTimeAnnotator - Identifies temporal relations between Event annotations and the DocumentCreationTime annotation and adds corresponding TemporalLink annotations to the CAS. Note: This annotator requires that the CAS already contain a DocumentCreationTime, which you must supply. The pipeline in TimeMlAnnotate creates an empty (fake) document creation time, but for any real application, you should create a DocumentCreationTime in the CAS based on your file metadata.
TemporalLinkEventToSameSentenceTimeAnnotator - Identifies temporal relations between Event and Time annotations in the same sentence (for some, not all, syntactic relations between events and times) and adds corresponding TemporalLink annotations to the CAS.
TemporalLinkEventToSubordinatedEventAnnotator - Identifies temporal relations between Event annotations where one Event is syntactically dominated by another, and adds corresponding TemporalLink annotations to the CAS.
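
The note above about supplying a real DocumentCreationTime from file metadata can be sketched as follows. This is a minimal, standard-library-only example that assumes the file's last-modified timestamp is an acceptable stand-in for the creation time and that an ISO-8601 calendar date in UTC is the value you want. The class name DocumentCreationTimeValue and its methods are hypothetical and not part of ClearTK; how the resulting value is attached to the DocumentCreationTime annotation in the CAS is library-specific and deliberately not shown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper (not part of ClearTK) that derives a date value from file metadata. */
public class DocumentCreationTimeValue {

    /** Formats an instant as an ISO-8601 calendar date (yyyy-MM-dd), interpreted in UTC. */
    public static String fromInstant(Instant instant) {
        LocalDate date = instant.atZone(ZoneOffset.UTC).toLocalDate();
        return DateTimeFormatter.ISO_LOCAL_DATE.format(date);
    }

    /**
     * Reads the file's last-modified timestamp and formats it as a date.
     * Assumption: last-modified time approximates the document creation time.
     */
    public static String fromLastModified(Path file) throws IOException {
        return fromInstant(Files.getLastModifiedTime(file).toInstant());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: DocumentCreationTimeValue <file>");
            return;
        }
        // Print the date string; copying it onto the DocumentCreationTime
        // annotation in the CAS is left to your own pipeline code.
        System.out.println(fromLastModified(Paths.get(args[0])));
    }
}
```

If your documents carry a more reliable date (for example, a dateline or a metadata field in the source format), prefer that over the file system timestamp, which changes whenever the file is copied or edited.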