Edited by Anton Benz, Manfred Stede and Peter Kühnlein
[Pragmatics & Beyond New Series 223] 2012
pp. 165–182
Automatically identifying coherence relations is known to be difficult for a number of reasons. One of the problems involved has so far received little attention: connectives, the primary source of information for relation analysis, can be complex, i.e. they can consist of several words that are not necessarily adjacent, as in the case of “if ... then”. In this paper, we first offer a classification of complex connectives for German according to their structural properties, which leads us to propose six different groups. We then describe how these structural properties can be captured in a formal, machine-readable (XML-based) “discourse marker lexicon”. Finally, we describe how this lexicon can in turn be employed for automatic local coherence analysis, i.e. the identification of individual relations between adjacent spans of text.
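To make the idea of a machine-readable lexicon for complex connectives more concrete, the following is a minimal sketch in Python. It is not the lexicon format or the analysis procedure of the paper: the XML element and attribute names (`entry`, `part`, `relation`), the relation labels, and the helper functions `load_lexicon`, `find_part`, and `find_connectives` are all assumptions made for illustration. Each entry lists the parts of a connective in order, and the matcher looks for those parts left to right, allowing other tokens in between.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon fragment. The schema is illustrative only and
# is NOT the format described in the paper.
LEXICON_XML = """
<lexicon>
  <entry id="wenn_dann" relation="condition">
    <part>wenn</part>
    <part>dann</part>
  </entry>
  <entry id="weil" relation="cause">
    <part>weil</part>
  </entry>
</lexicon>
"""


def load_lexicon(xml_text):
    """Parse the (hypothetical) XML lexicon into a list of dict entries."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for entry in root.findall("entry"):
        # Each part is stored as a list of lowercase words, so a part
        # may itself consist of several adjacent words.
        parts = [p.text.lower().split() for p in entry.findall("part")]
        entries.append({
            "id": entry.get("id"),
            "relation": entry.get("relation"),
            "parts": parts,
        })
    return entries


def find_part(tokens, part, start):
    """Return the first index >= start where `part` occurs, else None."""
    n = len(part)
    for i in range(start, len(tokens) - n + 1):
        if tokens[i:i + n] == part:
            return i
    return None


def find_connectives(tokens, entries):
    """Find lexicon entries whose parts all occur in order in `tokens`.

    Parts need not be adjacent, which covers discontinuous connectives
    such as "wenn ... dann". Returns (id, relation, positions) tuples.
    """
    lowered = [t.lower() for t in tokens]
    hits = []
    for entry in entries:
        positions = []
        start = 0
        for part in entry["parts"]:
            idx = find_part(lowered, part, start)
            if idx is None:
                break  # one part is missing, so this entry does not match
            positions.append(idx)
            start = idx + len(part)
        else:
            hits.append((entry["id"], entry["relation"], positions))
    return hits


lexicon = load_lexicon(LEXICON_XML)
sentence = "Wenn es regnet , dann bleiben wir zu Hause".split()
print(find_connectives(sentence, lexicon))
```

A greedy left-to-right match like this is deliberately naive: it does not disambiguate connective from non-connective readings of a word, and it does not determine the text spans that a relation holds between, both of which a real local coherence analysis would need.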