Article published in: The Pragmatics of Discourse Coherence: Theories and applications
Edited by Helmut Gruber and Gisela Redeker
[Pragmatics & Beyond New Series 254] 2014
pp. 121–141
Resolving connective ambiguity: A prerequisite for discourse parsing
Automatic discourse parsing is the task of identifying coherence relations in a text and deriving a structural description of it. Discourse parsers can derive much information from surface cues, especially connectives. These lexical signals, however, are ambiguous in two ways: many have additional, non-connective readings, and many connectives can signal more than one coherence relation. In this paper, we discuss the first problem, focusing on English and German: how many connectives are ambiguous, and how frequent are they in the two languages? We then examine computational approaches to resolving such ambiguities. For English, we provide an overview of relevant work by other researchers; for German, we largely present our own studies on the utility of part-of-speech tagging for connective disambiguation.
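To make the idea of using part-of-speech tags for connective disambiguation concrete, the following is a minimal sketch and not the method described in the chapter. It assumes tokens already carry STTS-style tags, and it uses a small hand-written rule table (`CONNECTIVE_READINGS`, a hypothetical name) covering only the German words "da" and "denn". The table entries are simplifying assumptions for illustration, not entries from a published lexicon.

```python
# Toy illustration only: the rule table is a hand-written assumption,
# not taken from the chapter or from any connective lexicon.

# (lowercased token, STTS tag) -> True if this reading is treated as a connective
CONNECTIVE_READINGS = {
    ("da", "KOUS"): True,    # subordinating conjunction, roughly "since/because"
    ("da", "ADV"): False,    # adverb, roughly "there"
    ("denn", "KON"): True,   # coordinating conjunction, roughly "because"
    ("denn", "ADV"): False,  # particle-like use, e.g. in questions
}


def find_connectives(tagged_tokens):
    """Return indices of tokens whose (word, tag) pair is listed as a connective reading."""
    hits = []
    for i, (word, tag) in enumerate(tagged_tokens):
        # Pairs missing from the table are treated as non-connective.
        if CONNECTIVE_READINGS.get((word.lower(), tag), False):
            hits.append(i)
    return hits


# Tags are supplied by hand here; in practice they would come from a tagger.
sentence_a = [("Da", "KOUS"), ("es", "PPER"), ("regnet", "VVFIN"), (",", "$,"),
              ("bleibe", "VVFIN"), ("ich", "PPER"), ("hier", "ADV"), (".", "$.")]
sentence_b = [("Da", "ADV"), ("steht", "VVFIN"), ("ein", "ART"),
              ("Haus", "NN"), (".", "$.")]

print(find_connectives(sentence_a))
print(find_connectives(sentence_b))
```

In this sketch the same surface form "da" is kept or discarded purely on the basis of its tag, so the quality of the result depends entirely on the tagger and on how well tag distinctions line up with the connective/non-connective distinction.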
Published online: 26 November 2014