Annotating dialogue acts in speech data: Problematic issues and basic dialogue act categories

Verdonik, Darinka

doi:10.1075/ijcl.20165.ver

Article published in:

International Journal of Corpus Linguistics
Vol. 28:2 (2023), pp. 144–171

Darinka Verdonik | University of Maribor

The aims of this paper are to detect the most problematic issues related to dialogue act annotation in speech corpora and to define basic categories of dialogue acts. I critically examine and test generic schemes that represent different lines of dialogue act annotation: AMI, DART, ISO 24617-2 and SWBD-DAMSL. It is found that the most problematic issues regarding dialogue act annotation are related to the distinction between the semantic and pragmatic meanings of utterances, the annotation of metadiscourse, and the adequacy and informativeness of the tagset. The identified basic dialogue act categories are information providing, information seeking, actions, social acts and metadiscourse. The findings help improve dialogue act annotation.

Keywords: speech act, communicative function, metadiscourse, dialogue tagset, corpus pragmatics

Article outline

1. Introduction
2. Dialogue act annotation schemes
3. Methodology
- 3.1 Selecting dialogue act annotation schemes
- 3.2 Test data
- 3.3 Annotation process
- 3.4 Analytical procedure
4. Dialogue act annotation
- 4.1 Applicability to a new language
- 4.2 Utterance meaning
- 4.3 Ambiguity
  - 4.3.1 Basic unit
  - 4.3.2 Tags
- 4.4 Adequacy
- 4.5 Informativeness
5. Dialogue act categories
- 5.1 Information-providing acts
- 5.2 Information-seeking acts
- 5.3 Action acts
- 5.4 Social acts
- 5.5 Metadiscourse acts
6. Conclusions
References

Published online: 8 August 2022

https://doi.org/10.1075/ijcl.20165.ver

References (46)

Alexandersson, J., Buschbeck-Wolf, B., Fujinami, T., Maier, E., Reithinger, N., Schmitz, B., & Siegel, M. (1997). Dialogue Acts in VERBMOBIL-2. Report 204. DFKI GmbH, Saarbrücken, Germany. [URL]

Allen, J. F., Schubert, L. K., Ferguson, G., Heeman, P., Hwang, C. H., Kato, T., Light, M., Martin, N. G., Miller, B. W., Poesio, M., & Traum, D. R. (1994). The TRAINS Project: A Case Study in Building a Conversational Planning Agent. TRAINS technical note 94–3. The University of Rochester. [URL]

Allen, J., & Core, M. (1997). Draft of DAMSL: Dialog Act Markup in Several Layers. [URL]

AMI. (2005). Guidelines for Dialogue Act and Addressee Annotation Version 1.0. [URL]

Austin, J. L. (1975). How to Do Things with Words (2nd ed.). Oxford University Press.

Barras, C., Geoffrois, E., Wu, Z., & Liberman, M. (2000). Transcriber: Development and use of a tool for assisting speech corpora production. Speech Communication, 33(1–2), 5–22.

Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Longman.

Bunt, H. (1994). Context and Dialogue Control. Think Quarterly, 31, 19–34.

Bunt, H. (1995). Dynamic interpretation and dialogue theory. In M. M. Taylor, F. Neel, & D. G. Bouwhuis. (Eds.), The Structure of Multimodal Dialogue (pp. 139–188). John Benjamins.

Bunt, H. (2009). The DIT++ taxonomy for functional dialogue markup. In D. Heylen, C. Pelachaud, R. Catizone, & D. Traum. (Eds.), AAMAS 2009 Workshop ‘Towards a Standard Markup Language for Embodied Dialogue Acts’ Proceedings (pp. 13–23). Budapest. [URL]

Bunt, H. (2019). Guidelines for Using ISO Standard 24617-2. [URL]

Bunt, H. C., & Black, W. (2000). The ABC of computational pragmatics. In H. C. Bunt & W. Black. (Eds.), Computational Pragmatics: Abduction, Belief and Context. John Benjamins.

Clark, A., & Popescu-Belis, A. (2004). Multi-level Dialogue Act Tags. In Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004 (pp. 163–170). Association for Computational Linguistics. [URL]

De Felice, R., Darby, J., Fisher, A., & Peplow, D. (2013). A classification scheme for annotating speech acts in a business email corpus. ICAME Journal, 37, 71–105. [URL]

Dhillon, R., Bhagat, S., Carvey, H., & Shriberg, E. (2004). Meeting Recorder Project: Dialog Act Labeling Guide. ICSI Technical Report TR-04-002. [URL]

Di Eugenio, B., Jordan, P. W., & Pylkkänen, L. (1998). The COCONUT Project: Dialogue Annotation Manual. ISP Technical Report 98-1, University of Pittsburgh.

Godfrey, J., & Holliman, E. (1997). Switchboard-1 Release 2. Linguistic Data Consortium. [URL]

Hyland, K. (2005). Metadiscourse: Exploring Interaction in Writing. Continuum.

Irie, Y., Matsubara, S., Kawaguchi, N., Yamaguchi, Y., & Inagaki, Y. (2006). Layered speech-act annotation for spoken dialogue corpus. In LREC 2006 (pp. 1584–1589). [URL]

ISO 24617-2. (2012). ISO DIS 24617-2 Language resource management – Semantic annotation framework (SemAF), Part 2: Dialogue acts. Geneva.

Jurafsky, D. (2004). Pragmatics and computational linguistics. In L. R. Horn & G. Ward. (Eds.), The Handbook of Pragmatics (pp. 578–604). Blackwell.

Jurafsky, D., Shriberg, E., & Biasca, D. (1997). Switchboard SWBD-DAMSL shallow-discourse-function annotation. Coders manual, draft 13. University of Colorado at Boulder & SRI International. [URL]

Kang, S., Kim, H., & Seo, J. (2010). A reliable multidomain model for speech act classification. Pattern Recognition Letters, 31, 71–74.

Kirk, J. M. (2013). Beyond the structural levels of language: An introduction to the SPICE-Ireland corpus and its uses. In J. Cruickshank & R. McColl Millar. (Eds.), After the Storm: Papers from the Forum for Research on the Languages of Scotland and Ulster Triennial Meeting (pp. 207–232). Forum for Research on the Languages of Scotland and Ireland. [URL]

Klein, M. (1999). An overview of the state of the art of coding schemes for dialogue act annotation. Lecture Notes in Computer Science, 1692, 274–279.

Klein, M., Bernsen, N. O., Davies, S., Dybkjær, Garrido, J., Kasch, H., Mengel, A., Pirrelli, V., Poesio, M., Quazza, S., & Soria, C. (1998). MATE Deliverable D1.1: Supported Coding Schemes. 4. Dialogue Acts. [URL]

Leech, G. N. (1980). Explorations in Semantics and Pragmatics. John Benjamins.

Leech, G., & Weisser, M. (2003). Generic speech act annotation for task-oriented dialogues. In D. Archer, P. Rayson, A. Wilson, & T. McEnery. (Eds.), Proceedings of the Corpus Linguistics 2003 Conference. Lancaster University, UCREL Technical Papers, vol. 161. [URL]

Leech, G., Weisser, M., Wilson, A., & Grice, M. (2000). Survey and guidelines for the representation and annotation of dialogue. In D. Gibbon, I. Mertins, & R. Moore. (Eds.), Handbook of Multimodal and Spoken Language Systems (pp. 10–11). Kluwer.

Levinson, S. C. (1983). Pragmatics. Cambridge University Press.

Levinson, S. C. (2017). Speech acts. In Y. Huang. (Ed.), The Oxford Handbook of Pragmatics (pp. 199–216).

McAllister, P. G. (2015). Speech acts: A synchronic perspective. In K. Aijmer & C. Rühlemann. (Eds.), Corpus Pragmatics: A Handbook (pp. 29–51). Cambridge University Press.

Meteer, M. (1995). Dysfluency Annotation Stylebook for the Switchboard Corpus. University of Pennsylvania.

Morris, C. W. (1938). Foundations of the theory of signs. In O. Neurath, R. Carnap, & C. Morris. (Eds.), International Encyclopedia of Unified Science (pp. 77–138). University of Chicago Press.

Park, J., & Kim, Y. (2018). A novel speech-act coding scheme to visualize the intention of crew communications to cope with simulated off-normal conditions of nuclear power plants. Reliability Engineering and System Safety, 178, 236–246.

Qadir, A., & Riloff, E. (2011). Classifying sentences as speech acts in message board posts. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (pp. 748–758). Association for Computational Linguistics. [URL]

Searle, J. R. (1979). Expression and Meaning: Studies in the Theory of Speech Acts. Cambridge University Press.

Vail, A. K., & Boyer, K. E. (2014). Identifying effective moves in tutoring: On the refinement of dialogue act annotation schemes. In S. Trausan-Matu, K. E. Boyer, M. Crosby, & K. Panourgia. (Eds.), ITS 2014, LNCS 8474 (pp. 199–209). Springer.

Verdonik, D., Kosem, I., Zwitter Vitez, A., Krek, S., & Stabej, M. (2013). Compilation, transcription and usage of a reference speech corpus: The case of the Slovene corpus GOS. Language Resources and Evaluation Journal, 47 (4), 1031–1048.

Weisser, M. (2014). Speech act annotation. In K. Aijmer & C. Rühlemann. (Eds.), Corpus Pragmatics: A Handbook (pp. 84–113). Cambridge University Press.

Weisser, M. (2016). DART – The dialogue annotation and research tool. Corpus Linguistics and Linguistic Theory, 12 (2), 355–388.

Weisser, M. (2018). How to Do Corpus Pragmatics on Pragmatically Annotated Data: Speech Acts and Beyond. John Benjamins.

Weisser, M. (2019a). The DART Taxonomy v. 3. [URL]

Weisser, M. (2019b). The DART annotation scheme: Form, applicability & application. Studia Neophilologica, 91 (2), 131–153.

Weisser, M. (2020). Speech acts in corpus pragmatics: Making the case for an extended taxonomy. International Journal of Corpus Linguistics, 25 (4), 400–425.

Zhao, T., & Kawahara, T. (2019). Joint dialog act segmentation and recognition in human conversations using attention to dialog context. Computer Speech & Language, 57, 108–127. [URL]

Cited by (1)

Ciambella, Fabio

2024. Teaching English as a Second Language with Shakespeare,

This list is based on CrossRef data as of 4 July 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.