Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis: The case of apology

Yu, Danni; Li, Luyang; Su, Hang; Fuoli, Matteo

doi:10.1075/ijcl.23087.yu

Article published in:

International Journal of Corpus Linguistics: Online-First Articles


Danni Yu | Beijing Foreign Studies University

Luyang Li | Beijing Foreign Studies University

Hang Su | Sichuan International Studies University

Matteo Fuoli | University of Birmingham

Certain forms of linguistic annotation, such as part-of-speech and semantic tagging, can be automated with high accuracy. However, manual annotation is still necessary for complex pragmatic and discursive features that lack a direct mapping to lexical forms. This manual process is time-consuming and error-prone, which limits the scalability of function-to-form approaches in corpus linguistics. To address this, our study explores the possibility of using large language models (LLMs) to automate pragma-discursive corpus annotation. We compare GPT-3.5 (the model behind the free-to-use version of ChatGPT), GPT-4 (the model underpinning the Precise mode of the Bing chatbot), and a human coder in annotating apology components in English based on the local grammar framework. We find that GPT-4 outperformed GPT-3.5, with accuracy approaching that of a human coder. These results suggest that LLMs can be successfully deployed to aid pragma-discursive corpus annotation, making the process more efficient, scalable, and accessible.

Keywords: corpus pragmatics, large language models, pragma-discursive corpus annotation, local grammar, ChatGPT
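The comparison described in the abstract (two models and a human coder judged on the same apology-annotation task) can be illustrated with a minimal Python sketch. This is not the authors' evaluation procedure: the tag names are adapted from the article outline, the example labels are invented toy data, and the use of per-tag precision, recall, and F1 alongside overall accuracy is an assumption made for illustration.

```python
from collections import Counter

# Tag set adapted from the article outline; the spelling of the labels is hypothetical.
TAGS = ["APOLOGISING", "REASON", "APOLOGISER", "APOLOGISEE", "INTENSIFIER", "NO_APOLOGY"]


def accuracy(gold, predicted):
    """Share of items where the annotator's label matches the gold label."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and the same length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def per_tag_scores(gold, predicted):
    """Return {tag: (precision, recall, f1)} for two aligned label sequences."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the annotator proposed tag p where it did not belong
            fn[g] += 1  # the annotator missed gold tag g
    scores = {}
    for tag in TAGS:
        predicted_total = tp[tag] + fp[tag]
        gold_total = tp[tag] + fn[tag]
        precision = tp[tag] / predicted_total if predicted_total else 0.0
        recall = tp[tag] / gold_total if gold_total else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[tag] = (precision, recall, f1)
    return scores


# Invented toy data: one gold-standard sequence and one annotator's output.
gold = ["APOLOGISER", "INTENSIFIER", "APOLOGISING", "REASON", "NO_APOLOGY"]
annotator = ["APOLOGISER", "APOLOGISING", "APOLOGISING", "REASON", "NO_APOLOGY"]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(gold, annotator):.2f}")
    for tag, (p, r, f) in per_tag_scores(gold, annotator).items():
        print(f"{tag:12s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Running the same functions once per annotator (for example GPT-3.5, GPT-4, and the human coder) against a shared gold standard would give directly comparable scores; per-tag figures show which apology components drive any difference.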

Article outline

1. Introduction
2. Corpus annotation: Long-standing challenges, new opportunities
- 2.1 Challenges in automating pragmatic and discourse-level annotation
- 2.2 LLM-assisted corpus annotation
3. Data and methods
- 3.1 The annotation task
- 3.2 Prompt design
- 3.3 Performance evaluation
4. Results
- 4.1 GPT-3.5 versus GPT-4
- 4.2 GPT-4 versus a human annotator
  - 4.2.1 Recognition of no apology
  - 4.2.2 Recognition of apologising
  - 4.2.3 Recognition of reason
  - 4.2.4 Recognition of apologiser
  - 4.2.5 Recognition of apologisee
  - 4.2.6 Recognition of intensifier
- 4.3 Summary of findings
5. Conclusion
Notes
References

Published online: 3 June 2024

https://doi.org/10.1075/ijcl.23087.yu
