Are online news comments like face-to-face conversation?: A multi-dimensional analysis of an emerging register

Ehret, Katharina; Taboada, Maite

doi:10.1075/rs.19012.ehr

Article published in:

Register Studies
Vol. 2:1 (2020), pp. 1–36

Katharina Ehret | Simon Fraser University

Maite Taboada | Simon Fraser University

This article asks whether online news comments resemble face-to-face conversation. Online comments are widely assumed to be a form of “dialogue” and are often referred to as “conversations”. These assumptions, however, lack empirical support. To address the question, we systematically explore register-relevant properties of online news comments using multi-dimensional analysis (MDA) techniques. Specifically, we apply MDA to describe the linguistic features of online comments and to compare them with traditional registers (e.g. face-to-face conversation, academic writing). To this end, we draw on the SFU Opinion and Comments Corpus and the Canadian component of the International Corpus of English. We show that online comments are not like spontaneous conversation but are closer to opinion articles or exams, and that they clearly constitute a written register. Furthermore, they should be described as instances of argumentative, evaluative language.

Keywords: register variation, social media language, corpus linguistics, multi-dimensional analysis
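
The comparison step of an MDA can be illustrated with a minimal sketch. It assumes the factor analysis has already been run and shows only how dimension scores are commonly derived in Biber-style MDA: feature counts are normalised per 1,000 words, standardised to z-scores across all texts, and summed with the sign of their loading. The feature names, counts, loading signs, and register labels below are invented for illustration and are not taken from the article's data or results.

```python
from statistics import mean, pstdev

# Hypothetical toy data: (register, word count, raw feature counts).
TEXTS = [
    ("conversation", 2000, {"first_person_pronouns": 120, "nouns": 300}),
    ("conversation", 1500, {"first_person_pronouns": 90, "nouns": 240}),
    ("comments", 1000, {"first_person_pronouns": 30, "nouns": 230}),
    ("comments", 800, {"first_person_pronouns": 20, "nouns": 200}),
    ("academic", 3000, {"first_person_pronouns": 15, "nouns": 900}),
    ("academic", 2500, {"first_person_pronouns": 10, "nouns": 775}),
]

# Assumed signs of the salient loadings on one illustrative dimension.
LOADINGS = {"first_person_pronouns": +1, "nouns": -1}


def normalise(count, n_words, per=1000):
    """Convert a raw count to a rate per `per` words."""
    return count / n_words * per


def dimension_scores(texts, loadings):
    """Return (register, score) pairs: signed sum of z-scored feature rates."""
    rates = [
        {f: normalise(counts[f], n_words) for f in loadings}
        for _, n_words, counts in texts
    ]
    # Mean and standard deviation of each feature across all texts.
    stats = {}
    for f in loadings:
        values = [r[f] for r in rates]
        stats[f] = (mean(values), pstdev(values))
    scores = []
    for (register, _, _), r in zip(texts, rates):
        score = sum(
            sign * (r[f] - stats[f][0]) / stats[f][1]
            for f, sign in loadings.items()
        )
        scores.append((register, score))
    return scores


def register_means(scores):
    """Average the dimension scores of all texts in each register."""
    by_register = {}
    for register, score in scores:
        by_register.setdefault(register, []).append(score)
    return {register: mean(vals) for register, vals in by_register.items()}


if __name__ == "__main__":
    for register, score in register_means(dimension_scores(TEXTS, LOADINGS)).items():
        print(f"{register}: {score:.2f}")
```

Registers are then compared by their mean score on each dimension. With these invented counts the ordering of registers carries no empirical weight; the sketch only shows the mechanics of the calculation.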

Article outline

1. Introduction
2. Social media language, online news comments and register analysis
3. Data
4. Multi-dimensional analysis
- 4.1 Features and frequencies
- 4.2 Factor analysis
5. Variational dimensions
6. Are online news comments like face-to-face conversation?
7. Concluding remarks
Acknowledgements
Notes
References

Published online: 10 April 2020
