Although the language of the internet has become an increasingly popular research topic, many areas remain understudied and deserve more attention. This study explores register variation within the social media website Reddit using the multi-dimensional approach developed by Douglas Biber. Reddit, the third most popular English-language social media website after the giants Facebook and Twitter, is made up of thousands of user-created ‘subreddits’, subcommunities centered around different topics, where users make posts and comment on them. Having many different communities and topic areas under one roof makes Reddit a particularly fruitful source of research material. In this paper, three register dimensions are extracted from data collected over one month from a group of thirty-seven subreddits: ‘On-line Subjective Production’, ‘Informational Style’ and ‘Instructional Focus’. These dimensions describe register variation within Reddit in meaningful ways and are in line with suggested register universals (Biber 2014).
Berber Sardinha, T. (2014). Comparing internet and pre-internet registers. In T. Berber-Sardinha & M. Veirano-Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 81–105). Amsterdam: John Benjamins.
Biber, D., & Egbert, J. (2015). Using grammatical features for automatic register identification in an unrestricted corpus of documents from the open web. Journal of Research Design and Statistics in Linguistics and Communication Science, 2(1), 3–36.
Biber, D., & Egbert, J. (2016). Register variation on the searchable web: A multi-dimensional analysis. Journal of English Linguistics, 44(2), 95–137.
Biber, D., & Gray, B. (2013). Being specific about historical change: The influence of sub-register. Journal of English Linguistics, 41, 104–134.
Biber, D., & Kurjian, J. (2007). Towards a taxonomy of web registers and text types: A multi-dimensional analysis. In M. Hundt, N. Nesselhauf, & C. Biewer (Eds.), Corpus Linguistics and the Web (pp. 109–132). Amsterdam: Rodopi.
Chandrasekharan, E., Pavalanathan, U., Srinivasan, A., Glynn, A., Eisenstein, J., & Gilbert, E. (2017). You can’t stay here: The effectiveness of Reddit’s 2015 ban through the lens of hate speech. Proceedings of the ACM on Human-Computer Interaction, 1.
Cole, J. R., Ghafurian, M., & Reitter, D. (2017, November 13). Is word adoption a grassroots process? An analysis of Reddit communities. In D. Lee, Y. R. Osgood, & R. Thomson (Eds.), International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (pp. 236–241). Berlin: Springer.
Collot, M., & Belmore, N. (1996). Electronic language: A new variety of English. In S. C. Herring (Ed.), Computer-mediated communication (pp. 13–28). Amsterdam/Philadelphia: John Benjamins.
Conrad, S., & Biber, D. (Eds.). (2001). Variation in English: Multi-dimensional studies. Harlow: Pearson Education.
Coscia, M. (2018). Popularity spikes hurt future chances for viral propagation of protomemes. Communications of the ACM, 61(1), 70–77.
Davies, M. (2016). Corpus of Online Registers of English (CORE). Available from <[URL]>
De Choudhury, M., & De, S. (2015). Mental health discourse on Reddit: Self-disclosure, social support, and anonymity. In Proceedings of the 8th International Conference on Weblogs and Social Media, ICWSM 2014 (pp. 71–80).
Egbert, J., Biber, D., & Davies, M. (2015). Developing a bottom-up, user-based method of web register classification. Journal of the Association for Information Science and Technology, 66(9), 1817–1831.
Eisenstein, J. (2013). What to do about bad language on the internet. In Proceedings of the North American chapter of the Association for Computational Linguistics (NAACL) 2013 (pp. 359–369).
Finlay, S. C. (2014). Age and gender in Reddit commenting and success. Journal of Information Science Theory and Practice, 2(3), 18–28.
Friginal, E. (2013). Twenty-five years of Biber’s multi-dimensional analysis [Special Issue]. Corpora, 8(2).
Gkotsis, G., Oellrich, A., Hubbard, T., & Dobson, R. (2016). The language of mental health problems in social media. In Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology (pp. 63–73). Stroudsburg, PA: Association for Computational Linguistics.
Grieve, J., Biber, D., Friginal, E., & Nekrasova, T. (2011). Variation among blog text types: A multi-dimensional analysis. In A. Mehler, S. Sharoff, & M. Santini (Eds.), Genres on the web: Corpus studies and computational models (pp. 302–322). New York, NY: Springer.
Haralabopoulos, G., Anagnostopoulos, I., & Zeadally, S. (2015). Lifespan and propagation of information in on-line social networks: A case study based on Reddit. Journal of Network and Computer Applications, 56, 88–100.
Hardy, J., & Friginal, E. (2012). Filipino and American online communication and linguistic variation. World Englishes, 31(1), 1–19.
Hess, C. W., Haug, H. T., & Landry, R. G. (1989). The reliability of type-token ratios for the oral language of school age children. Journal of Speech and Hearing Research, 32, 536–540.
Hess, C. W., Sefton, K. M., & Landry, R. G. (1986). Sample size and type-token ratios for oral language of preschool children. Journal of Speech and Hearing Research, 29, 129–134.
Huang, Y., Guo, D., Kasakoff, A., & Grieve, J. (2016). Understanding US regional linguistic variation with Twitter data analysis. Computers, Environment and Urban Systems, 59, 244–255.
Jonsson, E. (2015). Conversational writing: A multidimensional study of synchronous and supersynchronous computer-mediated communication. Frankfurt: Peter Lang.
Literat, I., & van den Berg, S. (2017). Buy memes low, sell memes high: vernacular criticism and collective negotiations of value on Reddit’s MemeEconomy. Information, Communication & Society.
Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J., & McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 55–60).
McEwan, B. (2016). Communication of communities: Linguistic signals of online groups. Information, Communication & Society, 19(9), 1233–1249.
Munro, R., & Manning, C. D. (2012). Short message communications: Users, topics, and in-language processing. In ACM DEV ’12 Proceedings of the 2nd ACM Symposium on Computing for Development.
Park, A., & Conway, M. (2018). Examining thematic similarity, difference, and membership in three online mental health communities from Reddit: A text mining and visualization approach. Computers in Human Behavior, 78, 98–112.
Pavalanathan, U., Fitzpatrick, J., Kiesling, S. F., & Eisenstein, J. (2017). A multidimensional lexicon for interpersonal stancetaking. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (pp. 884–895).
Revelle, W. (2017). psych: Procedures for psychological, psychometric, and personality research (Version 1.7.5). Illinois, USA: Northwestern University. Retrieved from <[URL]>
Richterich, A. (2014). ‘Karma, precious karma!’ Karmawhoring on Reddit and the front page’s econometrisation. Journal of Peer Production, 4. Retrieved from <[URL]>
Schnoebelen, T. (2012). Do you smile with your nose? Stylistic variation in Twitter emoticons. University of Pennsylvania Working Papers in Linguistics, 18(2), 115–125.
Singer, P., Ferrara, E., Kooti, F., Strohmaier, M., & Lerman, K. (2016). Evidence of online performance deterioration in user sessions on Reddit. PLoS ONE, 11(8).
Stewart, I., & Eisenstein, J. (2018, February 21). Making “fetch” happen: The influence of social and linguistic context on the success of lexical innovations. arXiv:1709.00345v3 [cs.CL].
Titak, A., & Roberson, A. (2013). Dimensions of web registers: An exploratory multi-dimensional comparison. Corpora, 8(2), 239–271.
Tsou, A. (2016). How does the front page of the internet behave? Readability, emoticon use, and links on Reddit. First Monday, 21(11).
Vickery, J. R. (2014). The curious case of Confession Bear: The reappropriation of online macro-image memes. Information, Communication & Society, 17(3), 301–325.