Chapter published in:
Corpus Approaches to Social Media
Edited by Sofia Rüdiger and Daria Dayter
[Studies in Corpus Linguistics 98] 2020
pp. 111–130
Baumgartner, Jason
n.d. Reddit Comment Corpus. pushshift.io (27 March 2020).
Berber Sardinha, Tony
2014. Comparing internet and pre-internet registers. In Multi-dimensional Analysis, 25 Years on: A Tribute to Douglas Biber [Studies in Corpus Linguistics 60], Tony Berber-Sardinha & Marcia Veirano-Pinto (eds), 81–105. Amsterdam: John Benjamins.
Biber, Douglas
1988. Variation across Speech and Writing. Cambridge: CUP.
1992. The multi-dimensional approach to linguistic analyses of genre variation: An overview of methodology and findings. Computers and the Humanities 26(5–6): 331–345.
1993. Representativeness in corpus design. Literary and Linguistic Computing 8(4): 243–257.
2014. Using multi-dimensional analysis to explore cross-linguistic universals of register variation. Languages in Contrast 14(1): 7–34.
Biber, Douglas & Egbert, Jesse
2016. Register variation on the searchable web: A multi-dimensional analysis. Journal of English Linguistics 44(2): 95–137.
Clarke, Isobelle & Grieve, Jack
2017. Dimensions of abusive language on Twitter. In Proceedings of the First Workshop on Abusive Language Online, Zeerak Waseem, Wendy Hui Kyong, Dirk Hovy & Joel Tetreault (eds), 1–10. Vancouver BC: Association for Computational Linguistics.
2019. Stylistic variation on the Donald Trump Twitter account: A linguistic analysis of tweets posted between 2009 and 2018. PLoS ONE 14(9).
Conrad, Susan & Biber, Douglas
2001. Variation in English: Multi-dimensional Studies. Eastbourne: Pearson Education.
Covington, Michael A. & McFall, Joe D.
2010. Cutting the Gordian Knot: The Moving-Average Type-Token Ratio (MATTR). Journal of Quantitative Linguistics 17(2): 94–100.
Eisenstein, Jacob
2013. What to do about bad language on the internet. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), 359–369.
Francis, W. Nelson & Kučera, Henry
1964. A Standard Corpus of Present-Day Edited American English, for Use with Digital Computers (Brown). Providence, RI: Brown University.
Hess, Carla W., Sefton, Karem M. & Landry, Richard G.
1986. Sample size and type-token ratios for oral language of preschool children. Journal of Speech and Hearing Research 29: 129–134.
Hess, Carla W., Haug, Holly T. & Landry, Richard G.
1989. The reliability of type-token ratios for the oral language of school age children. Journal of Speech and Hearing Research 32: 536–540.
Hiltunen, Turo
2014. Choice of national variety in the English-language Wikipedia. In Texts and Discourses of New Media, Jukka Tyrkkö & Sirpa Leppänen (eds), n.p. Helsinki: VARIENG. http://www.helsinki.fi/varieng/series/volumes/15/hiltunen/ (8 June 2020).
Koizumi, Rie & In’nami, Yo
2012. Effects of text length on lexical diversity measures: Using short texts with less than 200 tokens. System 40(4): 554–564.
Kubát, Miroslav & Milička, Jiří
2013. Vocabulary richness measure in genres. Journal of Quantitative Linguistics 20(4): 339–349.
Liimatta, Aatu
2019. Exploring register variation on Reddit: A multi-dimensional study of language use on a social media website. Register Studies 1(2): 269–295.
Rosen, Aliza
2017. Tweeting made easier. Twitter Blog, 7 November 2017, https://blog.twitter.com/en_us/topics/product/2017/tweetingmadeeasier.html (5 February 2020).
Titak, Ashley & Roberson, Audrey
2013. Dimensions of web registers: An exploratory multi-dimensional comparison. Corpora 8(2): 239–271.
Vitter, Jeffrey Scott
1985. Random sampling with a reservoir. ACM Transactions on Mathematical Software 11(1): 37–57.