Edited by Isobelle Clarke and Jack Grieve
[Register Studies 4:2] 2022
pp. 138–170
This paper introduces an initial text typology of social media posts from a multi-dimensional (MD) perspective. Text types are “[g]roupings of text that are similar in their linguistic form” (Biber 1989: 13). This text typology is based on a new MD analysis of social media messages presented in the paper. The corpus consists of 60,000 social media messages in English compiled from Facebook, Twitter, Instagram, Reddit, Telegram, and YouTube. After the texts were cleaned up, the corpus was tagged with the Biber Tagger and post-processed with the Biber Tag Count. Three dimensions of variation were determined, each representing an underlying parameter of variation. Once the texts were scored on each of the dimensions, a k-means cluster analysis was carried out, and the optimal number of clusters was determined using the Cubic Clustering Criterion statistic. A two-way typology was developed based on the dimensional characteristics of each cluster and on careful qualitative analysis of text samples.
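The clustering step described above can be illustrated with a minimal sketch. It assumes each text has already been reduced to a tuple of three dimension scores; the scores below are invented for illustration and are not taken from the study. The function names (`sq_dist`, `init_centroids`, `kmeans`) are hypothetical, the farthest-point initialization is a simplification chosen to keep the example deterministic, and the Cubic Clustering Criterion is not reproduced here: the number of clusters `k` is simply passed in by hand.

```python
# Illustrative k-means over per-text dimension scores (pure standard library).
# All data and helper names are hypothetical, not the authors' implementation.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length score tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each text goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: each centroid becomes the mean of its member texts.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical (Dimension 1, Dimension 2, Dimension 3) scores for nine texts.
scores = [
    (5.1, -2.0, 0.3), (4.8, -1.7, 0.1), (5.4, -2.2, 0.5),
    (-3.9, 4.2, -0.2), (-4.3, 3.8, 0.0), (-4.0, 4.5, -0.4),
    (0.2, 0.1, 6.0), (-0.1, -0.3, 5.7), (0.4, 0.2, 6.3),
]

labels, centroids = kmeans(scores, k=3)
print(labels)
for c, centroid in enumerate(centroids):
    print(c, [round(v, 2) for v in centroid])
```

Each resulting centroid is the mean dimension score of its cluster, which is the kind of dimensional profile a text type would be interpreted from, alongside qualitative reading of sample texts.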