Chapter published in: Applications of Pattern-driven Methods in Corpus Linguistics
Edited by Joanna Kopaczyk and Jukka Tyrkkö
[Studies in Corpus Linguistics 82] 2018
pp. 277–310
Blogging around the world
Universal and localised patterns in Online Englishes
The borderless nature of blogging raises the question of whether the traditional, regionally defined varieties of English continue to hold true (see Crystal 2011). To establish how similar the language published online without external intervention is around the world, this chapter examines repetitive patterns, or 3-grams, found in blogs in the 583-million-word GloWbE corpus (Davies 2013). The data shows two types of repetitive word sequences: universal, or those that are frequent in all or most of the nineteen geographic locations represented in the corpus, and localised, or those unique to specific regions. We explore multiple ways of approaching the regional distribution of universal and localised 3-grams, such as statistical similarity measures (Jaccard coefficient and hierarchical clustering) and network visualisations. The study addresses three interrelated research issues: (1) the ratio of 3-grams in blogs from various World Englishes, which sheds light on the degree of formulaicity in Web Englishes around the world; (2) the overlaps between various locations in terms of preferred sequences, which may point to local or global standardisation hubs at the level of sentence and text construction; and (3) the status of model-providing varieties for internet communication, especially American English, in view of the most frequent 3-grams from other locations (cf. Mair 2013).
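The Jaccard coefficient named in the abstract can be illustrated with a minimal sketch. It is not the chapter's procedure: the whitespace tokenisation, the top-5 cut-off, the function names, and the three toy samples are all assumptions made for illustration, and none of the text comes from GloWbE. The coefficient itself is the standard one, the size of the intersection of two sets divided by the size of their union.

```python
from collections import Counter
from itertools import combinations


def trigrams(text):
    """Return the word 3-grams of a lower-cased, whitespace-tokenised text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def top_trigrams(text, n):
    """Return the set of the n most frequent 3-grams in a text."""
    counts = Counter(trigrams(text))
    return {gram for gram, _ in counts.most_common(n)}


def jaccard(a, b):
    """Jaccard coefficient: |a & b| / |a | b| (0.0 if both sets are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy samples standing in for three locations (not corpus data).
samples = {
    "A": "i do not know what to do i do not know",
    "B": "i do not think so i do not know",
    "C": "one of the best one of the most",
}

# Assumed cut-off: compare the five most frequent 3-grams per sample.
top = {name: top_trigrams(text, 5) for name, text in samples.items()}

# Pairwise similarity between all samples.
similarity = {
    (x, y): jaccard(top[x], top[y]) for x, y in combinations(sorted(top), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 2))
```

A matrix of such pairwise scores, converted to distances (1 minus the coefficient), is the kind of input a hierarchical clustering routine would take.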
Keywords: World Englishes, blogs, GloWbE, hierarchical clustering, Gephi plot
Published online: 13 March 2018
Ädel, Annelie & Erman, Britt
Biber, Douglas & Barbieri, Federica
Biber, Douglas, Johansson, Stig, Leech, Geoffrey, Conrad, Susan & Finegan, Edward
British National Corpus (BNC XML Edition)
2007 Distributed by Oxford University Computing Services on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/
2014 Corpus of Global Web-Based English: 1.9 billion words from speakers in 20 countries. http://corpus.byu.edu/glowbe/
Davies, Mark & Fuchs, Robert
de Swaan, Abram
Gries, Stefan T. & Mukherjee, Joybrato
Grieve, Jack, Douglas Biber, Eric Friginal & Tatiana Nekrasova
Gupta, Anthea Fraser
Hundt, Marianne & Gut, Ulrike
Internet World Stats. Usage and Population Statistics. http://www.internetworldstats.com/stats.htm (1 June 2017).
Internet Live Stats. http://www.internetlivestats.com (1 June 2017).
Jacomy, Mathieu, Venturini, Tommaso, Heymann, Sebastien & Bastian, Mathieu
Jucker, Andreas H. & Kopaczyk, Joanna
Mukherjee, Joybrato & Gries, Stefan T.
Oliveros, J. C.
2007–2015 Venny. An interactive tool for comparing lists with Venn’s diagrams. http://bioinfogp.cnb.csic.es/tools/venny/index.html
Richardson, Kay, Parry, Katy & Corner, John
Schneider, Gerold & Hundt, Marianne
Traugott, Elisabeth 2008 Grammaticalization, constructions and the incremental development of language: Suggestions from the development of degree modifiers in English. In Variation, Selection, Development. Probing the Evolutionary Model of Language Change, Regine Eckardt, Gerhard Jäger, and Tonjes Veenstra (eds), 219–250. Berlin: Mouton de Gruyter.
Tyrkkö, Jukka, Hickey, Raymond & Marttila, Ville
Warschauer, Mark, Black, Rebecca & Chou, Yen-Lin
Cited by 1 other publication
Weetman, Katharine, Jeremy Dale, Rachel Spencer, Emma Scott & Stephanie Schnurr
This list is based on CrossRef data as of 23 November 2021. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.