In text-linguistic register research, distributions of linguistic features across registers are theorized as having a functional relationship to the situational context. A strength of this approach is its focus on frequencies of linguistic features across texts/registers. Situational variables, by contrast, have not been measured with the same granularity. Only recently have text-linguistic researchers begun to treat situational characteristics as continuous variables that vary between registers, and also across texts within registers. In the current chapter, we discuss the theoretical foundation of this perspective and present two studies of register variation from a continuous situation perspective. For both, we present methods for coding situational variables as continuous as well as key findings facilitated by the continuous situation perspective.
While there is preliminary evidence about the importance of register in linguistic choice-making processes, systematic studies focusing on the interaction between register and language-internal constraints are lacking in variationist linguistics. This contribution sketches an ongoing project in which two well-understood grammatical alternations (dative alternation and future marker alternation) are analysed with variationist methods, focusing on the role of register defined at the intersection of mode (spoken vs written) and formality (formal vs informal). Probabilistic corpus models will be complemented with rating experiments to investigate to what extent they correlate with participants’ ratings, and to illustrate the importance of methodological diversity in investigating usage-based theories of grammar. We present corpus results of a case study on the dative alternation with give.
In Systemic Functional Linguistics (SFL), choices in relation to the initial elements of clauses or ‘Themes’ have been claimed as indicators of register, genre or text type (Vande Kopple 1991, Fries 1995, North 2005). This chapter tests this premise using a large-scale corpus-based analysis of Themes in written Present-day American English. The analysis includes samples from fifteen registers, with different target audiences, communicative purposes and stylometric features. Two major segmental approaches to Theme are tested here: Halliday’s ‘first (ideational) element’ definition and Berry’s (1995) ‘preverbal’ hypothesis, according to which the Theme extends up to either the first ideational element or the verb, respectively. Each of the Themes identified in the corpus according to these definitions is typified according to its syntactic function and systemic-functional (textual, interpersonal, experiential) status. The clustering of registers based on the category Theme reveals the ‘first-element’ approach is a plausible dissimilarity metric for registers, thus demonstrating that SFL Theme may be taken as a predictor of register categorization.
This chapter investigates the relationship between the stylistic context of utterance production and the language user’s regional background as influencing factors in one syntactic alternation, i.e., variation between the double object and the prepositional dative construction. To that end, it zooms in on (1) the competition between stylistic context and regional community regarding dative choice, (2) cross-regional inter-register variation, and (3) register-specific coherence (i.e., intra-register variation). Comparing data from nine varieties of English using corpora that presumably share the same structure (and registers) reveals that community is more important than context, that the effect of register is regionally variable and that registers are largely but not fully coherent. These findings not only stress the variable nature of probabilistic grammars but also point to the importance of regional effects when studying register variation (all scripts at https://osf.io/3djkr/).
This chapter reports an exploration of dimensions of register variation across varieties of English. We analyse 2,844 texts from the Hong Kong, Jamaica and New Zealand components of the International Corpus of English, using its text categorization scheme as a frame of reference. We apply Geometric Multivariate Analysis, an interactive procedure for exploring latent structure in language variation, based on the frequencies of 41 lexico-grammatical features informed by systemic functional register theory. Visual inspection of the distribution of texts across the multidimensional space reveals continuities between groups of texts as well as dimensions of variation that can be related to theoretical register constructs. We also observe differences between the three ICE components (and their text categories) in register space.
Our study differs from previous studies on the relation between modifiers in the noun phrase and register in that we do pairwise comparisons of ten registers based on the frequency of a modifier form relative to the number of nouns in the register. Using effect size to measure the differences between any two registers, we find that the largest interregister differences correspond to distinctions made in previous multi-dimensional studies. Registers with high proportions of premodifiers tend to be informational and differ most from registers that are more ‘oral-involved’. Large differences in postnominal phrases and non-finite clauses correspond to the information-narrative distinction.
The language-educational literature seems to agree on the conversational nature of pop lyrics. Thus, they have been advocated as an authentic, easily accessible and highly motivating resource for EFL learning, especially suitable for introducing conversational features. The present study re-assesses the alleged conversationality of lyrics, which to date has been implicitly assumed rather than empirically tested. It relies on a purpose-built pop lyrics corpus and applies Multidimensional Analysis (MDA) to situate lyrics relative to other spoken and written registers. The MDA indicates that lyrics constitute a specialized register, aligning with both spoken and written registers. This implies that rather than merely representing a convenient EFL resource for the illustration of spoken features, lyrics should be analysed both in their own right and in contrast to other registers, opening avenues for addressing broader issues such as register and language awareness.
The present study explores the relative importance of register in learner writing vis-à-vis learner-internal factors such as first-language background. Using multi-dimensional analysis, the study looks at learner and native-speaker student writing from two registers (argumentative essays and research papers), in comparison to published scientific articles. The results show that while certain differences could be noted across first-language backgrounds, the main differences were found between the registers, underscoring the importance of register as a moderating variable. Specifically, the research papers and the scientific articles were characterized by topic-focussed, factual descriptions, and the argumentative texts by a more personal style. The results thus highlight the importance of taking register into consideration in learner corpus research.
This chapter explores word-based nominalizations in Early Modern English, a crucial period in the expansion of the English vocabulary. Nine Romance and native suffixes are traced in eighteen registers, thus covering a wide variety of registers along the formal-informal and speech-written continua. Findings demonstrate that there is a strong correlation between informal, speech-related registers and a low frequency of nominalizations, although the communicative purpose of particular registers can also have an effect here. Suffix productivity is also addressed, with results showing that the frequency of nominalizations in terms of types and tokens increases over time across registers, except in trial proceedings. However, Romance suffixes are seen as chiefly responsible for this, whereas most native suffixes lose productivity during the period.
By applying data-driven methods based on information theory, this study adds to previous work on the development of the scientific register by measuring the informativity of alternative phrasal structures shown to be involved in change in language use in 20th-century Scientific English. The analysis based on data-driven periodization shows compounds to be distinctive grammatical structures from the 1920s onwards in Proceedings A of the Royal Society of London. Compounds not only increase in frequency, but also show higher informativity than their less dense prepositional counterparts. Results also show that the lower the informativity of particular items, the more alternative, more informationally dense options might be favoured (e.g., of-phrases vs. compounds) – striving for communicative efficiency thus being one force shaping the scientific register.
This chapter explores sub-register variation in newspaper writing in the 19th century using two corpora extracted from the British Library Newspapers database, the most comprehensive collection of national and regional newspapers from the Victorian period. As an ‘agile’ (Hundt & Mair 1999) register, newspaper writing is well suited for tracing language change and investigating the interrelationship between language and culture. Frequency analysis of select linguistic features identifies systematic patterns of variation, which can be linked to the communicative functions of sub-registers. The chapter also critically reflects on the value of the database for corpus-based register analysis, especially on how the findings and interpretations are contingent on what sampling criteria are used and how the notion of (sub-)register is operationalized.