In this paper I discuss how the notion of complexity can be defined and operationalized to serve as a concept in linguistic research domains like typology, historical linguistics and language contact and acquisition studies. Elaborating on earlier work (Kusters 2003), I argue that a relative notion of complexity is to be preferred over an absolute one. With such a substantial notion, I show that possible objections raised against the concept of complexity are not valid. I work this out further for complexity in verbal inflectional morphology. Finally, I demonstrate some intricacies of complexity with examples from variation and change in Quechua varieties.
In this paper, I address theoretical and methodological issues in the cross-linguistic study of grammatical complexity. I identify two different approaches to complexity: the absolute one – complexity as an objective property of the system, and the relative one – complexity as cost/difficulty to language users. I discuss the usability of these approaches in typological studies of complexity. I then address some general problems concerning the comparison of languages in terms of overall complexity, and argue that in typological studies of complexity it is better to focus on specific domains that are comparable across languages. Next, I discuss a few general criteria for measuring complexity. Finally, I address the relationship between complexity and cross-linguistic rarity.
Starting from a view of language as a combinatorial and hierarchically organized system, we assumed that a high syllable complexity would favour a high number of syllable types, which in turn would favour a high number of monosyllables. Relevant cross-linguistic correlations based on Menzerath’s (1954) data on monosyllables in eight Indo-European languages turned out to be statistically significant. A further attempt was made to conceptualize “semantic complexity” and to relate it to complexity in phonology, word formation, and word order. In English, for instance, the tendency to phonological complexity and monosyllabism is associated with a tendency to homonymy and polysemy, to rigid word order and idiomatic speech. The results are explained by complexity trade-offs between rather than within the subsystems of language.
Languages have often been claimed to trade off complexity in one area with simplicity in another. The present paper tests this claim with a complexity metric based on the functional load of different coding strategies (head/dependent marking and word order) that interact in core argument marking. Data from a sample of 50 languages showed that the functional use of word order had a statistically significant inverse dependency with the presence of morphological marking, especially with dependent marking. Most other dependencies were far from statistical significance and in fact provide evidence against the trade-off claim, leading to its rejection as a general all-encompassing principle. Overall, languages seem to adhere more strongly to distinctiveness than to economy.
The question of “linguistic complexity” is interesting and fruitful. Unfortunately, the intuitive meaning of “complexity” is not amenable to formal analysis. This paper discusses some proposed definitions and shows how complexity can be assessed in various frameworks. The results show that, as expected, languages are all about equally “complex,” but further that languages can and do differ reliably in their morphological and syntactic complexities along an intuitive continuum. I focus not only on the mathematical aspects of complexity, but on the psychological ones as well. Any claim about “complexity” is inherently about process, including an implicit description of the underlying cognitive machinery. By comparing different measures, one may better understand human language processing and similarly, understanding psycholinguistics may drive better measures.
How complex are isolating languages? The Compensation Hypothesis suggests that isolating languages make up for simpler morphology with greater complexity in other domains, such as syntax and semantics. This paper provides detailed argumentation against the Compensation Hypothesis. A cross-linguistic experiment measuring the complexity of compositional semantics shows that isolating languages rely more heavily on simple Associational Semantics, in which the interpretation of a combined expression is maximally vague or underdifferentiated, anything having to do with the interpretations of the constituent parts. In addition, it is argued that such vagueness is not necessarily resolved via recourse to context and a more complex pragmatics. Thus, it is concluded that isolating languages may indeed be of greater overall simplicity than their non-isolating counterparts.
Contrary to recent claims that highly analytic or isolating languages are simpler than synthetic languages, in large part due to lack of inflectional affixation in the former, I argue that although isolating Asian languages such as Hmong, Mandarin Chinese, and Thai may be economical in terms of inflection, they exhibit significantly more complex lexical patterns of particular types than more synthetic languages such as Polish and English in like contexts. Evidence includes the use of classifiers, reduplication, compounding, stylized four-part expressions, verb serialization, and other types of what I call “lexical elaboration.” This analysis has implications for the question “What is linguistic complexity?” as well as for the more basic and vexing question of “What is grammar?”
The paper discusses the relationship between cross-linguistic differences in grammatical resources and linguistic complexity. It is claimed that Sirionó (Tupí-Guaraní) lacks syntactic coordination as in English John and Mary are asleep. Instead, Sirionó employs a number of different strategies – the ‘with’ strategy, the list strategy, and the ‘also’ strategy – to make up for this. It is argued that one or more of these strategies may serve as a diachronic source of syntactic coordination. The development of syntactic coordination in a language exemplifies condensation processes in grammaticalization and increases complexity in the sense that a certain type of complex syntactic structure is introduced, and makes it possible to express in one syntactic unit what previously needed two or more.
I have argued in various presentations that it is inherent to natural grammars to maintain a considerable level of complexity over time: simplifications occur, but are counterbalanced by complexifications due to grammaticalization, reanalysis, and new patterns created by phonetic erosion. I argue that only extensive acquisition by adults makes grammars simplify to a significant overall degree. Creoles are the extreme case, but languages like English, Mandarin Chinese, Persian, and Indonesian are less complex than their sister languages to a degree that correlates with their extensive histories of non-native acquisition at certain points on their timelines. In this paper I address a few cases in Indonesia that challenge my stipulation. The grammatical simplicity of Riau Indonesian and the languages of East Timor is due to adult acquisition. Meanwhile, a few completely analytic languages on Flores suggest either that my stipulation must be taken as a tendency, or that we can take the nature of the languages as spurs for investigating sociological disruption in the past.
The paper builds on studies of Hungarian spoken outside Hungary (Fenyvesi (ed.) 2005), which show a change from synthetic to analytic expression in Hungarian in contact. It argues that a parameter of morphological complexity is helpful to account for most morphological changes. With one exception, the changes follow the strategy of replicating use patterns (Heine & Kuteva 2005). Other changes arise by implication of a different typological system adopted by the new varieties of Hungarian (De Groot 2005a). A detailed comparison between Hungarian inside and outside Hungary in terms of linguistic complexity (Dahl 2004) confirms the idea that languages in contact become linguistically more complex. The paper furthermore discusses the interaction between typology, language change by contact, and complexity.
This paper explores the related but distinct issues of linguistic complexity and difficulty, as seen from the viewpoint of an adult learner. Language complexity is seen as an objective property of a system, which could in principle be computed mathematically, while difficulty is grounded in the particular person who experiences the difficulty, involving factors such as the linguistic categories present and the nature of their marking in the learner’s own language. This reasoning will be illustrated with one non-Austronesian language, Kuot, and its three Austronesian neighbours, Nalik, Notsi and Madak, of north-central New Ireland, Papua New Guinea.
We investigate the complexity of nominal plural allomorphy in ten Germanic languages from a contrastive and diachronic perspective. Focusing on one language family allows us to develop multidimensional criteria to measure morphological complexity and to compare different diachronic drifts. We introduce a three-step complexity metric, involving (1) a quantitative step, (2) a qualitative step, and (3) a validation step comparing the results from steps (1) and (2) to actual language use. In this article, we apply the method’s first two steps to the plural allomorphy of our sample languages. Our criteria include for (1) the number of allomorphs and for (2) iconicity in form-meaning relationship, the basis of allomorph assignment, and the direction of determination between stem and suffix. Our approach reveals Faroese as the most complex language and English as the simplest one.
This paper discusses the possibility of quantifying complexity in languages in general, and in creoles in particular. It argues that creoles are indeed different from non-creoles, primarily in being less complex. While this has been argued before, this is the first attempt to prove it through the use of an extensive typological database. It is noteworthy that the differing complexity is not related to the relative lack of morphology in creoles, since they are also simpler than analytical languages. Finally, the parallels between pidgins and creoles (and in particular the fact that languages sociologically intermediate between the two categories are also structurally intermediate) support the increasingly questioned belief that creoles are born out of pidgins.
This paper defines and surveys numeral systems from languages across the world. We define the complexity of a numeral system in some detail and give examples of varying complexity from different languages. The examples are chosen to illustrate the bounds on complexity that actually occur in natural languages and to delineate tricky issues of analysis. Then we contrast the complexity in numeral systems of pidgin/creole languages versus their lexifiers and versus languages generally in the world. It turns out that pidgins/creoles have slightly less complex numeral systems than their lexifiers, but probably still more complex than the world average. However, the conclusions in this respect are limited by gaps in documentation and unsystematic knowledge of the linguistic and social history of alleged pidgin/creole languages.
In this study, we apply the morphosyntactic 4-M model developed by Myers-Scotton and Jake (2000a, b) to data from Kabuverdianu or Cape Verdean Creole Portuguese (CVC), which has been less strongly restructured than so-called prototypical creoles. We focus on nominal plural marking, where CVC presents morphosyntactic configurations similar to those of Brazilian Vernacular Portuguese (BVP) and the Portuguese spoken by the “Tongas”. Previous accounts of CVC and BVP nominal plural marking mention the occurrence of (at least) one inflectional marker per NP. We argue that the reduction of inflectional plural marking in CVC constitutes a case of overall loss of morphosyntactic complexity which is due to CVC having arisen through substantial reduction and restructuring during creolisation and to having shallow time-depth of existence in comparison to older languages, e.g., its lexifier Portuguese. We also argue that 4-M theory may constitute a useful diagnostic tool for the prediction of the configurations of complexity vs. simplification in cases of language reduction.
I examine the ways the minimal lexicon of a pidgin language, Chinook Jargon, gains maximal efficiency when put into use in a contemporary fictional text. The paper first describes the lexicon used from a structural point of view. It then examines the use of multifunctional lexical items in comparison to English. The results of these studies show that (1) there is no bound morphology (neither derivational nor inflectional) in the variety studied and (2) there is much more multifunctionality in the pidgin text than in the English texts. Finally, it is argued that the results show that the lexicon studied can indeed be described as simple and efficient.