Vol. 38:2 (2021), pp. 210–258
Phylogenetic signal in phonotactics
Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach opens the possibility of gaining historical insights from entirely new kinds of linguistic data – in this instance, statistical phonotactics. We extract phonotactic data from 112 Pama-Nyungan vocabularies and apply tests for phylogenetic signal, quantifying the degree to which the data reflect phylogenetic history. We test three datasets: (1) binary variables recording the presence or absence of biphones (two-segment sequences) in a lexicon, (2) frequencies of transitions between segments, and (3) frequencies of transitions between natural sound classes. Australian languages have been characterized as having a high degree of phonotactic homogeneity. Nevertheless, we detect phylogenetic signal in all datasets. Phylogenetic signal is greater in finer-grained frequency data than in binary data, and greatest in natural-class-based data. These results demonstrate the viability of employing a new source of readily extractable data in historical and comparative linguistics.
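To make the three kinds of character concrete, the following is a minimal Python sketch of how they might be extracted from a segmented wordlist. It is an illustration under stated assumptions, not the procedure used in the article: the toy lexicon, the sound-class mapping, and all function names are hypothetical, and the choice to normalize each biphone count by the count of its first segment (a forward transition) is only one plausible reading of "frequencies of transitions".

```python
from collections import Counter

# Hypothetical toy lexicon: each word is a list of segments (not real data).
lexicon = [["m", "a", "n", "a"], ["k", "a", "m", "a"]]

# Hypothetical mapping from segments to natural sound classes.
sound_class = {"m": "nasal", "n": "nasal", "k": "stop", "a": "vowel"}


def biphone_counts(words):
    """Count every adjacent two-segment sequence across the wordlist."""
    counts = Counter()
    for word in words:
        counts.update(zip(word, word[1:]))
    return counts


def binary_characters(counts, inventory):
    """1 if the biphone occurs anywhere in the lexicon, else 0."""
    return {(a, b): int(counts[(a, b)] > 0) for a in inventory for b in inventory}


def forward_transition_frequencies(counts):
    """Share of each biphone among all biphones with the same first member.

    Assumption: this forward normalization is chosen for illustration only.
    """
    totals = Counter()
    for (first, _), n in counts.items():
        totals[first] += n
    return {(a, b): n / totals[a] for (a, b), n in counts.items()}


def class_transition_frequencies(counts, classes):
    """Pool biphone counts by sound class, then normalize as above."""
    pooled = Counter()
    for (a, b), n in counts.items():
        pooled[(classes[a], classes[b])] += n
    return forward_transition_frequencies(pooled)


counts = biphone_counts(lexicon)
inventory = sorted({seg for word in lexicon for seg in word})
binary = binary_characters(counts, inventory)
segment_freqs = forward_transition_frequencies(counts)
class_freqs = class_transition_frequencies(counts, sound_class)
```

In this sketch each biphone or class pair becomes one character per language. The binary version discards how often a sequence occurs, which the frequency versions retain.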
Article outline
- 1. Introduction
- 1.1 Motivations
- 1.2 Phonotactics as a source of historical signal
- 2. Phylogenetic signal
- 3. Materials
- 3.1 Language sample
- 3.2 Wordlists
- 3.3 Reference phylogeny
- 4. Phylogenetic signal in binary phonotactic data
- 4.1 Results for binary phonotactic data
- 4.2 Robustness checks
- 5. Phylogenetic signal in continuous phonotactic data
- 5.1 Robustness checks
- 5.2 Forward transitions versus backward transitions
- 5.3 Normalization of character values
- 6. Phylogenetic signal in natural-class-based characters
- 6.1 Natural-class-based characters versus biphones
- 7. Discussion
- 7.1 Overall robustness
- 7.2 Limitations
- 8. Conclusion
- Acknowledgements
- Author contribution statement
- Supplementary materials
- Notes
- References