Phylogenetic signal in phonotactics
Jayden L. Macklin-Cordes | The University of Queensland
Claire Bowern | Yale University
Erich R. Round | The University of Queensland | University of Surrey | Max Planck Institute for the Science of Human History
Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach
opens the possibility of gaining historical insights from entirely new kinds of linguistic data – in this instance, statistical
phonotactics. We extract phonotactic data from 112 Pama-Nyungan vocabularies and apply tests for phylogenetic signal,
quantifying the degree to which the data reflect phylogenetic history. We test three datasets: (1) binary variables recording the presence
or absence of biphones (two-segment sequences) in a lexicon, (2) frequencies of transitions between segments, and (3)
frequencies of transitions between natural sound classes. Australian languages have been characterized as having a high degree of
phonotactic homogeneity. Nevertheless, we detect phylogenetic signal in all datasets. Phylogenetic signal is greater in finer-grained
frequency data than in binary data, and greatest in natural-class-based data. These results demonstrate the viability of employing a new
source of readily extractable data in historical and comparative linguistics.
Keywords: historical signal, phylogenetic comparative methods, historical linguistics, phonology, comparative linguistics, linguistic phylogenetics, Pama-Nyungan, Australian languages
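As a rough illustration of the three kinds of data described in the abstract, the sketch below derives binary biphone characters, forward transition frequencies, and natural-class-based transition frequencies from a toy wordlist. The lexicon, the sound-class table, and all function names are hypothetical, and estimating a forward transition frequency as the count of a biphone divided by the count of all biphones sharing its first segment is an assumption made here for illustration; the article's own extraction and normalization procedures may differ.

```python
from collections import Counter

# Hypothetical toy lexicon: each word is a list of segments (illustration only).
LEXICON = [
    ["m", "a", "r", "a"],
    ["k", "a", "m", "a"],
    ["m", "a", "k", "u"],
]

# Hypothetical natural-class assignment; not taken from the article.
SOUND_CLASS = {"m": "nasal", "k": "stop", "r": "rhotic", "a": "vowel", "u": "vowel"}


def biphone_counts(words):
    """Count every adjacent two-segment sequence (biphone) in the wordlist."""
    counts = Counter()
    for word in words:
        counts.update(zip(word, word[1:]))
    return counts


def binary_presence(counts, inventory):
    """Dataset 1: 1 if a biphone occurs anywhere in the lexicon, else 0."""
    return {(x, y): int(counts[(x, y)] > 0) for x in inventory for y in inventory}


def forward_transition_frequencies(counts):
    """Datasets 2 and 3: relative frequency of y following x (assumed estimator)."""
    totals = Counter()
    for (first, _), n in counts.items():
        totals[first] += n
    return {(x, y): n / totals[x] for (x, y), n in counts.items()}


def to_classes(words, sound_class):
    """Recode each segment as its natural class before counting transitions."""
    return [[sound_class[seg] for seg in word] for word in words]


inventory = sorted({seg for word in LEXICON for seg in word})
counts = biphone_counts(LEXICON)
binary = binary_presence(counts, inventory)
freqs = forward_transition_frequencies(counts)
class_counts = biphone_counts(to_classes(LEXICON, SOUND_CLASS))
class_freqs = forward_transition_frequencies(class_counts)
```

Computed once per language, each biphone or class pair becomes one character whose values across languages could then be tested for phylogenetic signal against a reference tree. Recoding segments as classes pools counts over many segment pairs, so the class-based characters are far fewer and less sparse than the segment-based ones.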
Article outline
- 1. Introduction
- 1.1 Motivations
- 1.2 Phonotactics as a source of historical signal
- 2. Phylogenetic signal
- 3. Materials
- 3.1 Language sample
- 3.2 Wordlists
- 3.3 Reference phylogeny
- 4. Phylogenetic signal in binary phonotactic data
- 4.1 Results for binary phonotactic data
- 4.2 Robustness checks
- 5. Phylogenetic signal in continuous phonotactic data
- 5.1 Robustness checks
- 5.2 Forward transitions versus backward transitions
- 5.3 Normalization of character values
- 6. Phylogenetic signal in natural-class-based characters
- 6.1 Natural-class-based characters versus biphones
- 7. Discussion
- 7.1 Overall robustness
- 7.2 Limitations
- 8. Conclusion
- Acknowledgements
- Author contribution statement
- Notes
- References
Available under the Creative Commons Attribution (CC BY) 4.0 license.
For any use beyond this license, please contact the publisher at rights@benjamins.nl.
Published online: 02 February 2021
https://doi.org/10.1075/dia.20004.mac