Uralic typology in the light of a new comprehensive dataset
This paper presents the Uralic Areal Typology Online (UraTyp 1.0), a typological dataset of 35
Uralic languages and 360 features, mainly covering morphology, syntax, and phonology. The features come
from two sources: the definitions of 195 features originate from the Grambank (GB) database, developed for typological comparison
of the world's languages, whereas 165 features (UT) were designed specifically to describe the typological variation within the
Uralic language family. We present a series of analyses of the dataset that demonstrate its scope and possibilities. In a
Principal Components Analysis, the complete dataset correctly identifies the main Uralic subgroups, whereas GB data alone are
insufficiently granular to detect this family-internal structure. Similar analyses limited to individual typological subdomains
give variable results. A model-based admixture analysis identifies four distinct areas of historical interaction: Saami, Finnic,
the Volga area, and Ob-Ugric.
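To make the clustering step concrete, the following is a minimal sketch of a principal components analysis on a binary language-by-feature matrix. It is not the authors' pipeline: the function name `pca_scores`, the toy matrix, the two group labels, and the column-mean imputation of missing codes are all assumptions chosen for illustration.

```python
import numpy as np


def pca_scores(features, n_components=2):
    """Project rows (languages) onto the leading principal components.

    `features` is a languages x features matrix of 0/1 codes. Missing
    codes (np.nan) are replaced by the column mean, a simple imputation
    assumed for this sketch only.
    """
    x = np.asarray(features, dtype=float)
    col_means = np.nanmean(x, axis=0)
    x = np.where(np.isnan(x), col_means, x)
    centered = x - x.mean(axis=0)
    # SVD of the centered matrix: the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical toy data: six languages, five binary features,
# two invented groups (A and B) with a few missing codes.
toy = np.array([
    [1, 1, 0, 0, 1],        # A1
    [1, 1, 0, 0, 0],        # A2
    [1, np.nan, 0, 0, 1],   # A3
    [0, 0, 1, 1, 1],        # B1
    [0, 0, 1, 1, 0],        # B2
    [0, 0, 1, np.nan, 1],   # B3
])

scores = pca_scores(toy)
for name, row in zip(["A1", "A2", "A3", "B1", "B2", "B3"], scores):
    print(name, np.round(row, 2))
```

In this toy case the first component separates the two invented groups because four of the five features covary with group membership. With fewer or less group-specific features the separation can disappear, which is the kind of granularity effect the abstract reports for the GB features alone.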
Article outline
- 1. Introduction
- 2. UraTyp & Uralic languages in Grambank
- 2.1 Previous systematic documentation of Uralic typological diversity
- 2.2 Creating the UraTyp database
- 2.2.1 Uralic languages in Grambank
- 2.2.2 Defining the UT features
- 2.2.3 Coding the Uralic languages
- 2.2.4 Combining GB and UT data into UraTyp
- 2.3 Data availability
- 3. Statistical analyses of the UraTyp data
- 3.1 Overview of the variation in UraTyp data
- 3.2 Clustering UraTyp and its subsets with PCA
- 3.3 What distinguishes Uralic subfamilies?
- 3.4 Diachronic patterns of typological admixture
- 4. Discussion
- 5. Conclusions and future perspectives
- Acknowledgments
- References