Cluster analysis is an exploratory data analysis technique that encompasses a range of algorithms and methods for sorting objects into groups. It requires the analyst to make choices about dissimilarity measures, grouping algorithms, and so on, and these choices are difficult to make without understanding their theoretical implications and knowing the data very well. This chapter introduces the distance measures and clustering algorithms most commonly used in cluster analytic work. Unlike Baayen (2008), Johnson (2008), and Gries (2009), it aims primarily to equip the researcher with at least a basic understanding of what happens behind the scenes when a dataset is explored with a particular cluster analytic technique.
(1986) Metric and Euclidean properties of dissimilarity coefficients. Journal of Classification, 3(1), 5–48.
Gries, St. Th. (2009) Statistics for linguistics with R: A practical introduction. Berlin: Mouton de Gruyter.
Harnad, S. (2005) To cognize is to categorize: Cognition is categorization. In C. Lefebvre & H. Cohen (Eds.), Handbook on categorization (pp. 19–43). Oxford & London: Elsevier.
Hennig, C. (2010) fpc: Flexible procedures for clustering. R package version 2.0-3. [URL].
Johnson, K. (2008) Quantitative methods in linguistics. New York: Wiley-Blackwell.
Kaufman, L., & Rousseeuw, P.J. (1990) Finding groups in data: An introduction to cluster analysis (Series in Applied Probability and Statistics). New York: Wiley-Blackwell.
Milligan, G.W., & Cooper, M.C. (1985) An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50, 159–179.
R Development Core Team (2008) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. [URL].
Rousseeuw, P.J. (1987) Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20(1), 53–65.
Shaw, D. (1974) Statistical analysis of dialectal boundaries. Computers and the Humanities, 8, 173–177.
Suzuki, R., & Shimodaira, H. An R package for hierarchical clustering with p-values. Retrieved from [URL] [Accessed 25 May 2012].