Article published in:

Corpus Methods for Semantics: Quantitative studies in polysemy and synonymy

Edited by Dylan Glynn and Justyna A. Robinson

[Human Cognitive Processing 43] 2014

pp. 405–441

# Cluster analysis

## Finding structure in linguistic data

**Dagmar Divjak** | University of Sheffield

**Nick Fieller** | University of Sheffield

Cluster analysis is an exploratory data analysis technique that encompasses a number of different algorithms and methods for sorting objects into groups. It requires the analyst to make choices about dissimilarity measures, grouping algorithms, and so on, and these choices are difficult to make without an understanding of their theoretical implications and a very good understanding of the data. This chapter provides an introduction to the distance measures and clustering algorithms most commonly used in cluster analytic work. Unlike Baayen (2008), Johnson (2008) and Gries (2009), its main aim is to equip the researcher with at least a basic understanding of what happens behind the scenes when a dataset is explored with a particular cluster analytic technique.
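To make concrete why the choice of distance measure and grouping algorithm matters, here is a minimal sketch of agglomerative clustering in plain Python. It is not taken from the chapter: the function names (`euclidean`, `manhattan`, `agglomerate`) and the five data points are illustrative assumptions, and the naive merge loop is meant for reading rather than for real datasets.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Straight-line distance between two numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerate(points, k, distance, linkage):
    """Repeatedly merge the two closest clusters until only k remain.

    linkage=min gives single linkage (nearest members),
    linkage=max gives complete linkage (farthest members).
    Returns clusters as sorted lists of indices into points.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        def gap(pair):
            a, b = pair
            return linkage(
                distance(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )

        a, b = min(combinations(range(len(clusters)), 2), key=gap)
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # safe: combinations guarantees a < b
    return sorted(clusters)


# Hypothetical data: five objects strung out along one dimension.
points = [(0, 0), (20, 0), (41, 0), (63, 0), (90, 0)]

single = agglomerate(points, 2, euclidean, min)
complete = agglomerate(points, 2, euclidean, max)
print("single linkage:  ", single)
print("complete linkage:", complete)
```

On these made-up points, single linkage chains the first four objects together and isolates the last one (`[[0, 1, 2, 3], [4]]`), whereas complete linkage splits the same data into `[[0, 1], [2, 3, 4]]`. Passing `manhattan` instead of `euclidean` swaps the distance measure without touching the grouping algorithm, which is the kind of analyst choice the abstract refers to.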

**Keywords:** clustering algorithms, distance measures

Published online: 06 November 2014

https://doi.org/10.1075/hcp.43.16div

## References

Alviar, J.J.

Baayen, R.H.

Backhaus, K., Erichson, B., Plinke, W., & Weiber, R.

Brock, G., Pihur, V., Datta, S., & Datta, S. (2011) clValid: Validation of clustering results. *Journal of Statistical Software*, 25(4), March 2008. R package version 0.6-2. <http://CRAN.R-project.org/package=clValid>.

Divjak, D., & Gries, St. Th.

Everitt, B.S., Landau, S., Leese, M., & Stahl, D.

Gower, J., & Legendre, P.

Gries, St. Th.

Harnad, S.

Hennig, C. (2010) fpc: Flexible procedures for clustering. R package version 2.0-3. <http://CRAN.R-project.org/package=fpc>.

Kaufman, L., & Rousseeuw, P.J.

Milligan, G.W., & Cooper, M.C.

R Development Core Team (2008) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. <http://www.R-project.org>.

Rousseeuw, P.J.

Shaw, D.

Suzuki, R., & Shimodaira, H. An R package for hierarchical clustering with *p*-values. Retrieved from <http://www.is.titech.ac.jp/~shimo/prog/pvclust> [Accessed 25 May 2012].

## Cited by 16 other publications

*No author info given*

Dattner, Elitzur

Desagulier, Guillaume

Desagulier, Guillaume

Ioannou, Georgios

Ioannou, Georgios

Johansson, Marjut & Veronika Laippala

Kifokeris, Dimosthenis & Yiannis Xenidis

Liu, Meili

Proos, Mariann

Torres, Peter Joseph

Vandevoorde, Lore

Vandevoorde, Lore, Els Lefever, Koen Plevoets & Gert De Sutter

Wang, Jiaojiao & Jiangping Zhou

Zhou, Jiangping

This list is based on CrossRef data as of 31 July 2022. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.