Chapter 9
Identification of clusters of lexical areas using geographical factors
A case study in the Occitan language area
We propose a multidimensional statistical analysis procedure that uses projection and clustering methods to identify coherent clusters in a set of lexical areas. The methodology incorporates a geographical factor, such as administrative divisions or land cover features, to help identify these clusters. Applying this method to data from the Occitan language area in the south of France, we identify new spatial patterns and lexical boundaries that do not match traditional dialect boundaries. The method also helps suggest possible explanations for these new patterns.
Article outline
- 1. Context
- 2. Method
- 2.1 Representation space
- 2.2 Barycentric projection
- 2.3 Clustering
- 3. Implementation of the method
- 3.1 Visual exploration
- 3.2 Cluster characterization
- 4. Case study: Occitan
- 5. Conclusion
- Notes
- Bibliography