Part of Language and Text: Data, models, information and applications
Edited by Adam Pawłowski, Jan Mačutek, Sheila Embleton and George Mikros
[Current Issues in Linguistic Theory 356] 2021
pp. 209–224
This paper addresses an issue in the visualization of high-dimensional data abstracted from historical corpora, one whose importance in quantitative and corpus linguistics has not so far been sufficiently appreciated: the possibility that the data is nonlinear. Most applications of data visualization in these fields use linear proximity measures, which ignore nonlinearity and can give misleading results if the data is significantly nonlinear. Topological mapping is a nonlinear approach to visualization, and its application is exemplified here with one particular topological mapping method, the Self-Organizing Map, applied to a small historical text corpus.