Article published in: Historical Linguistics 2007: Selected papers from the 18th International Conference on Historical Linguistics, Montreal, 6–11 August 2007
Edited by Monique Dufresne, Fernande Dupuis and Etleva Vocaj
[Current Issues in Linguistic Theory 308] 2009
pp. 269–284
Visualization, validation and seriation
Application to a corpus of medieval texts
Principal axes methods (such as correspondence analysis [CA]) provide useful visualizations of high-dimensional data sets. In the context of historical textual data, these techniques produce planar maps highlighting the associations between graphemes and texts (paragraphs, chapters, full texts, authors). First, we recall that a simple technique of seriation (re-ordering the rows and columns of a table) is readily derived from the first CA axis. Second, we stress the important role of bootstrap techniques in supporting valid statistical inference in a setting where the classical analytical approach is both unrealistic and mathematically complex. A series of medieval French texts (12th–13th centuries), rich in spelling variants, exemplifies the proposed approaches. A free software program is available.
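To illustrate the seriation step, the following is a minimal sketch, not the software mentioned above. It assumes a small table of non-negative counts (for example, graphemes by texts) with no empty row or column, and it obtains the first CA axis by reciprocal averaging, a standard iterative scheme that converges to that axis. The function names `first_axis_scores` and `seriate`, the iteration count and the starting scores are all hypothetical choices made for this example.

```python
import math


def first_axis_scores(table, n_iter=200):
    """Approximate first-CA-axis scores by reciprocal averaging.

    Assumes non-negative counts, no empty row or column, and a table
    that is not of rank one (otherwise the normalisation divides by zero).
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_tot)

    def row_averages(col_scores):
        # Each row score is the count-weighted mean of the column scores.
        return [
            sum(table[i][j] * col_scores[j] for j in range(n_cols)) / row_tot[i]
            for i in range(n_rows)
        ]

    # Arbitrary non-constant starting scores (an assumption of this sketch).
    col_scores = [float(j) for j in range(n_cols)]
    for _ in range(n_iter):
        row_scores = row_averages(col_scores)
        # Each column score is the count-weighted mean of the row scores.
        col_scores = [
            sum(table[i][j] * row_scores[i] for i in range(n_rows)) / col_tot[j]
            for j in range(n_cols)
        ]
        # Remove the trivial constant solution, then rescale to unit
        # weighted variance so the scores neither vanish nor blow up.
        mean = sum(c * s for c, s in zip(col_tot, col_scores)) / total
        col_scores = [s - mean for s in col_scores]
        norm = math.sqrt(sum(c * s * s for c, s in zip(col_tot, col_scores)) / total)
        col_scores = [s / norm for s in col_scores]
    return row_averages(col_scores), col_scores


def seriate(table):
    """Reorder rows and columns by increasing first-axis score."""
    row_scores, col_scores = first_axis_scores(table)
    row_order = sorted(range(len(row_scores)), key=lambda i: row_scores[i])
    col_order = sorted(range(len(col_scores)), key=lambda j: col_scores[j])
    reordered = [[table[i][j] for j in col_order] for i in row_order]
    return reordered, row_order, col_order
```

Applied to a table whose rows and columns have been shuffled, `seriate` returns the table with large counts gathered near the diagonal, together with the row and column orders used. The direction of the axis is arbitrary, so the recovered order may be reversed.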
Published online: 30 November 2009