Vol. 40:2 (2023), pp. 238–285
Large-scale computerized forward reconstruction yields new perspectives in French diachronic phonology
Traditionally, historical phonologists have relied on tedious manual derivations to sequence the sound changes that have shaped the phonological evolution of languages. However, humans are prone to error and cannot efficiently track thousands of parallel derivations. We present computerized forward reconstruction (CFR), in which each etymon is derived in parallel, both as a task with metrics to optimize and as a tool that drastically facilitates inquiry. To this end we present DiaSim, an application which simulates “cascades” of diachronic developments over a language’s lexicon and provides various diagnostics for “debugging” those cascades. We test our method on a Latin-to-French reflex prediction task, using a newly compiled, publicly available dataset, FLLex, consisting of 1368 paired Latin and Modern French forms. We also introduce a second dataset, FLLAPS, which maps 310 reflexes from Latin through five attested intermediate stages up to Modern French, derived from Pope’s (1934) periodic development tables. We present publicly available rule cascades: the baseline BaseCLEF and BaseCLEF* cascades, based on Pope’s (1934) widely cited view of French development, and DiaCLEF, built by incrementally correcting BaseCLEF with the aid of DiaSim’s diagnostics. DiaCLEF outperforms the baselines by large margins, improving raw accuracy on FLLex from 3.2% to 84.9% of etyma, with similarly large improvements for each of FLLAPS’ periods. The changes made to build DiaCLEF took into account only the baseline and DiaSim’s diagnostics, yet they often independently reproduced past work in French diachronic phonology, corroborating both our procedure and past endeavors; we discuss the implications of some of our findings in detail.
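The idea of forward reconstruction as a task with a metric can be illustrated with a minimal sketch. It is not DiaSim's implementation or rule format: the rules, the forms, and the names `TOY_CASCADE`, `TOY_LEXICON`, `derive` and `accuracy` are all hypothetical placeholders, and rewrite rules are approximated here with plain regular expressions.

```python
import re

# Hypothetical toy cascade: an ORDERED list of (pattern, replacement) rules.
# These are invented for illustration; they are not rules from BaseCLEF or DiaCLEF.
TOY_CASCADE = [
    (r"a$", "e"),       # toy rule 1: word-final a > e
    (r"k(?=e)", "s"),   # toy rule 2: k > s before e
]

# Hypothetical (etymon, attested reflex) pairs; not drawn from FLLex or FLLAPS.
TOY_LEXICON = [
    ("paka", "pase"),
    ("tata", "tate"),
    ("kak", "kak"),
    ("ska", "ske"),
]


def derive(etymon, cascade):
    """Apply every rule of the cascade, in order, to a single etymon."""
    form = etymon
    for pattern, replacement in cascade:
        form = re.sub(pattern, replacement, form)
    return form


def accuracy(lexicon, cascade):
    """Share of etyma whose derived form exactly matches the attested reflex."""
    hits = sum(derive(etymon, cascade) == reflex for etymon, reflex in lexicon)
    return hits / len(lexicon)


if __name__ == "__main__":
    print(derive("paka", TOY_CASCADE))                  # rule 1 feeds rule 2
    print(derive("paka", list(reversed(TOY_CASCADE))))  # reversed order: no feeding
    print(accuracy(TOY_LEXICON, TOY_CASCADE))
```

In this toy setting, reversing the two rules changes the outcome for "paka", which is the kind of ordering question a cascade has to settle, and the mismatch on "ska" is the kind of error a diagnostic would flag for inspection.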
Article outline
- 1. Introduction
- 2. Background
- 2.1 French phonological history
- 2.2 Computerized forward reconstruction (CFR)
- 3. Contributions
- 4. Iterative refinement of an analysis using DiaSim
- 5. DiaSim
- 5.1 Transparent mass simulation
- 5.1.1 Performance metrics
- 5.2 Diagnostics
- 5.3 Consistency with longstanding theory
- 6. Datasets
- 6.1 FLLex
- 6.2 FLLAPS
- 7. Rule cascades
- 7.1 BaseCLEF
- 7.2 DiaCLEF
- 8. Results and discussion
- 8.1 A regular account of “sporadic” k-voicing
- 8.2 Major re-orderings
- 8.2.1 Alveolar deaffrication counterfeeding vowel lengthening
- 8.3 Retention of Latin b/v distinction into Gallo-Roman
- 8.4 Classical French grammarians as reliable primary sources?
- 8.4.1 Dating of /rr/ degemination
- 8.4.2 Pre-rhotic lowering: Prescriptivist miracle or prescriptivist error?
- 9. Conclusion
- Acknowledgements
- Notes
- References
https://doi.org/10.1075/dia.20027.mar