Vol. 46:4 (2022), pp. 753–792
Derivation predicting inflection
A quantitative study of the relation between derivational history and inflectional behavior in Latin
In this paper, we investigate the value of derivational information in predicting the inflectional behavior of lexemes. We focus on Latin, for which large-scale data on both inflection and derivation are easily available. We train boosting tree classifiers to predict the inflection class of verbs and nouns with and without different pieces of derivational information. For verbs, we also model inflectional behavior in a word-based fashion, training the same type of classifier to predict wordforms given knowledge of other wordforms of the same lexemes. We find that derivational information is indeed helpful, and document an asymmetry between the beginning and the end of words, in that the final element in a word is highly predictive, while prefixes prove to be uninformative. The results obtained with the word-based methodology also allow for a finer-grained description of the behavior of different pairs of cells.
Article outline
- 1. Introduction
- 2. Data collection and annotation
- 3. Predicting inflection classes
- 3.1 Methodology
- 3.2 Predicting the conjugation of verbs
- 3.3 Predicting the declension of nouns
- 4. A word-based alternative: Predicting verb forms
- 4.1 Rationale
- 4.2 The information-theoretic approach to the PCFP
- 4.3 The PCFP as a classification problem
- 4.4 Results
- 5. Conclusions
- Acknowledgements
- Notes
- Appendix
- References
https://doi.org/10.1075/sl.21002.bon