The hapax / type ratio
An indicator of minimally required sample size in productivity studies?
This article addresses one of the lesser-known productivity measures, namely the hapax / type ratio (HTR). A case study of the Dutch semi-copula raken (“attain”) shows that the HTR more or less stabilizes from a certain sample size onwards. Moreover, this point of stabilization seems to coincide with an increased permanency of the hapaxes: the share of hapaxes that quickly convert to non-hapaxes is smaller than at the beginning of the sampling process. The stabilization of the HTR might therefore be a good indicator of the minimally required sample size in productivity studies, as it suggests that the hapaxes are ‘non-incidental’ from this sample size onwards. However, I did not find a clear link between the onset of the stabilization of the HTR and the extent to which the inventory of types at the top of the frequency distribution is (quasi-)complete.
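To make the two quantities in the abstract concrete, the following is a minimal sketch, not the article's own procedure. It assumes the sample is simply an ordered list of observed types (one entry per token), computes the HTR as the number of hapaxes divided by the number of types at growing sample sizes, and measures what share of the hapaxes found in a smaller sample are still hapaxes in a larger one. The function names, the `step` parameter and the toy data are all hypothetical and chosen for illustration only.

```python
from collections import Counter


def hapax_type_ratio(tokens):
    """Return (hapaxes, types, ratio) for a list of tokens.

    A hapax is a type that occurs exactly once in the sample.
    """
    counts = Counter(tokens)
    types = len(counts)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    ratio = hapaxes / types if types else 0.0
    return hapaxes, types, ratio


def htr_curve(tokens, step):
    """HTR at cumulative sample sizes step, 2*step, ... up to len(tokens)."""
    curve = []
    for n in range(step, len(tokens) + 1, step):
        _, _, ratio = hapax_type_ratio(tokens[:n])
        curve.append((n, ratio))
    return curve


def hapax_retention(tokens, n_small, n_large):
    """Share of hapaxes in the first n_small tokens that are still
    hapaxes in the first n_large tokens (assumes n_small <= n_large)."""
    small = Counter(tokens[:n_small])
    large = Counter(tokens[:n_large])
    hapaxes_small = [t for t, c in small.items() if c == 1]
    if not hapaxes_small:
        return 0.0
    kept = sum(1 for t in hapaxes_small if large[t] == 1)
    return kept / len(hapaxes_small)


# Toy data (hypothetical): each letter stands for one observed type.
toy_tokens = ["a", "b", "a", "c", "d", "a", "b", "e"]

for size, ratio in htr_curve(toy_tokens, step=4):
    print(size, round(ratio, 3))
print(hapax_retention(toy_tokens, n_small=4, n_large=8))
```

On real data one would inspect where the curve flattens and whether hapax retention rises around the same sample size. How "stabilization" is operationalized is left open here, since the abstract does not specify it.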
Article outline
- 1. Introduction
- 2. Quantitative measures gauging linguistic productivity
- 3. Focus on the hapax / type ratio
- 4. A sample-wide view on the shape of the hapax / type ratio
- 5. The case of the Dutch semi-copula raken and its hapax / type ratio
- 5.1 Hapax stability
- 5.2 Completeness of the frequency summit
- 6. Conclusions
- Acknowledgements
- Notes
- References
Cited by (1)
Van den Heede, Margot & Peter Lauwers. 2023. Syntactic productivity under the microscope: the lexical and semantic openness of Dutch minimizing constructions. Folia Linguistica 57:3, pp. 723 ff.
This list is based on CrossRef data as of 4 July 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.