Vol. 27:2 (2022), pp. 166–190
The hapax / type ratio
An indicator of minimally required sample size in productivity studies?
This article addresses one of the lesser-known productivity measures, namely the hapax / type ratio (HTR). Through a case study involving the Dutch semi-copula raken (“attain”), it is shown that the HTR more or less stabilizes from a certain sample size onwards. Moreover, this point of stabilization seems to coincide with an increased permanency of the hapaxes: the share of hapaxes that quickly convert to non-hapaxes is smaller than at the beginning of the sampling process. Therefore, the stabilization of the HTR might be a good indicator of the minimally required sample size in productivity studies, suggesting that the hapaxes are ‘non-incidental’ from this sample size onwards. However, I did not find a clear link between the onset of the stabilization of the HTR and the extent to which the inventory of types accounted for at the top of the frequency distribution is (quasi-)complete.
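As a rough illustration of the measure itself, the HTR of a sample is the number of types occurring exactly once divided by the total number of types. The sketch below computes it for growing prefixes of a sample so that the shape of the curve can be inspected. The function names, the `step` parameter, and the toy token list are assumptions made for this example and are not taken from the article; the article's own criterion for when the HTR counts as stabilized is not reproduced here.

```python
from collections import Counter


def hapax_type_ratio(tokens):
    """Return hapaxes / types for a list of observed items."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


def htr_curve(tokens, step):
    """Return (sample size, HTR) pairs every `step` tokens and at the end."""
    counts = Counter()
    curve = []
    for i, tok in enumerate(tokens, start=1):
        counts[tok] += 1
        if i % step == 0 or i == len(tokens):
            hapaxes = sum(1 for c in counts.values() if c == 1)
            curve.append((i, hapaxes / len(counts)))
    return curve


# Invented toy sample; each letter stands for one type (hypothetical data).
sample = ["a", "b", "a", "c", "d", "a", "b", "e"]
curve = htr_curve(sample, step=4)
```

In this toy sample the full list has five types, three of which are hapaxes. With a real corpus sample, `tokens` would hold one item per attestation, for instance the element filling the open slot of the construction under study, in the order in which the attestations were sampled.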
Article outline
- 1. Introduction
- 2. Quantitative measures gauging linguistic productivity
- 3. Focus on the hapax / type ratio
- 4. A sample-wide view on the shape of the hapax / type ratio
- 5. The case of the Dutch semi-copula raken and its hapax / type ratio
- 5.1 Hapax stability
- 5.2 Completeness of the frequency summit
- 6. Conclusions
- Acknowledgements
- Notes
- References
https://doi.org/10.1075/ijcl.19114.van