Investigating effects of criterial consistency, the diversity dimension, and threshold variation in formulaic language research
Extending the methodological considerations of O’Donnell et al. (2013)
Xiaofei Lu | The Pennsylvania State University
Olesya Kisselev | The Pennsylvania State University
Jungwan Yoon | The Pennsylvania State University
Michael D. Amory | The Pennsylvania State University
O’Donnell et al. (2013) considered four measures of formulaicity and reported that they produced different results concerning the effects of expertise and first/second language status on formulaic sequence usage in academic writing. The current study explores several additional methodological issues using the same dataset from O’Donnell et al. (2013). We first motivate the need for criterial consistency and investigate whether frequency- and association-based measures yield different results when they are both obtained using corpus-internal criteria. The informativeness of the diversity dimension of formulaic sequence use is then gauged by comparing the results of phrase-frame type-token ratio against those of other measures. Finally, we profile formulaic sequence distribution across quartiles of different measures to assess the effect of variable measure thresholds. Our findings highlight the criticality of issues of criterial consistency, formulaic sequence diversity, and threshold variation in formulaic language research.
Keywords: formulaic language, n-gram frequency, mutual information, phrase frames, phrase-frame type-token ratio
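As a rough illustration of the kinds of measures named above, the following minimal Python sketch computes an association score and a diversity score over a tokenized text. It is not the procedure used in the study. The function names (`ngrams`, `ngram_mi`, `pframe_ttr`) are hypothetical, the MI formula is one common generalization of pointwise mutual information to n-grams (log2 of observed over expected probability under independence), and p-frame TTR is assumed here to mean the number of distinct fillers of the variable slot divided by the number of occurrences of the frame.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_mi(tokens, n):
    """Pointwise MI per n-gram type (assumed formula, not the study's):
    log2(observed probability / probability expected under independence)."""
    total = len(tokens)
    unigram = Counter(tokens)
    grams = ngrams(tokens, n)
    counts = Counter(grams)
    n_positions = len(grams)
    scores = {}
    for gram, freq in counts.items():
        p_observed = freq / n_positions
        p_expected = math.prod(unigram[w] / total for w in gram)
        scores[gram] = math.log2(p_observed / p_expected)
    return scores


def pframe_ttr(tokens, n, slot):
    """For each p-frame (an n-gram with position `slot` left open),
    return distinct slot fillers divided by occurrences of the frame."""
    fillers = defaultdict(Counter)
    for gram in ngrams(tokens, n):
        frame = gram[:slot] + ("*",) + gram[slot + 1:]
        fillers[frame][gram[slot]] += 1
    return {frame: len(c) / sum(c.values()) for frame, c in fillers.items()}


# Toy usage on an invented sentence (not data from the study).
tokens = "the results of the study and the results of the test".split()
print(ngram_mi(tokens, 2)[("the", "results")])
print(pframe_ttr(tokens, 4, 3))
```

A frequency-based measure corresponds simply to the counts in `Counter(ngrams(tokens, n))`. Any thresholds applied to these scores, whether corpus-internal or corpus-external, would be set separately and are not modeled here.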
Article outline
- 1. Introduction
- 2. Methodological issues in formulaic sequence identification and extraction
- 3. Motivation for the current study
- 4. Method
- 4.1 Data
- 4.2 Measures
- 4.2.1 N-gram frequency
- 4.2.2 N-gram MI
- 4.2.3 P-frame frequency
- 4.2.4 P-frame TTR
- 4.3 Procedure
- 5. Results
- 5.1 Research question 1: Corpus-internal vs. corpus-external MI thresholds
- 5.2 Research questions 2 and 3: Effects of expertise
- 5.2.1 Frequency-based n-grams
- 5.2.2 MI-defined formulas
- 5.2.3 P-frames
- 5.2.4 P-frame TTR
- 5.3 Research questions 2 and 3: Effects of L1/L2 status
- 5.3.1 Frequency-based n-grams
- 5.3.2 MI-defined formulas
- 5.3.3 P-frames
- 5.3.4 P-frame TTR
- 6. Discussion
- 7. Conclusions
- Acknowledgements
- Notes
- References
Published online: 05 October 2018
https://doi.org/10.1075/ijcl.16086.lu
References
Bannard, C., & Lieven, E.
Biber, D.
Biber, D., Conrad, S., & Cortes, V.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E.
Conklin, K., & Schmitt, N.
Cortes, V.
Durrant, P., & Doherty, A.
Ellis, N. C.
Eskildsen, S. W.
Eskildsen, S. W., & Cadierno, T. (2007) Are recurring multi-word expressions really syntactic freezes? Second language acquisition from the perspective of usage-based linguistics. In M. Nenonen & S. Niemi (Eds.), Collocations and Idioms 1: Papers from the First Nordic Conference on Syntactic Freezes (pp. 86–99). Joensuu: Joensuu University Press.
Evert, S.
Granger, S.
Granger, S., & Meunier, F.
Gries, S., & Wulff, S.
Herbst, T.
Laufer, B., & Nation, P.
Lieven, E., & Tomasello, M.
McEnery, T., & Hardie, A.
Manning, C., & Schütze, H.
McEnery, T., & Wilson, A.
Mel’čuk, I.
O’Donnell, M., Römer, U., & Ellis, N. C.
Paquot, M. B., & Granger, S.
Pawley, A., & Syder, F. H.
Pivovarova, L., Kormacheva, D., & Kopotev, M.
Römer, U.
Römer, U., & O’Donnell, M. B.
Schmitt, N., & Carter, R.
Simpson-Vlach, R., & Ellis, N. C.
Tomasello, M.
Cited by
Cited by 2 other publications
Lu, Xiaofei & Renfen Hu
Pan, Fan, Randi Reppen & Douglas Biber
This list is based on CrossRef data as of 15 April 2022. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.