References (100)
References
Alvarez, R. M., & Heuberger, S. (2022). How (not) to reproduce: Practical considerations to improve research transparency in political science. PS: Political Science & Politics, 55 (1), 149–154. DOI logoGoogle Scholar
Anderson, S. F., & Maxwell, S. E. (2016). There’s more than one way to conduct a replication study: Beyond statistical significance. Psychological Methods, 21 (1), 1–12. DOI logoGoogle Scholar
Andringa, S., & Godfroid, A. (2020). Sampling bias and the problem of generalizability in applied linguistics. Annual Review of Applied Linguistics, 40 1, 134–142. DOI logoGoogle Scholar
Artner, R., Verliefde, T., Steegen, S., Gomes, S., Traets, F., Tuerlinckx, F., & Vanpaemel, W. (2021). The reproducibility of statistical results in psychological research: An investigation using unpublished raw data. Psychological Methods, 26 (5), 527–546. DOI logoGoogle Scholar
Bakker, M., & Wicherts, J. M. (2011). The (mis)reporting of statistical results in psychology journals. Behavior Research Methods, 43 (3), 666–678. DOI logoGoogle Scholar
Barth, D., & Kapatsinski, V. (2017). A multimodel inference approach to categorical variant choice: Construction, priming and frequency effects on the choice between full and contracted forms of am, are and is . Corpus Linguistics and Linguistic Theory, 13 (2), 203–260. DOI logoGoogle Scholar
Belz, A., Agarwal, S., Shimorina, A., & Reiter, E. (2021). A systematic review of reproducibility research in natural language processing. In P. Merlo, J. Tiedemann, & R. Tsarfaty (Eds.), Proceedings of the 16th conference of the European chapter of the Association for Computational Linguistics: Main volume (pp. 381–393). Association for Computational Linguistics. DOI logoGoogle Scholar
Bernaisch, T., Gries, S. Th., & Heller, B. (2022). Theoretical models and statistical modelling of linguistic epicentres. World Englishes, 41 (3), 333–346. DOI logoGoogle Scholar
Biber, D. (1988). Variation across speech and writing. Cambridge University Press. DOI logoGoogle Scholar
Bisang, W. (2011). Variation and reproducibility in linguistics. In P. Siemund (Ed.), Linguistic universals and language variation (pp. 237–263). De Gruyter Mouton. DOI logoGoogle Scholar
BNC Consortium. (2007). British National Corpus (version 3, BNC XML ed.). [URL]
Bollen, K., Cacioppo, J. T., Kaplan, R. M., Krosnick, J. A., & Olds, J. L. (2015). Social, behavioral, and economic sciences perspectives on robust and reliable science. Report of the Subcommittee on Replicability in Science Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences, National Science Foundation.Google Scholar
Brezina, V., McEnery, T., & Wattam, S. (2015). Collocations in context: A new perspective on collocation networks. International Journal of Corpus Linguistics, 20 (2), 139–173. DOI logoGoogle Scholar
Brezina, V., & Meyerhoff, M. (2014). Significant or random?: A critical review of sociolinguistic generalisations based on large corpora. International Journal of Corpus Linguistics, 19 (1), 1–28. DOI logoGoogle Scholar
Brezina, V., & Timperley, M. (2017). How large is the BNC? A proposal for standardised tokenization and word counting. [Conference presentation]. Corpus linguistics conference 2017, Birmingham, UK.
Burch, B., & Egbert, J. (2022a). Confidence intervals for ratios of means applied to corpus-based word frequency classes. Journal of Applied Statistics, 50 (7), 1592–1610. DOI logoGoogle Scholar
(2022b). Word use equivalence and hierarchical word tiers. Journal of Quantitative Linguistics, 30 (1), 104–124. DOI logoGoogle Scholar
Burch, B., Egbert, J., & Biber, D. (2017). Measuring and interpreting lexical dispersion in corpus linguistics. Journal of Research Design and Statistics in Linguistics and Communication Science, 3 (2), 189–216. DOI logoGoogle Scholar
Claerbout, J. F., & Karrenbach, M. (1992). Electronic documents give reproducible research a new meaning. In SEG Technical Program expanded abstracts 1992, (pp. 601–604). DOI logoGoogle Scholar
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49 (12), 997–1003. DOI logoGoogle Scholar
Doyle, P. G. (2003). Replicating corpus linguistics: A corpus-driven investigation of lexical networks in texts [Unpublished PhD thesis]. Lancaster University.
Earp, B. D., & Trafimow, D. (2015). Replication, falsification, and the crisis of confidence in social psychology. Frontiers in Psychology, 6 1, Article 621. DOI logoGoogle Scholar
Egbert, J., & Baker, P. (Eds.). (2021). Using corpus methods to triangulate linguistic analysis. Routledge. DOI logoGoogle Scholar
Egbert, J., & Biber, D. (2019). Incorporating text dispersion into keyword analyses. Corpora, 14 (1), 77–104. DOI logoGoogle Scholar
Egbert, J., Biber, D., & Gray, B. (2022). Designing and evaluating language corpora: A practical framework for corpus representativeness. Cambridge University Press. DOI logoGoogle Scholar
Egbert, J., Burch, B., & Biber, D. (2020). Lexical dispersion and corpus design. International Journal of Corpus Linguistics, 25 (1), 89–115. DOI logoGoogle Scholar
Egbert, J., Larsson, T., & Biber, D. (2020). Doing linguistics with a corpus: Methodological considerations for the everyday user (1st ed.). Cambridge University Press. DOI logoGoogle Scholar
Eubank, N. (2016). Lessons from a decade of replications at the quarterly journal of political science. PS: Political Science & Politics, 49 (02), 273–276. DOI logoGoogle Scholar
Flanagan, J. (2017). Reproducible research: Strategies, tools, and workflows. In T. Hiltunen, J. McVeigh, & T. Säily (Eds.), Big and rich data in English corpus linguistics: Methods and explorations. VARIENG. [URL]
Fletcher, S. C. (2021). How (not) to measure replication. European Journal for Philosophy of Science, 11 (2), 57. DOI logoGoogle Scholar
Fuscone, S., Favre, B., & Prévot, L. (2021). Reproducibility in speech rate convergence experiments. Language Resources and Evaluation, 55 (3), 817–832. DOI logoGoogle Scholar
Gawne, L., & Berez-Kroeker, A. L. (2018). Reflections on reproducible research. In B. McDonnell, A. L. Berez-Kroeker, & G. Holton (Eds.), Reflections on language documentation 20 years after Himmelmann 1998. (pp. 22–32). University of Hawai’i Press. [URL]
Gelman, A., & Loken, E. (2013, November 13). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. [URL]
Gelman, A., & Stern, H. (2006). The difference between “significant” and “not significant” is not itself statistically significant. The American Statistician, 60 (4), 328–331. DOI logoGoogle Scholar
Gervais, W. M. (2021). Practical methodological reform needs good theory. Perspectives on Psychological Science, 16 (4), 827–843. DOI logoGoogle Scholar
Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33 (5), 587–606. DOI logoGoogle Scholar
Gries, S. Th. (2015). The most under-used statistical method in corpus linguistics: Multi-level (and mixed-effects) models. Corpora, 10 (1), 95–125. DOI logoGoogle Scholar
(2020). Analyzing dispersion. In M. Paquot & S. T. Gries (Eds.), A practical handbook of corpus linguistics (pp. 99–118). Springer International Publishing. DOI logoGoogle Scholar
(2021). (Generalized linear) mixed-effects modeling: A learner corpus example. Language Learning, 71 (3), 757–798. DOI logoGoogle Scholar
(2022a). What do (most of) our dispersion measures measure (most)? Dispersion? Journal of Second Language Studies, 5 (2), 171–205. DOI logoGoogle Scholar
(2022b). Toward more careful corpus statistics: Uncertainty estimates for frequencies, dispersions, association measures, and more. Research Methods in Applied Linguistics, 1 (1), Article 100002. DOI logoGoogle Scholar
Gries, S. Th., & Paquot, M. (2020). Writing up a corpus-linguistic paper. In M. Paquot & S. Th. Gries (Eds.), A practical handbook of corpus inguistics (pp. 647–659). Springer International Publishing. DOI logoGoogle Scholar
Hackert, S. (2008). Counting and coding the past: Circumscribing the variable context in quantitative analyses of past inflection. Language Variation and Change, 20 (1), 127–153. DOI logoGoogle Scholar
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., & Stahel, W. A. (1986). Robust statistics: The approach based on influence functions (1st ed.). Wiley. DOI logoGoogle Scholar
Hardwicke, T. E., Bohn, M., MacDonald, K., Hembacher, E., Nuijten, M. B., Peloquin, B. N., deMayo, B. L., Yoon, E. J., & Frank, M. C. (2021). Analytic reproducibility in articles receiving open data badges at the journal Psychological Science: An observational study. R. Soc. Open Sci., 8 1, Article 201494. DOI logoGoogle Scholar
Hardwicke, T. E., Wallach, J. D., Kidwell, M. C., Bendixen, T., Crüwell, S., & Ioannidis, J. P. A. (2020). An empirical assessment of transparency and reproducibility-related research practices in the social sciences (2014–2017). R. Soc. Open Sci., 7 1, Article 190806. DOI logoGoogle Scholar
Hundt, M. (2021). On models and modelling. World Englishes, 40 (3), 298–317. DOI logoGoogle Scholar
In’nami, Y., Mizumoto, A., Plonsky, L., & Koizumi, R. (2022). Promoting computationally reproducible research in applied linguistics: Recommended practices and considerations. Research Methods in Applied Linguistics, 1 (3), Article 1000030. DOI logoGoogle Scholar
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2 (8), e124. DOI logoGoogle Scholar
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23 (5), 524–532. DOI logoGoogle Scholar
Kytö, M., & Smitterberg, E. (2015). Diachronic registers. In D. Biber & R. Reppen (Eds.), The Cambridge handbook of English corpus linguistics (pp. 330–345). Cambridge University Press. DOI logoGoogle Scholar
Laurinavichyute, A., Yadav, H., & Vasishth, S. (2022). Share the code, not just the data: A case study of the reproducibility of articles published in the Journal of Memory and Language under the open data policy. Journal of Memory and Language, 125 1, Article 104332. DOI logoGoogle Scholar
Lee, D. Y. W. (2000). Modelling variation in spoken and written language: The multi-dimensional approach revisited [Unpublished doctoral dissertation]. Lancaster University.
Lundberg, I., Johnson, R., & Stewart, B. M. (2021). What is your estimand? Defining the target quantity connects statistical evidence to theory. American Sociological Review, 86 (3), 532–565. DOI logoGoogle Scholar
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and STAN (2nd ed.). Chapman; Hall/CRC. DOI logoGoogle Scholar
McEnery, T., & Brezina, V. (2022). Fundamental principles of corpus linguistics (1st ed.). Cambridge University Press. DOI logoGoogle Scholar
McEnery, T., & Hardie, A. (2011). Corpus linguistics: Method, theory and practice. Cambridge University Press. DOI logoGoogle Scholar
Meehl, P. E. (1967). Theory-testing in psychology and physics: A methodological paradox. Philosophy of Science, 34 (2), 103–115. DOI logoGoogle Scholar
Mehl, S. (2021). What we talk about when we talk about corpus frequency: The example of polysemous verbs with light and concrete senses. Corpus Linguistics and Linguistic Theory, 17 (1), 223–247. DOI logoGoogle Scholar
National Academies of Sciences, Engineering, and Medicine. (2019). Reproducibility and replicability in science. National Academies Press. DOI logoGoogle Scholar
Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Dreber, A., Fidler, F., Hilgard, J., Kline Struhl, M., Nuijten, M. B., Rohrer, J. M., Romero, F., Scheel, A. M., Scherer, L. D., Schönbrodt, F. D., & Vazire, S. (2022). Replicability, robustness, and reproducibility in psychological science. Annual Review of Psychology, 73 1, 719–748. DOI logoGoogle Scholar
Nuijten, M. B., Hartgerink, C. H. J., van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2016). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 48 1, 1205–1226. DOI logoGoogle Scholar
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349 (6251), aac4716. DOI logoGoogle Scholar
Pedersen, T. (2008). Empiricism is not a matter of faith. Computational Linguistics, 34 (3), 465–470. DOI logoGoogle Scholar
Peikert, A., & Brandmaier, A. M. (2021). A reproducible data analysis workflow with R Markdown, Git, Make, and Docker. Quantitative and Computational Methods in Behavioral Sciences, 1 1, Article e3763. DOI logoGoogle Scholar
Peng, R. D., & Hicks, S. C. (2021). Reproducible research: A retrospective. Annual Review of Public Health, 42 1, 79–93. DOI logoGoogle Scholar
Phillips, M. (1985). Aspects of text structure: An investigation of the lexical organisation of text. North-Holland.Google Scholar
Pietschnig, J., Siegel, M., Eder, J. S. N., & Gittler, G. (2019). Effect declines are systematic, strong, and ubiquitous: A meta-meta-analysis of the decline effect in intelligence research. Frontiers in Psychology, 10 1, Article 2874. DOI logoGoogle Scholar
Porte, G., & McManus, K. (2018). Doing replication research in applied linguistics (1st ed.). Routledge. DOI logoGoogle Scholar
Rastle, K. (2022). Improving reproducibility in the Journal of Memory and Language. Journal of Memory and Language, 126 1, Article 104351. DOI logoGoogle Scholar
Schützler, O., & Schlüter, J. (Eds.). (2022). Data and methods in corpus linguistics: Comparative approaches [Supplemental material]. Cambridge University Press. [URL]. DOI logo
Sönning, L. (2024). Evaluation of keyness metrics: Performance and reliability. Corpus Linguistics and Linguistic Theory, 20 (1), 263–288. DOI logoGoogle Scholar
Sönning, L., & Grafmiller, J. (2024). Seeing the wood for the trees: Predictive margins for random forests. Corpus Linguistics and Linguistic Theory, 20 (1), 153–181. DOI logoGoogle Scholar
Sönning, L., & Krug, M. (2022). Comparing study designs and down-sampling strategies in corpus analysis: The importance of speaker metadata in the BNCs of 1994 and 2014. In O. Schützler & J. Schlüter (Eds.), Data and methods in corpus linguistics: Comparative approaches (pp. 127–160). Cambridge University Press. DOI logoGoogle Scholar
Sönning, L., & Werner, V. (2021a). The replication crisis, scientific revolutions, and linguistics. Linguistics, 59 (5), 1179–1206. DOI logoGoogle Scholar
(Eds.). (2021b). The replication crisis: Implications for linguistics [Special issue]. Linguistics, 59 (5). [URL]
Spence, J. R., & Stanley, D. J. (2016). Prediction interval: What to expect when you’re expecting … a replication. PLOS ONE, 11 (9), Article e0162874. DOI logoGoogle Scholar
Staudte, R. G., & Sheather, S. J. (1990). Robust estimation and testing (1st ed.). Wiley. DOI logoGoogle Scholar
Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing transparency through a multiverse analysis. Perspectives on Psychological Science, 11 (5), 702–712. DOI logoGoogle Scholar
Stefanowitsch, A. (2020). Corpus linguistics: A guide to the methodology. Language Science Press. DOI logoGoogle Scholar
Sterling, T. D. (1959). Publication decisions and their possible effects on inferences drawn from test of significance — or vice versa. Journal of the American Statistical Association, 54 (285), 30–34. DOI logoGoogle Scholar
Stodden, V., Seiler, J., & Ma, Z. (2018). An empirical analysis of journal policy effectiveness for computational reproducibility. Proceedings of the National Academy of Sciences, 115 (11), 2584–2589. DOI logoGoogle Scholar
Stubbs, M. (2001). Words and phrases: Corpus studies of lexical semantics. Blackwell.Google Scholar
Szmrecsanyi, B., Biber, D., Egbert, J., & Franco, K. (2016). Toward more accountability: Modeling ternary genitive variation in Late Modern English. Language Variation and Change, 28 (1), 1–29. DOI logoGoogle Scholar
Trisovic, A., Lau, M. K., Pasquier, T., & Crosas, M. (2022). A large-scale study on research code quality and execution. Scientific Data, 9 (1), 60. DOI logoGoogle Scholar
Vanpaemel, W., Vermorgen, M., Deriemaecker, L., & Storms, G. (2015). Are we wasting a good crisis? The availability of psychological research data after the storm. Collabra, 1 (1), 3. DOI logoGoogle Scholar
Vasishth, S., & Gelman, A. (2021). How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis. Linguistics, 59 (5), 1311–1342. DOI logoGoogle Scholar
Vetter, F. (2021). Issues of corpus comparability and register variation in the International Corpus of English: Theories and computer applications [Doctoral dissertation, Otto-Friedrich-Universität]. DOI logo
Wallis, S. (2017, February 16). The replication crisis: What does it mean for corpus linguistics? corp.ling.stats: statistics for corpus linguistics. [URL]
(2019). Comparing χ 2 tables for separability of distribution and effect: Meta-tests for comparing homogeneity and goodness of fit contingency test outcomes. Journal of Quantitative Linguistics, 26 (4), 330–355. DOI logoGoogle Scholar
(2020). Statistics in corpus linguistics research: A new approach (1st ed.). Routledge. DOI logoGoogle Scholar
(2022). Accurate confidence intervals on Binomial proportions, functions of proportions, algebraic formulae and effect sizes. [URL]
Wallis, S., & Mehl, S. (2022). Comparing baselines for corpus analysis: Research into the get-passive in speech and writing. In O. Schützler & J. Schlüter (Eds.), Data and methods in corpus linguistics: Comparative approaches (1st ed., pp. 101–126). Cambridge University Press. DOI logoGoogle Scholar
Whitaker, K. (2017, September 26). Publishing a reproducible paper [Conference presentation]. Open science in practice summer school, Lausanne, Switzerland. DOI logo
Wieling, M., Rawee, J., & van Noord, G. (2018). Reproducibility in computational linguistics: Are we willing to share? Computational Linguistics, 44 (4), 641–649. DOI logoGoogle Scholar
Wilcox, R. R. (2013). Introduction to robust estimation and hypothesis testing (3rd ed.). Academic Press. DOI logoGoogle Scholar
Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., & Teal, T. K. (2017). Good enough practices in scientific computing. PLOS Computational Biology, 13 (6). Article e1005510. DOI logoGoogle Scholar
Yarkoni, T. (2022). The generalizability crisis. Behavioral and Brain Sciences, 45 (e1). DOI logoGoogle Scholar
Young, C. (2018). Model uncertainty and the crisis in science. Socius: Sociological Research for a Dynamic World, 4 1. DOI logoGoogle Scholar
Young, C., & Holsteen, K. (2017). Model uncertainty and robustness: A computational framework for multimodel analysis. Sociological Methods & Research, 46 (1), 3–40. DOI logoGoogle Scholar