Article published in:
International Journal of Corpus Linguistics
Vol. 24:1 (2019), pp. 98–130

References

Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., & Gildea, D.
(1999) Forms of English function words – effects of disfluencies, turn position, age and sex, and predictability. In J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville & A. C. Bailey (Eds.), Proceedings of ICPhS-99 (pp. 395–398). Berkeley, CA: University of California. Retrieved from https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS1999/papers/p14_0395.pdf (last accessed February 2019).
Van Berkum, J. J., Brown, C. M., Zwitserlood, P., Kooijman, V., & Hagoort, P.
(2005) Anticipating upcoming words in discourse: Evidence from ERPs and reading times. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(3), 443–467.
Biber, D.
(1988) Variation Across Speech and Writing. New York, NY: Cambridge University Press.
(1995) Dimensions of Register Variation: A Cross-linguistic Comparison. New York, NY: Cambridge University Press.
Biber, D., & Conrad, S.
(2009) Register, Genre, and Style. New York, NY: Cambridge University Press.
Chen, S. F., & Goodman, J.
(1999) An empirical study of smoothing techniques for language modeling. Computer Speech & Language, 13(4), 359–393.
Church, K. W., & Gale, W. A.
(1995) Poisson mixtures. Natural Language Engineering, 1(2), 163–190.
Denoual, E.
(2006) A method to quantify corpus similarity and its application to quantifying the degree of literality in a document. International Journal of Technology and Human Interaction, 2(1), 51–66.
Ellis, N. C.
(2002) Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition, 24(2), 143–188.
Frisson, S., Rayner, K., & Pickering, M. J.
(2005) Effects of contextual predictability and transitional probability on eye movements during reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(5), 862–877.
Van Gijsel, S., Speelman, D., & Geeraerts, D.
(2006) Locating lexical richness: A corpus linguistic, sociovariational analysis. In J. M. Viprey (Ed.), Proceedings of the 8th International Conference on the Statistical Analysis of Textual Data (pp. 961–971). Besançon: Presses universitaires de Franche-Comté. Retrieved from http://lexicometrica.univ-paris3.fr/jadt/jadt2006/PDF/II-085.pdf (last accessed February 2019).
Goedertier, W., Goddijn, S. M., & Martens, J. P.
(2000) Orthographic transcription of the Spoken Dutch Corpus. In N. Calzolari, G. Carayannis, K. Choukri, H. Höge, B. Maegaard, J. Mariani, & A. Zampolli (Eds.), Proceedings of LREC-2000. Athens: ELRA. Retrieved from http://www.lrec-conf.org/proceedings/lrec2000/pdf/87.pdf (last accessed February 2019).
Van Gompel, M., & van den Bosch, A.
(2016) Efficient n-gram, skipgram and flexgram modelling with Colibri Core. Journal of Open Research Software, 4(1), 1–10.
Gries, S. Th.
(2001) A corpus linguistic analysis of English -ic vs -ical adjectives. ICAME Journal, 25, 65–108.
Gries, S. Th., & Ellis, N. C.
(2015) Statistical measures for usage-based linguistics. Language Learning, 65(1), 228–255.
Hlaváčová, J., & Rychlý, P.
(1999) Dispersion of words in a language corpus. In V. Matousek, P. Mautner, J. Ocelíková, & P. Sojka (Eds.), Text, Speech and Dialogue: Second International Workshop, TSD’99 Plzen, Czech Republic, September 13–17, 1999 Proceedings (pp. 321–324). Berlin: Springer.
Jurafsky, D., & Martin, J. H.
(2009) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd ed.). Upper Saddle River, NJ: Pearson.
Kilgarriff, A.
(2001) Comparing corpora. International Journal of Corpus Linguistics, 6(1), 97–133.
Lee, D. Y.
(2001) Genres, registers, text types, domains and styles: Clarifying the concepts and navigating a path through the BNC jungle. Language Learning and Technology, 5(3), 37–72.
Leech, G.
(2000) Grammars of spoken English: New outcomes of corpus-oriented research. Language Learning, 50(4), 675–724.
Marco, J.
(2000) Register analysis in literary translation: A functional approach. Fédération Internationale des Traducteurs (FIT) Revue Babel, 46(1), 1–19.
Miller, D., & Biber, D.
(2015) Evaluating reliability in quantitative vocabulary studies: The influence of corpus design and composition. International Journal of Corpus Linguistics, 20(1), 30–53.
Monsalve, I. F., Frank, S. L., & Vigliocco, G.
(2012) Lexical surprisal as a general predictor of reading time. In W. Daelemans (Ed.), Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (pp. 398–408). Avignon: Association for Computational Linguistics. Retrieved from http://aclweb.org/anthology/E12-1041 (last accessed February 2019).
Oostdijk, N.
(2001) The design of the Spoken Dutch Corpus. Language and Computers, 36(1), 105–112.
Oostdijk, N., Reynaert, M., Hoste, V., & Schuurman, I.
(2013) The construction of a 500-million-word reference corpus of contemporary written Dutch. In P. Spyns & J. Odijk (Eds.), Essential Speech and Language Technology for Dutch (pp. 219–247). Berlin: Springer.
Pluymaekers, M., Ernestus, M., & Baayen, R. H.
(2006) Effects of word frequency on the acoustic durations of affixes. In Proceedings of Interspeech 2006 – ICSLP (pp. 953–956). Pittsburgh, PA: International Speech Communication Association. Retrieved from https://www.isca-speech.org/archive/archive_papers/interspeech_2006/i06_1241.pdf (last accessed February 2019).
Rayson, P., & Garside, R.
(2000) Comparing corpora using frequency profiling. In A. Kilgarriff & T. Berber Sardinha (Eds.), Proceedings of the Workshop on Comparing Corpora of ACL 2000 (pp. 1–6). Hong Kong: Association for Computational Linguistics. Retrieved from https://www.aclweb.org/anthology/W/W00/W00-0901.pdf (last accessed February 2019).
Savický, P., & Hlaváčová, J.
(2002) Measures of word commonness. Journal of Quantitative Linguistics, 9(3), 215–231.
Schmitt, N.
(2010) Researching Vocabulary: A Vocabulary Research Manual. New York, NY: Palgrave Macmillan.
Smith, N. J., & Levy, R.
(2013) The effect of word predictability on reading time is logarithmic. Cognition, 128(3), 302–319.
Van Son, R., Wesseling, W., Sanders, E., & van den Heuvel, H.
(2008) The IFADV Corpus: A Free Dialog Video Corpus. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis & D. Tapias (Eds.), LREC (pp. 501–508). Marrakech: ELRA. Retrieved from http://www.lrec-conf.org/proceedings/lrec2008/pdf/132_paper.pdf (last accessed February 2019).
Stolcke, A.
(2002) SRILM – an extensible language modelling toolkit. In J. H. L. Hansen & B. L. Pellom (Eds.), Proceedings of the International Conference on Spoken Language Processing. Denver, CO: International Speech Communication Association. Retrieved from https://www.isca-speech.org/archive/archive_papers/icslp_2002/i02_0901.pdf (last accessed February 2019).
Tottie, G.
(1991) Negation in English Speech and Writing: A Study in Variation. San Diego, CA: Academic Press.
Willems, R. M., Frank, S. L., Nijhof, A. D., Hagoort, P., & van den Bosch, A.
(2016) Prediction during natural language comprehension. Cerebral Cortex, 26(6), 2506–2516.
Witten, I. H., & Bell, T. C.
(1991) The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression. IEEE Transactions on Information Theory, 37(4), 1085–1094.