Article published in:
New Approaches to English Linguistics: Building bridges
Edited by Olga Timofeeva, Anne-Christine Gardner, Alpo Honkapohja and Sarah Chevalier
[Studies in Language Companion Series 177] 2016
pp. 281–320
Statistical sequence and parsing models for descriptive linguistics and psycholinguistics
Gintaré Grigonyté | University of Stockholm
This study shows that computational linguistic models are beneficial for descriptive linguistics and psycholinguistics. It applies two models to various English genres and to learner language: 1) surprisal and 2) a syntactic parser, which allow us to investigate the role of ambiguity and the interplay between the idiom and syntax principles. We find that surprisal and ambiguity are higher for learner language, while parser scores and model fit are lower. In addition, the random application of alternations leads to more ambiguous sentences. Failures to generate optimal orderings in the sense of relevance theory, such as those exhibited by non-native-like utterances of language learners, increase processing load for both human and automatic processors. As human and automatic parsing difficulties correlate, we suggest syntactic parsers as psycholinguistic processing models.
Keywords: language processing, statistical models, idiom and syntax principle, ambiguity, syntactic parsing
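To make the notion of surprisal concrete: the surprisal of a word is commonly defined as the negative log probability of that word given its preceding context. The abstract does not state which n-gram order or smoothing method the chapter uses, so the following is only a minimal sketch under stated assumptions: a bigram model with add-one smoothing, trained on a made-up three-sentence corpus. The function names (`train_bigram`, `surprisal`, `mean_surprisal`) and the toy data are hypothetical and not taken from the chapter.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts over tokenised sentences.

    "<s>" is an assumed sentence-start marker, not part of the vocabulary.
    """
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sent in sentences:
        vocab.update(sent)
        tokens = ["<s>"] + sent
        for prev, word in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, word)] += 1
    return contexts, bigrams, vocab


def surprisal(prev, word, model):
    """Surprisal in bits: -log2 P(word | prev), with add-one smoothing (an assumption)."""
    contexts, bigrams, vocab = model
    prob = (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))
    return -math.log2(prob)


def mean_surprisal(sent, model):
    """Average per-word surprisal of one tokenised sentence."""
    tokens = ["<s>"] + sent
    values = [surprisal(p, w, model) for p, w in zip(tokens, tokens[1:])]
    return sum(values) / len(values)


# Toy training corpus (illustrative only).
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
]
model = train_bigram(corpus)

expected_order = ["the", "dog", "sleeps"]
unusual_order = ["sleeps", "the", "dog"]
print(mean_surprisal(expected_order, model))
print(mean_surprisal(unusual_order, model))
```

In this toy setting the sentence with the unusual word order receives the higher mean surprisal, which illustrates the kind of contrast the abstract reports between learner language and other genres, without reproducing the chapter's actual models or data.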
Published online: 01 November 2016
https://doi.org/10.1075/slcs.177.11sch