The increased globalization of science and technology and the growing number of bilinguals and multilinguals in the world have made research with multiple languages a mainstay for scholars who study human function and especially those who focus on language, cognition, and the brain. Such research can benefit from large-scale databases and online resources that describe and measure lexical, phonological, orthographic, and semantic information. The present paper discusses currently available resources and underscores the need for tools that enable measurements both within and across multiple languages. A general review of language databases is followed by a targeted introduction to databases of orthographic and phonological neighborhoods. A specific focus on CLEARPOND illustrates how databases can be used to assess and compare neighborhood information across languages, to develop research materials, and to provide insight into broad questions about language. As an example of how large-scale databases can answer questions about language, a closer look at neighborhood effects on lexical access reveals that not only orthographic but also phonological neighborhoods can influence visual lexical access, both within and across languages. We conclude that capitalizing upon large-scale linguistic databases can advance, refine, and accelerate scientific discoveries about the human linguistic capacity.