Article published in: Orthographic Databases and Lexicons
Edited by Lynne Cahill and Terry Joyce
[Written Language & Literacy 20:1] 2017
pp. 104–127
What are the “phonemes” in phoneme-grapheme mappings?
A perspective on the use of databases for lexicon development
The CELEX lexical database (Baayen, Piepenbrock & van Rijn 1995) was developed in the 1990s, providing a database of the syntactic, morphological, phonological and orthographic forms of between 50,000 and 125,000 words of Dutch, English and German. This database was used as the basis for the development of the PolyLex lexicons, which included syntactic, morphological and phonological information for around 3,000 words of Dutch, English and German. Orthographic information was subsequently added in the PolyOrth project. The PolyOrth project was based on the assumption that the underlying, lexical phonological forms could be used to derive the surface orthographic forms by means of a combination of phoneme-grapheme mappings and sets of autonomous spelling rules for each language. One of the complications encountered during the project was that the phonological forms in CELEX were not always genuinely underlying forms, which complicated the derivation of the orthographic forms. This paper discusses the nature and status of underlying phonological forms, their relation to orthography and the issues involved in finding this information in databases.
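The derivation described above can be illustrated with a minimal sketch. It is not the PolyOrth implementation: the phoneme symbols, the mapping table, the devoicing rule and the function names are all assumptions chosen for illustration, using the familiar case of German final devoicing. It shows why a mapping applied to an underlying form (/hʊnd/) yields the expected spelling, while the same mapping applied to a surface form ([hʊnt]) does not.

```python
# Illustrative sketch only. The symbols, table and rules below are
# assumptions for this example, not PolyOrth code or CELEX transcriptions.

# Hypothetical phoneme-to-grapheme table for a tiny fragment of German.
PHONEME_TO_GRAPHEME = {
    "h": "h", "ʊ": "u", "n": "n", "d": "d", "t": "t",
}


def final_devoicing(phonemes):
    """Post-lexical rule: devoice a word-final voiced obstruent."""
    devoice = {"b": "p", "d": "t", "g": "k"}
    if phonemes and phonemes[-1] in devoice:
        return phonemes[:-1] + [devoice[phonemes[-1]]]
    return list(phonemes)


def map_graphemes(phonemes):
    """Apply the phoneme-grapheme mapping symbol by symbol."""
    return "".join(PHONEME_TO_GRAPHEME[p] for p in phonemes)


def capitalise_noun(spelling):
    """Stand-in for an autonomous spelling rule: German nouns are capitalised."""
    return spelling[:1].upper() + spelling[1:]


underlying = ["h", "ʊ", "n", "d"]        # assumed underlying form of "Hund"
surface = final_devoicing(underlying)    # surface form with final [t]

from_underlying = capitalise_noun(map_graphemes(underlying))
from_surface = capitalise_noun(map_graphemes(surface))

print(from_underlying)  # spelling derived from the underlying form
print(from_surface)     # spelling derived from the surface form
```

In this sketch only the underlying form produces the conventional spelling, which is the kind of difficulty that arises when a database stores forms to which post-lexical rules have already applied.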
Keywords: phoneme-grapheme mappings, lexical databases, lexicons, underlying phonology, lexical phonology, post-lexical phonology
Published online: 19 October 2017
Baayen, Harald, Richard Piepenbrock & Hedderik van Rijn
Benesty, Jacob, M. M. Sondhi & Yiteng Huang
Brown, Dunstan & Andrew Hippisley
(2001) Semi-automatic construction of multilingual lexicons. Machine Translation Review. Electronic journal available at www.bsc.org.uk/siggroup/nalantran/mtreview/mtr-12/.
Cahill, Lynne & Gerald Gazdar
Cahill, Lynne, Carole Tiberius & Jon Herring
Evans, R., P. Piwek, L. Cahill & N. Tipper
Evertz, Martin & Beatrice Primus
Finkel, Raphael & Gregory Stump
Goldrick, Matthew & Brenda Rapp
New, Boris, Christophe Pallier, Marc Brysbaert & Ludovic Ferrand