Given its multi-scriptal nature, the Japanese writing system can yield important insights into the complex relationships that can exist between units of language and units of writing. This paper discusses some of the difficult issues surrounding the notions of orthographic representation and variation within the Japanese writing system, as seen from the perspective of creating word lists based on the Kokuritsu Kokugo Kenkyūjo’s ‘Balanced Corpus of Contemporary Written Japanese’ (BCCWJ) Project. More specifically, the paper (i) reflects on the treatment of lemmas within UniDic, the morphological analyzer dictionary developed for the project, (ii) notes some concerns for extracting word lists that stem from the project’s approach to defining orthographic words, which draws on its conceptualization of short and long unit words, and (iii) attempts to quantify the extent of orthographic variation within the Japanese writing system as represented by the BCCWJ.

Keywords: Japanese; Balanced Corpus of Contemporary Written Japanese (BCCWJ); kanji; hiragana; katakana; orthographic variation; UniDic