Languages left behind
Keeping Taiwanese off the World Wide Web
The Unicode standard is an enormous step toward realizing the goal of a single computer encoding scheme for virtually all of the world’s scripts. Although not all computers will necessarily have the fonts to print all characters, all computers will at least be able to recognize which characters are required for the proper display of text in almost any language. However, the Unicode standard presupposes that each language has a script consisting of a finite number of agreed-upon characters. Some languages still lack such agreement. As planning for Unicode has gone forward, more and more code points have been assigned, leaving ever fewer conveniently accessible code points for future expansion. This article first describes the Unicode project, then the special challenge of encoding Chinese characters. Finally, it uses the example of Hokkien, a “dialect” of Chinese spoken by most people in Taiwan, to explore the problem of unorthodox, unstable, or unofficial scripts. Political forces and technical considerations make it difficult to include such scripts in Unicode. As Unicode becomes the de facto standard for writing human languages, script innovations will presumably become less and less likely to receive wide use.