
Publication details [#10804]

Beesley, Kenneth R. 1988. Language identifier: a computer program for automatic natural language identification of on-line text. In Lindberg Hammond, Deanna, ed. Languages at crossroads. Medford: Learned Information. pp. 47–54.


The first step in translating any text is to identify the language in which it is written. Several useful methods have appeared for language identification where the mystery texts are properly spelled and accented paper documents. Unfortunately, in machine-translation environments, where texts are on-line and may exhibit a variety of conventions for character-mapping and accentuation, the problem is far more difficult. This paper outlines a generalized approach to language identification of on-line text based on techniques of cryptanalysis.
Source: Based on abstract in book