Compiling parallel text corpora
Towards automation of routine procedures
The aim of a research project under way at the Department of Translation Studies of the University of Tampere is to compile a Russian-Finnish parallel corpus of fiction. The corpus will be equipped with efficient search and analysis tools. The texts will be stored as ordinary text files, and each text will be registered in a Microsoft Access database together with a description. Automated parallel concordancing is being developed for the corpus. The program will find the keywords in a passage of text A (Russian), look up possible translation equivalents of those keywords in language B (Finnish), and then search for the portion of text B where most of these equivalents occur.
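The three concordancing steps described above can be illustrated with a minimal sketch. It is not the project's actual program. The function names, the toy lexicon, the stopword list and the sample sentences are all hypothetical. Two simplifying assumptions are made: keywords are every non-stopword token of the Russian passage, and Finnish equivalents are given as stems that are matched by plain substring search, which is only a crude stand-in for proper morphological analysis.

```python
def extract_keywords(passage, stopwords):
    """Return the unique non-stopword tokens of a passage, in order of appearance."""
    tokens = [t.strip('.,;:!?"()').lower() for t in passage.split()]
    content = [t for t in tokens if t and t not in stopwords]
    return list(dict.fromkeys(content))  # drop repeats, keep order


def candidate_equivalents(keywords, lexicon):
    """Map each keyword to its candidate stems in language B (empty list if unknown)."""
    return {kw: lexicon.get(kw, []) for kw in keywords}


def best_matching_segment(segments_b, equivalents):
    """Return (index, score) of the segment that covers the most keywords.

    A keyword counts as covered when at least one of its candidate stems
    occurs in the segment. Returns (-1, 0) if no segment covers any keyword.
    """
    best_index, best_score = -1, 0
    for i, segment in enumerate(segments_b):
        text = segment.lower()
        score = sum(
            1 for stems in equivalents.values()
            if any(stem in text for stem in stems)
        )
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Hypothetical toy data, keyed by Russian surface forms (no lemmatization).
stopwords = {"у", "а", "за"}
lexicon = {
    "старый": ["vanha"],
    "дом": ["talo"],
    "реки": ["joki", "joe"],
    "рекой": ["joki", "joe"],
    "лес": ["mets"],
}
russian_passage = "Старый дом стоял у реки, а за рекой начинался лес."
finnish_segments = [
    "Aamulla satoi lunta, ja kaupunki oli hiljainen.",
    "Vanha talo seisoi joen rannalla, ja joen takana alkoi metsä.",
    "Hän avasi ikkunan ja katsoi ulos.",
]

keywords = extract_keywords(russian_passage, stopwords)
equivalents = candidate_equivalents(keywords, lexicon)
index, score = best_matching_segment(finnish_segments, equivalents)
print(index, score)
```

In this sketch the passage is aligned with the Finnish segment that covers the largest number of its keywords. Keywords without a lexicon entry simply contribute nothing to the score, so the search tolerates gaps in the lexicon.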