Article published in: Text Corpora and Multilingual Lexicography
Edited by Wolfgang Teubert
[Benjamins Current Topics 8] 2007
pp. 93–107
Procedures in building the Croatian-English parallel corpus
This contribution surveys the procedures and formats used in building the Croatian-English parallel corpus, which is being collected at the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly, which has been published since the beginning of 1998 by HIKZ (Croatian Institute for Information and Culture). After a brief survey of existing English-Croatian parallel corpora, the article addresses the procedures involved in text conversion and text encoding, particularly alignment. Several recent proposals for alignment encoding are listed and discussed at the end of the article.
Published online: 27 June 2007