Publications

Publication details [#13609]

Abstract

In this article, the author presents a method for automatically extracting aligned segments in order to speed up translation-memory development. Parallel documents are captured from bilingual sites on the Web and processed in two steps. First, relevant information is extracted from the downloaded files and transformed into a simplified TEI format. Second, the parallel TEI files are converted into single TMX files, which contain large collections of translation segments in a format that can be fed automatically into translation memory tools.
Source: Based on abstract in book
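
The second step described in the abstract, merging two parallel TEI files into one TMX file, can be illustrated with a minimal sketch. It is not the author's implementation. It assumes that the simplified TEI files carry no XML namespace, that each translatable segment is an `<s>` element, and that the two files are already aligned one-to-one by position; the abstract does not specify any of this. The function names `read_segments` and `build_tmx` and the tool name `tei2tmx-sketch` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xml:lang attribute as ElementTree expects it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_segments(tei_text):
    """Return the text of every <s> element in document order.

    Assumption: segments are un-namespaced <s> elements.
    """
    root = ET.fromstring(tei_text)
    return ["".join(s.itertext()).strip() for s in root.iter("s")]


def build_tmx(src_segs, tgt_segs, src_lang, tgt_lang):
    """Pair segments by position and serialize them as a TMX 1.4 string."""
    if len(src_segs) != len(tgt_segs):
        raise ValueError("segment counts differ; files are not aligned")

    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "tei2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "tei",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")

    # One translation unit (<tu>) per aligned pair, with one <tuv> per language.
    for src, tgt in zip(src_segs, tgt_segs):
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text

    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    english = "<TEI><text><body><p><s>Hello world.</s></p></body></text></TEI>"
    spanish = "<TEI><text><body><p><s>Hola mundo.</s></p></body></text></TEI>"
    print(build_tmx(read_segments(english), read_segments(spanish), "en", "es"))
```

Pairing by position keeps the sketch short; a real pipeline would more likely align on segment identifiers or run a sentence aligner before writing the TMX file.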