Tomaž Erjavec | Department of Intelligent Systems, Jozef Stefan Institute, Ljubljana, Slovenia
The paper presents an annotated parallel Slovene-English corpus developed in the scope of the EU ELAN project. The IJS-ELAN corpus was compiled to be a widely distributable dataset for language engineering and for translation and terminology studies. The corpus contains 1 million words from fifteen recent terminology-rich texts. The corpus is sentence-aligned and word-tagged with context-disambiguated morphosyntactic descriptions and lemmas. These descriptions model simple feature structures, the structure of which is shared between Slovene and English. The corpus is encoded according to the Guidelines for Text Encoding and Interchange and is freely available on the Web for downloading. Additionally, access to IJS-ELAN is available via a powerful Web concordancer.