Article published in: Terminology across Languages and Domains
Edited by Patrick Drouin, Natalia Grabar, Thierry Hamon and Kyo Kageura
[Terminology 21:2] 2015
pp. 237–262
The Sociopolitical Thesaurus as a resource for automatic document processing in Russian
This paper presents the structure and current state of the Sociopolitical thesaurus, which was developed for automatic document analysis and information-retrieval applications in Russian in the broad domain of public affairs. The scope of the Sociopolitical thesaurus resembles that of traditional information-retrieval thesauri for broad domains, such as the EUROVOC or UNBIS thesauri. However, the Sociopolitical thesaurus is intended as a tool for automatic document processing, and this difference leads to considerable distinctions in its structure and in the principles of its development. Knowledge representation in the Sociopolitical thesaurus combines three existing traditions: the development of information-retrieval thesauri, wordnets, and formal ontology research. This combination facilitates a consistent representation of such a broad scope of concepts and supports the automatic analysis of unstructured texts. The Sociopolitical thesaurus is used in applications such as conceptual indexing in information-retrieval systems, knowledge-based text categorization, automatic summarization of single and multiple documents, and question answering. This paper also presents an evaluation of the Sociopolitical thesaurus in automatic knowledge-based text categorization.
Keywords: thesaurus, general lexicon, sociopolitical domain, text categorization, ontological dependence, multiword expressions, automatic document processing
Published online: 31 December 2015