Measuring the degree of specialisation of sub-technical legal terms through corpus comparison
A domain-independent method
One of the most remarkable features of the legal English lexicon is its sub-technical vocabulary, that is, words frequently shared by the general and specialised fields which either retain a legal meaning in general English or acquire a specialised one in the legal context. Testing has shown that almost 50% of the terms extracted from BLaRC, an 8.85-million-word legal corpus, appear amongst the 2,000 most frequent word families of West’s (1953) GSL, Coxhead’s (2000) AWL or the BNC (2007), which underlines the relevance of this type of vocabulary in legal English. Because sub-technical terms behave in statistically peculiar ways in both contexts, they are particularly difficult to identify, and their termhood is hard to measure with parameters such as frequency or distribution in the general and specialised environments. This research proposes a novel termhood measure intended to quantify this lexical phenomenon objectively by applying Williams’ (2001) lexical network model, which incorporates contextual information to compute the level of specialisation of sub-technical terms.
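The overlap figure reported above (the share of extracted terms that also occur in general-English word lists) can be illustrated with a minimal sketch. The function name, the toy term list and the toy word list are all hypothetical and are not taken from BLaRC, the GSL, the AWL or the BNC. The sketch also compares headwords only, whereas a check against word families would first have to map inflected and derived forms to their family headword.

```python
def general_list_overlap(terms, general_headwords):
    """Return the proportion of terms that also appear in a general word list.

    Both arguments are iterables of strings; matching is case-insensitive
    and works on exact headwords only (an assumption of this sketch).
    """
    unique_terms = {t.lower() for t in terms}
    if not unique_terms:
        return 0.0
    general = {w.lower() for w in general_headwords}
    shared = unique_terms & general
    return len(shared) / len(unique_terms)


# Hypothetical toy data, for illustration only.
legal_terms = ["charge", "party", "estoppel", "trial"]
general_words = ["charge", "party", "trial", "house"]

share = general_list_overlap(legal_terms, general_words)
print(f"{share:.0%} of the extracted terms also occur in the general list")
```

A high proportion of this kind is what makes frequency-based termhood measures unreliable for sub-technical terms, since the same word form is common in both the general and the specialised corpus.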
Keywords: Legal English, sub-technical terms, lexical networks, ESP, corpus linguistics
Published online: 19 May 2016