Automatic term recognition and legal language
A shorter path to the lexical profiling of legal texts?
Natural Language Processing (NLP) tools offer language scholars a wide array of possibilities to
examine, among other things, the lexicon of any text collection. This research set out to measure
the precision of three automatic term recognition (ATR) methods (Chung 2003; Drouin 2003; Scott 2008a) by
applying them to two corpora of Spanish and British judicial decisions on the topic of
immigration. In addition, the last section of this chapter explores the lexical inventories extracted by each method
(the top 500 candidate terms, or CTs, in each case) by grouping them into ad hoc thematic categories.
As expected, legal terms form the largest category, followed by territory,
evaluative items, crime and family.
Article outline
- 1. Introduction
- 2. ATR and legal language
- 3. Methodology
- 3.1 Method description
- 3.1.1 Keywords
- 3.1.2 TermoStat
- 3.1.3 Chung
- 3.2 Corpus description
- 3.3 Method implementation
- 4. Results and discussion
- 4.1 Method validation
- 4.2 Thematic term categories
- 4.2.1 Corpus-driven semantic classification
- 4.2.2 Semantic categorization using UMUTextStats
- 5. Conclusion
- Notes
- References