EU phraseological verbal patterns in the PETIMOD 2.0 corpus
A NER-enhanced approach
Texts from the European Union exhibit a high degree of formulaicity (Biel 2014). This chapter studies phraseological patterns in PETIMOD 2.0, an
English<>Spanish intermodal corpus of the EU Committee on Petitions. The first part briefly reviews
corpus-based research on EU institutional phraseology, with a focus on contrastive approaches and parliamentary
corpora. The second part examines the formulaicity of named entities and their verbal patterns in PETIMOD 2.0. We
hypothesize that corpus-based Named Entity Recognition (NER) is the most suitable method for extracting relevant
argument-structure constructions from such texts. The results reveal different degrees of
formulaicity across languages and modes, as well as common features motivated by the pragmatics of the Petitions
Committee.
Article outline
- 1. Introduction
- 2. Related work
- 3. Study goals and methodology
- 3.1 Units of analysis
- 3.2 Choice of corpus
- 3.2.1 Corpus size
- 3.2.2 Transcription conventions and revisions
- 3.3 Named entity recognition
- 3.3.1 Extraction of entities and system performance
- 3.3.2 Phraseological Pattern Extraction for NEs
- 4. Results and discussion
- 4.1 Distribution of NEs
- 4.2 Text-organizing patterns
- 4.3 Grammatical patterns
- 4.4 Term-embedding collocations
- 5. Conclusion
- Notes
- References