Exploring Newspaper Language
Using the web to create and investigate a large corpus of modern Norwegian
Editor
This book describes new methodological and technological approaches to corpus building and presents recent research based on the Norwegian Newspaper Corpus, a large monitor corpus of contemporary Norwegian compiled through daily harvesting of web newspapers. The book gives an overview of the corpus and its system architecture, and presents tools used for tasks such as text harvesting, annotation, topic classification, and the extraction and frequency profiling of new words and phrases. Among the innovative technologies is Corpuscle, a corpus query engine and management system flexible enough to handle very large corpora efficiently. The individual research contributions based on the corpus explore different aspects of Norwegian, including the occurrence of anglicisms, neologisms and terminology, and the use of metonymy and metaphor in newspaper language. The book also describes an innovative method of applying correspondence analysis and implicational analysis to investigate interdependencies between morphosyntactic variants.
[Studies in Corpus Linguistics, 49] 2012. vi, 356 pp.
Publishing status: Available
Published online on 1 March 2012
© John Benjamins Publishing Company
Table of Contents
Building a large corpus based on newspapers from the web | Gisle Andersen and Knut Hofland | pp. 1–28
Part I. Exploiting the web as a corpus – Methods and tools
Corpuscle – a new corpus management platform for annotated corpora | Paul Meurer | pp. 29–50
OBT+stat: A combined rule-based and statistical tagger | Janne Bondi Johannessen, Kristin Hagen, André Lynum and Anders Nøklestad | pp. 51–66
Exploring corpora through syntactic annotation | Victoria Rosén | pp. 67–78
Collocations and statistical analysis of n-grams: Multiword expressions in newspaper text | Gunn Inger Lyse and Gisle Andersen | pp. 79–110
Automatic topic classification of a large newspaper corpus | Thomas M. Hagen | pp. 111–130
A data-driven approach to anglicism identification in Norwegian | Gyri Smørdal Losnegaard and Gunn Inger Lyse | pp. 131–154
Part II. Corpus-based case studies
A corpus-based study of the adaptation of English import words in Norwegian | Gisle Andersen | pp. 155–192
Norm clusters in written Norwegian | Helge Dyvik | pp. 193–220
Lexical neography in modern Norwegian | Ruth Vatvedt Fjeld and Lars Nygaard | pp. 221–240
Ash compound frenzy: A case study in the Norwegian Newspaper Corpus | Koenraad De Smedt | pp. 241–256
Financial jargon in a general newspaper corpus | Marita Kristiansen | pp. 257–284
Metonymic extension and vagueness: Schengen and Kyoto in Norwegian newspaper language | Sandra L. Halverson | pp. 285–306
Spatial metaphors in present-day Norwegian newspaper language | Leiv Egil Breivik and Toril Swan | pp. 307–330
Doing historical linguistics using contemporary data | Øivin Andersen | pp. 331–350
| pp. 351–352
Subject index | pp. 353–356
Subjects
Main BIC Subject
CFX: Computational linguistics
Main BISAC Subject
LAN009000: LANGUAGE ARTS & DISCIPLINES / Linguistics / General