Article published in:
Exploring Newspaper Language: Using the web to create and investigate a large corpus of modern Norwegian
Edited by Gisle Andersen
[Studies in Corpus Linguistics 49] 2012
pp. 1–28
Building a large corpus based on newspapers from the web
Gisle Andersen | NHH Norwegian School of Economics
Knut Hofland | Uni Computing
The Norwegian Newspaper Corpus (NNC) is an initiative to create a large monitor corpus representing contemporary Norwegian in both of its written varieties, Bokmål and Nynorsk. The corpus is compiled through daily harvesting and processing of texts published in the web editions of Norwegian newspapers. This introductory chapter surveys the work on corpus building, tool development and research carried out in connection with the NNC project. It gives an overview of the corpus and its system architecture, describing the workflow, tools and methods used in the data processing. The chapter also presents the individual research contributions to this volume.
Published online: 23 March 2012
https://doi.org/10.1075/scl.49.01and
Cited by 2 other publications
Andersen, Gisle
Kristiansen, Marita
This list is based on CrossRef data as of 17 May 2022. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.