Frequency, Dispersion, Association, and Keyness

Revising and tupleizing corpus-linguistic measures

Hardbound (Available)
ISBN 9789027214928 | EUR 115.00 | USD 149.00
 
e-Book
ISBN 9789027246813 | EUR 115.00 | USD 149.00
 
This book revisits the main specifically corpus-linguistic statistics the field has relied on for decades: frequency, dispersion, association, and keyness. It first discusses the purpose of these measures and how they have been computed. It then makes three main proposals. First, many measures of dispersion, association, and keyness are too confounded with frequency, and the book shows how to 'take frequency out of them' to obtain conceptually cleaner and more interpretable measures. Second, many existing measures can be replaced by a simple information-theoretic measure, the Kullback-Leibler divergence, which can also have frequency 'removed' from it. Third, corpus linguistics should abandon the tradition of describing its findings with a single number and adopt a tupleization approach instead, in which several separate dimensions of information are used for description and interpretation. The book is written in an informal, hands-on style and comes with its own R package featuring functions, example data, and several thousand lines of code exemplifying all applications.
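As a rough illustration of the second proposal, the sketch below computes a Kullback-Leibler divergence between how a word's occurrences are distributed across corpus parts and how large those parts are. It is not taken from the book or its R package: the function name `kl_divergence`, the toy counts, and the choice of base-2 logarithms are all assumptions made for this example.

```python
import math


def kl_divergence(p, q):
    """Return D_KL(p || q) in bits for two discrete distributions.

    Terms with p_i == 0 contribute nothing; if q_i == 0 while p_i > 0,
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            if q_i == 0:
                return math.inf
            total += p_i * math.log2(p_i / q_i)
    return total


# Hypothetical toy data: occurrences of one word in four corpus parts,
# and the sizes of those parts in tokens.
word_counts = [8, 1, 1, 0]
part_sizes = [250, 250, 250, 250]

# Observed distribution of the word vs. the distribution expected
# if the word were spread in proportion to part size.
observed = [c / sum(word_counts) for c in word_counts]
expected = [s / sum(part_sizes) for s in part_sizes]

dispersion_kl = kl_divergence(observed, expected)
print(dispersion_kl)
```

A value of 0 would mean the word is spread exactly in proportion to the part sizes; larger values indicate a more uneven distribution. Under the tupleization idea, such a value would be reported alongside frequency rather than folded into a single score.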
[Studies in Corpus Linguistics, 115] 2024. vii, 321 pp.
Publishing status: Available
Published online on 19 June 2024

Cited by two other publications

Li, Jingjie & Wenjie Hu
2024. Identification of Sentence Stems Characteristic of Chinese Learner English Writing. Heliyon pp. e37166 ff.
Liao, Shengyu, Stefan Th. Gries & Stefanie Wulff
2024. Transfer five ways: applications of multiple distinctive collexeme analysis to the dative alternation in Mandarin Chinese. Corpus Linguistics and Linguistic Theory

This list is based on CrossRef data as of 23 September 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.

Subjects

Main BIC Subject

CFX: Computational linguistics

Main BISAC Subject

LAN009000: LANGUAGE ARTS & DISCIPLINES / Linguistics / General
U.S. Library of Congress Control Number: 2024023008