Edited by Tony Berber Sardinha and Marcia Veirano Pinto
[Studies in Corpus Linguistics 60] 2014
pp. 149–176
This chapter reports on a multi-dimensional (MD) analysis of North American and British pop songs from 1940 to 2009, based on a corpus of 6,290 songs by 32 artists and bands representing 16 different music styles or genres. The corpus was automatically tagged for part of speech and semantic field. In addition, multi-word units in each song were identified and checked against both the Google 1-trillion-word 3-gram corpus and the whole song corpus in order to measure the use of formulaic language. A principal component analysis yielded two sets of three factors, one for lexico-grammar and one for semantics. These factors were interpreted as dimensions, and the most representative songs, artists, styles, and time periods for each dimension were identified. Overall, the study argues for the relevance of song lyrics as an object of linguistic investigation.