Sequence and word frequency
It is well established that the linear ordering of words in a sentence is influenced by a variety of factors that are typically labelled as grammatical, discourse or cognitive constraints. The aim of the present study is to determine whether frequency effects are visible in the sequencing of words in a sentence. In other words, do “more frequently used units tend to be placed before less frequently used units” (Fenk-Oczlon 2001: 443)? Using a corpus of newspaper articles, we examine the frequency of words in different positions in sentences. That is, using data from thousands of sentences, we investigate the median value for the frequency or rank of words in first position in a sentence, compared with second position, and so on. We find that there is a frequency effect in English: the first element in a sentence has the highest frequency and the last element in a sentence has the lowest frequency, with the middle of sentences having a more or less flat frequency profile. We also find that the overall shape of the frequency profile for sentences is rather consistent even when sentence length is taken into account.
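The positional comparison described above can be illustrated with a minimal sketch. It assumes pre-tokenized sentences, counts word frequencies within the same toy corpus, and indexes positions from the start of the sentence. The function name `positional_median_frequency` and the toy data are hypothetical, and none of these choices should be read as the procedure actually used in the study.

```python
from collections import Counter
from statistics import median


def positional_median_frequency(sentences):
    """Return the median corpus frequency of the words found at each position.

    `sentences` is a list of token lists. Frequencies are counted over the
    same sentences (an assumption made for this sketch only).
    """
    # Corpus-wide frequency of every lower-cased word form.
    counts = Counter(word.lower() for sent in sentences for word in sent)

    # Collect the frequency of each word under its 1-based sentence position.
    by_position = {}
    for sent in sentences:
        for pos, word in enumerate(sent, start=1):
            by_position.setdefault(pos, []).append(counts[word.lower()])

    # Summarise each position by the median of the collected frequencies.
    return {pos: median(freqs) for pos, freqs in sorted(by_position.items())}


# Toy data, invented for illustration only.
toy = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["a", "cat", "slept"],
]
profile = positional_median_frequency(toy)
print(profile)
```

Grouping sentences by length before calling the function would give one profile per sentence length, which is one way to check whether the shape of the profile stays consistent across lengths.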
Article outline
- 1. Introduction
- 2. Method
- 3. Results and discussion
- 3.1 Patterns in sentences
- 3.2 Sentence length
- 3.3 Phases of the sentence
- 3.4 Variation among sentences
- 3.5 Articles and other high frequency words
- 4. Conclusion
- Note
- References