Chapter 4. Methods of data collection and processing

Article outline

4.1 Learner variables
- 4.1.1 Learner profile questionnaire
- 4.1.2 Psychometric tests
- 4.1.3 Other variables
4.2 Corpus data
- 4.2.1 The Secondary-Level Corpus of Learner English (SCooLE)
  - 4.2.1.1 Data elicitation
  - 4.2.1.2 Linguistic annotation and header information
    - 4.2.1.2.1 Merging of learner text and data on learner variables
    - 4.2.1.2.2 Normalisation of accents/apostrophes
    - 4.2.1.2.3 VARD-based normalisation of deviances
    - 4.2.1.2.4 Manual normalisation of (virtual) homophones
    - 4.2.1.2.5 Manual annotation of passives
    - 4.2.1.2.6 Merging of TreeTagger and CLAWS annotations
    - 4.2.1.2.7 Encoding for corpus query: Corpus Workbench (CWB)
- 4.2.2 Reference language varieties
  - 4.2.2.1 The Teaching Materials Corpus (TeaMC)
    - 4.2.2.1.1 Choice of teaching materials
      - 4.2.2.1.1.1 EFL materials: Year 7–10
      - 4.2.2.1.1.2 EFL materials: Year 11/12
      - 4.2.2.1.1.3 CLIL materials: Year 7–10
    - 4.2.2.1.2 Linguistic annotation and header information
  - 4.2.2.2 The Louvain Corpus of Native English Essays (LOCNESS)
    - 4.2.2.2.1 Choice of subcorpus
    - 4.2.2.2.2 Linguistic annotation and header information
4.3 Experimental task
4.4 Chapter summary
Notes