Chapter 6
Macrosyntactic corpus annotation
The case of Zaar
This paper argues for a minimal annotation that represents the interface between information structure and syntax in a simple and concise way. It uses the concept of macrosyntax, based on illocutionary units, for a new level of annotation that builds on existing morphosyntactic tiers in Elan. One of the main assets of this annotation system is the notion of piles, which it uses to represent the oral discursive flow and to account for dysfluencies, discontinuities and ellipses. A pilot 15,000-word corpus has been annotated in Elan for a preliminary study of the information structure of illocutionary components in Zaar, a Chadic language spoken in Nigeria. The micro- and macrosyntactic properties of these components are represented using Universal Dependencies Grammar.
Article outline
- 1. Introduction
- 2. Zaar and the Zaar corpus
- 3. Oral corpora and macrosyntax
- 3.1 Dysfluencies
- 3.2 Afterthoughts
- 3.3 Syntactic relations over turn-taking
- 3.4 Parallel constructions
- 4. Macrosyntactic corpus annotation
- 4.1 Illocutionary Units and basic Illocutionary Components
- 4.1.1 Nuclei
- 4.1.2 Pre- and post-nuclei
- 4.2 IlU introducers
- 4.3 Associated Illocutionary Units
- 4.4 Piling
- 4.5 Non-alignment of Illocutionary Components and Governing
- 4.5.1 Piling across Intonation Unit boundaries
- 4.5.2 Piling across turn-taking
- 4.5.3 Left-dislocated circumstantial adjuncts
- 5. Left-dislocation and marked identifying clauses in Zaar
- 5.1 Topics
- 5.2 Marked identifying clauses in Zaar
- 5.3 The prosody of topic and identifying clauses in Zaar
- 5.4 Syntactic representation
- 5.4.1 Topic
- 5.4.2 Marked Identifying Clause (IC2)
- 6. Macrosyntax and Information Structure annotation in Elan
- 7. Typology of pre- and post-nuclei
- 7.1 Aligned peripheries
- 7.1.1 Topic
- 7.1.2 Dialogical constituents
- 7.1.2.1 Phatic
- 7.1.2.2 Allocutive
- 7.1.2.3 Expressive
- 7.1.2.4 Connective
- 7.2 Non-aligned peripheries
- 7.2.1 Pre-Nucleus (<+)
- 7.2.2 Post-Nucleus (>+)
- 8. Conclusion
- Notes
- Abbreviations and special symbols
- References