Chapter 6
Macrosyntactic corpus annotation
The case of Zaar
This paper argues for a minimal annotation that represents the interface between information structure and syntax simply and concisely. It uses the concept of macrosyntax, based on illocutionary units, for a new level of annotation that builds on existing morphosyntactic tiers in Elan. One of the main assets of this annotation system is the notion of piles, which it uses to represent the oral discursive flow and to account for dysfluencies, discontinuities and ellipses. A pilot 15,000-word corpus has been annotated in Elan for a preliminary study of the information structure of illocutionary components in Zaar, a Chadic language spoken in Nigeria. The micro- and macrosyntactic properties of these components are represented using Universal Dependencies Grammar.
Article outline
- 1. Introduction
- 2. Zaar and the Zaar corpus
- 3. Oral corpora and macrosyntax
- 3.1 Dysfluencies
- 3.2 Afterthoughts
- 3.3 Syntactic relations over turn-taking
- 3.4 Parallel constructions
- 4. Macrosyntactic corpus annotation
- 4.1 Illocutionary Units and basic Illocutionary Components
- 4.1.1 Nuclei
- 4.1.2 Pre- and post-nuclei
- 4.2 IlU introducers
- 4.3 Associated Illocutionary Units
- 4.4 Piling
- 4.5 Non-alignment of Illocutionary Components and Governing
- 4.5.1 Piling across Intonation Unit boundaries
- 4.5.2 Piling across turn-taking
- 4.5.3 Left-dislocated circumstantial adjuncts
- 5. Left-dislocation and marked identifying clauses in Zaar
- 5.1 Topics
- 5.2 Marked identifying clauses in Zaar
- 5.3 The prosody of topic and identifying clauses in Zaar
- 5.4 Syntactic representation
- 5.4.1 Topic
- 5.4.2 Marked Identifying Clause (IC2)
- 6. Macrosyntax and Information Structure annotation in Elan
- 7. Typology of pre- and post-nuclei
- 7.1 Aligned peripheries
- 7.1.1 Topic
- 7.1.2 Dialogical constituents
- 7.1.2.1 Phatic
- 7.1.2.2 Allocutive
- 7.1.2.3 Expressive
- 7.1.2.4 Connective
- 7.2 Non-aligned peripheries
- 7.2.1 Pre-Nucleus (<+)
- 7.2.2 Post-Nucleus (>+)
- 8. Conclusion
- Notes
- Abbreviations and special symbols
- References