A linear approach of chain composition
This corpus-based approach to coreference chains analyzes recurrences in the patterns of chains, providing new insights into conventions or preferences in the forms of referential expressions. By taking into account the linearity of discourse and the succession of mentions, it goes beyond the more commonly implemented analysis of global characteristics. We analyze 581 reference chains from the French corpus AnnoDis. Using clustering methods, we first show that the resulting clusters are linguistically interpretable. We then demonstrate that animacy and genre influence chain composition. Finally, we identify the main patterns of coreference chains in the corpus. These patterns highlight different types of chains and discourse strategies, which vary across genres, and confirm a major influence of referent type.
Article outline
- 1. Introduction
- 2. Application to the analysis of an annotated French corpus: The AnnoDis Corpus
- 3. A sequence analysis approach to coreference chains
- 3.1 Parameters of the sequence analysis
- 3.2 From mentions to states
- 4. Clustering coreference chains according to their sequence of mentions
- 5. Impact of animacy and text genre on chain composition
- 6. Patterns of coreference chains
- 7. Discussion
- Notes
- References