Use of Domain Knowledge in Resolving Pronominal Anaphora
Abstract. The research reported here was conducted in the context of the Plinius project, which aims at semi-automatic knowledge acquisition from short natural-language texts. Within this framework, a system has been developed for finding the antecedents of pronominal anaphora, in particular 'it' and 'its' anaphora. The anaphora resolution module operates on parser output and can use information generated by the parser; the lexicon supplies the conceptual representations corresponding to the words. The algorithm for anaphora resolution involves three steps: (i) Assemble: construct a list of discourse entities (DEs); (ii) Identify: identify the anaphoric DEs; (iii) Select: for each anaphoric DE, select another DE from the list as its antecedent. The third step applies four constraints, i.e. rules to which a DE must conform in order to be a valid candidate: (a) semantic type agreement; (b) number agreement; (c) projection constraint; (d) conceptual compatibility. Constraints (a), (b), and (c) are linguistic, while (d) is domain-related. The algorithm has been tested on three texts. Applying (d) before (a), (b), and (c) turns out to improve efficiency considerably.
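The three steps and four constraints described above can be illustrated with a minimal Python sketch. It is not the Plinius implementation: the `DE` fields, the `IS_A` concept table, the tie-break rule, and the example entities are all hypothetical, and the `projection` function is only a stand-in, since the abstract does not define the projection constraint. Step (i) is reduced here to a hand-built list rather than one assembled from parser output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DE:
    """A discourse entity. All fields are assumed, simplified stand-ins."""
    text: str
    sem_type: str   # coarse semantic type
    number: str     # "sg" or "pl"
    position: int   # linear order in the text
    concept: str    # hypothetical lexicon concept; for an anaphor, the
                    # concept its context is assumed to require
    anaphoric: bool = False


Constraint = Callable[[DE, DE], bool]

# Hypothetical fragment of domain knowledge: concept -> parent concept.
IS_A: Dict[str, str] = {"ceramic": "material", "temperature": "property"}


def type_agreement(ana: DE, cand: DE) -> bool:          # constraint (a)
    return ana.sem_type == cand.sem_type


def number_agreement(ana: DE, cand: DE) -> bool:        # constraint (b)
    return ana.number == cand.number


def projection(ana: DE, cand: DE) -> bool:              # constraint (c)
    # Stand-in only: here the candidate merely has to precede the anaphor.
    return cand.position < ana.position


def conceptual_compatibility(ana: DE, cand: DE) -> bool:  # constraint (d)
    return cand.concept == ana.concept or IS_A.get(cand.concept) == ana.concept


def filter_candidates(ana: DE, des: List[DE],
                      constraints: List[Constraint]) -> Tuple[List[DE], int]:
    """Apply constraints in order, stopping at the first failure per candidate."""
    survivors, checks = [], 0
    for cand in des:
        if cand is ana:
            continue
        for constraint in constraints:
            checks += 1
            if not constraint(ana, cand):
                break
        else:
            survivors.append(cand)
    return survivors, checks


def resolve(des: List[DE],
            constraints: List[Constraint]) -> Tuple[Dict[str, Optional[str]], int]:
    """Steps (ii) Identify and (iii) Select; also counts constraint checks."""
    antecedents: Dict[str, Optional[str]] = {}
    total_checks = 0
    for ana in (d for d in des if d.anaphoric):
        survivors, checks = filter_candidates(ana, des, constraints)
        total_checks += checks
        # Assumed tie-break: prefer the most recent surviving candidate.
        best = max(survivors, key=lambda d: d.position) if survivors else None
        antecedents[ana.text] = best.text if best else None
    return antecedents, total_checks


# Step (i) Assemble, replaced by a hand-built toy list of DEs.
des = [
    DE("the ceramic", "thing", "sg", 0, "ceramic"),
    DE("high temperatures", "thing", "pl", 1, "temperature"),
    DE("the furnace", "thing", "sg", 2, "equipment"),
    DE("its", "thing", "sg", 3, "material", anaphoric=True),
]

LINGUISTIC_FIRST = [type_agreement, number_agreement, projection,
                    conceptual_compatibility]
DOMAIN_FIRST = [conceptual_compatibility, type_agreement, number_agreement,
                projection]

print(resolve(des, LINGUISTIC_FIRST))
print(resolve(des, DOMAIN_FIRST))
```

Both orderings select the same antecedent, because the constraints are applied conjunctively; only the number of checks differs. The counter makes it possible to compare orderings on a given input, but the figures for this toy list are illustrative only and say nothing about the three test texts.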