A Proposal for Improving the Measurement of Parse Accuracy
Geoffrey Sampson | School of Cognitive and Computing Sciences, University of Sussex, England
Widespread dissatisfaction has been expressed with the measure of parse accuracy used in the Parseval programme, based on the location of constituent boundaries. Scores on the Parseval metric are perceived as poorly correlated with intuitive judgments of goodness of parse; the metric applies only to a restricted range of grammar formalisms; and it is seen as divorced from applications of NLP technology. The present paper defines an alternative metric, which measures the accuracy with which successive words are fitted into parse trees. (The original statement of this metric is believed to have been the earliest published proposal about quantifying parse accuracy.) The metric defined here gives overall scores that quantify intuitive concepts of good and bad parsing relatively directly, and it gives scores for individual words which enable the location of parsing errors to be pinpointed. It applies to a wider range of grammar formalisms, and is tunable for specific parsing applications.
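For orientation, the boundary-based scoring that the abstract attributes to Parseval is commonly computed as precision and recall over labelled constituent spans. The following is a minimal sketch under that assumption, not the Parseval programme's own tooling and not the alternative metric this paper defines. The function name `bracket_scores`, the `(label, start, end)` span encoding, and the two toy parses are all hypothetical.

```python
from collections import Counter


def bracket_scores(gold, test):
    """Return (precision, recall) over labelled constituent spans.

    Each parse is a list of (label, start, end) tuples, where start and
    end are word offsets. This encoding is an assumption for illustration.
    """
    gold_counts = Counter(gold)
    test_counts = Counter(test)
    # Multiset intersection: a span is matched only as often as it occurs in both parses.
    matched = sum((gold_counts & test_counts).values())
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical five-word sentence: the candidate parse misplaces one NP boundary.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
test = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 4, 5)]

precision, recall = bracket_scores(gold, test)
print(precision, recall)
```

In this toy case a single misplaced boundary lowers both scores, but the two numbers alone do not indicate which word was attached wrongly. That is the kind of limitation a metric with per-word scores is intended to address.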