A Proposal for Improving the Measurement of Parse Accuracy

Sampson, Geoffrey

doi:10.1075/ijcl.5.1.04sam

Article published in:

International Journal of Corpus Linguistics
Vol. 5:1 (2000) ► pp. 53–68

Geoffrey Sampson | School of Cognitive and Computing Sciences, University of Sussex, England

Widespread dissatisfaction has been expressed with the measure of parse accuracy used in the Parseval programme, which is based on the location of constituent boundaries. Scores on the Parseval metric are perceived as poorly correlated with intuitive judgments of goodness of parse; the metric applies only to a restricted range of grammar formalisms; and it is seen as divorced from applications of NLP technology. The present paper defines an alternative metric, which measures the accuracy with which successive words are fitted into parse trees. (The original statement of this metric is believed to have been the earliest published proposal about quantifying parse accuracy.) The metric defined here gives overall scores that quantify intuitive concepts of good and bad parsing relatively directly, and it gives scores for individual words, which enable the location of parsing errors to be pinpointed. It applies to a wider range of grammar formalisms than Parseval does, and is tunable for specific parsing applications.
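
The idea of scoring each word by how well it is fitted into the tree can be illustrated with a small sketch. The abstract does not give the formula, so everything below is an assumption made for illustration: trees are nested tuples, each word's "lineage" is the list of node labels from the root down to that word, and a word's score is one minus the unit-cost edit distance between its gold and candidate lineages, divided by the longer lineage length. The names `lineages`, `edit_distance`, `word_scores` and `parse_score` are hypothetical, and the paper's own definition may differ in its costs and normalisation.

```python
from typing import List, Sequence, Tuple, Union

# A tree is either a word (str) or a tuple (label, child, child, ...).
Tree = Union[str, tuple]


def lineages(tree: Tree, ancestors: Tuple[str, ...] = ()) -> List[Tuple[str, List[str]]]:
    """Return (word, labels from root down to the word) for each word, left to right."""
    if isinstance(tree, str):
        return [(tree, list(ancestors))]
    label, *children = tree
    result = []
    for child in children:
        result.extend(lineages(child, ancestors + (label,)))
    return result


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance with unit costs (an assumed cost scheme)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def word_scores(gold: Tree, candidate: Tree) -> List[Tuple[str, float]]:
    """Score each word in [0, 1]; 1.0 means the two lineages are identical."""
    g, c = lineages(gold), lineages(candidate)
    if [w for w, _ in g] != [w for w, _ in c]:
        raise ValueError("gold and candidate trees must cover the same words")
    scores = []
    for (word, g_line), (_, c_line) in zip(g, c):
        longest = max(len(g_line), len(c_line), 1)
        scores.append((word, 1 - edit_distance(g_line, c_line) / longest))
    return scores


def parse_score(gold: Tree, candidate: Tree) -> float:
    """Overall score for a parse: the mean of its per-word scores."""
    scores = word_scores(gold, candidate)
    return sum(s for _, s in scores) / len(scores)


gold = ("S", ("NP", "dogs"), ("VP", "chase", ("NP", "cats")))
candidate = ("S", ("NP", "dogs"), ("VP", "chase"), ("NP", "cats"))
per_word = word_scores(gold, candidate)
overall = parse_score(gold, candidate)
```

In this toy example the candidate attaches "cats" outside the verb phrase, so only that word scores below 1.0. This is the sense in which per-word scores can point to where a parse went wrong, whereas a single bracket-matching figure would not.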

Keywords: leaf-ancestor assessment, parser evaluation, Parseval, parse-evaluation metric

Published online: 13 June 2001

Cited by (13)

Gao, Zhao-Ming, Chu-Ren Huang & Keh-Jiann Chen

2023. Introduction to CKIP’s Language Resources and Their Applications. In Chinese Language Resources [Text, Speech and Language Technology, 49], ► pp. 27 ff.

Heeringa, Wilbert, Femke Swarte, Anja Schüppert & Charlotte Gooskens

2018. Measuring syntactical variation in Germanic texts. Digital Scholarship in the Humanities 33:2 ► pp. 279 ff.

Heeringa, Wilbert & Jelena Prokić

2017. Computational Dialectology. In The Handbook of Dialectology, ► pp. 330 ff.

Baisa, Vít & Vojtěch Kovář

2014. Information Extraction for Czech Based on Syntactic Analysis. In Human Language Technology Challenges for Computer Science and Linguistics [Lecture Notes in Computer Science, 8387], ► pp. 155 ff.

Dégremont, Cédric, Antoine Venant & Nicholas Asher

2014. Semantic Similarity: Foundations. In New Frontiers in Artificial Intelligence [Lecture Notes in Computer Science, 8417], ► pp. 17 ff.

Jakubíček, Miloš & Vojtěch Kovář

2013. Enhancing Czech Parsing with Verb Valency Frames. In Computational Linguistics and Intelligent Text Processing [Lecture Notes in Computer Science, 7816], ► pp. 282 ff.

Wiersma, W., J. Nerbonne & T. Lauttamus

2011. Automatically Extracting Typical Syntactic Differences from Corpora. Literary and Linguistic Computing 26:1 ► pp. 107 ff.

Sampson, Geoffrey & Anna Babarczy

2008. Definitional and human constraints on structural annotation of English. Natural Language Engineering 14:4 ► pp. 471 ff.

Higgins, Derrick

2007. International Conference on Semantic Computing (ICSC 2007), ► pp. 501 ff.

Horák, Aleš, Tomáš Holan, Vladimír Kadlec & Vojtěch Kovář

2007. Dependency and Phrasal Parsers of the Czech Language: A Comparison. In Text, Speech and Dialogue [Lecture Notes in Computer Science, 4629], ► pp. 76 ff.

Carroll, John, Guido Minnen & Ted Briscoe

2003. Parser Evaluation. In Treebanks [Text, Speech and Language Technology, 20], ► pp. 299 ff.

Montemagni, Simonetta, Francesco Barsotti, Marco Battista, Nicoletta Calzolari, Ornella Corazzari, Alessandro Lenci, Antonio Zampolli, Francesca Fanciulli, Maria Massetani, Remo Raffaelli, Roberto Basili, Maria Teresa Pazienza, Dario Saracino, Fabio Zanzotto, Nadia Mana, Fabio Pianesi & Rodolfo Delmonte

2003. Building the Italian Syntactic-Semantic Treebank. In Treebanks [Text, Speech and Language Technology, 20], ► pp. 189 ff.

This list is based on CrossRef data as of 4 July 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.