Recent Advances in Natural Language Processing III

Selected papers from RANLP 2003

Editors

Nicolas Nicolov | IBM T.J. Watson Research Center

Kalina Bontcheva | University of Sheffield

Galia Angelova | Bulgarian Academy of Sciences

Ruslan Mitkov | University of Wolverhampton

Hardbound – Available

ISBN 9789027247742 (Eur) | EUR 110.00

ISBN 9781588116185 (USA) | USD 165.00

e-Book –

ISBN 9789027294685 | EUR 110.00 | USD 165.00

Netlibrary e-Book – Not for resale

ISBN 9781423766469

This volume brings together revised versions of a selection of papers presented at the 2003 International Conference on “Recent Advances in Natural Language Processing”. A wide range of topics is covered in the volume: semantics, dialogue, summarization, anaphora resolution, shallow parsing, morphology, part-of-speech tagging, named entity recognition, question answering, word sense disambiguation, and information extraction. Various ‘state-of-the-art’ techniques are explored: finite state processing, machine learning (support vector machines, maximum entropy, decision trees, memory-based learning, inductive logic programming, transformation-based learning, perceptrons), latent semantic analysis, and constraint programming. The papers address different languages (Arabic, English, German, Slavic languages) and use different linguistic frameworks (HPSG, LFG, constraint-based DCG).
This book will be of interest to those who work in computational linguistics, corpus linguistics, human language technology, translation studies, cognitive science, psycholinguistics, artificial intelligence, and informatics.

[Current Issues in Linguistic Theory, 260] 2004. xii, 402 pp.

Publishing status: Available

© John Benjamins Publishing Company

https://doi.org/10.1075/cilt.260

Table of Contents

Editors’ Foreword | p. ix

I. Invited lectures

A type-theoretic approach to anaphora and ellipsis resolution

Chris Fox and Shalom Lappin | p. 1

Human dialogue modelling using machine learning

Yorick Wilks, Nick Webb, Andrea Setzer, Mark Hepple and Roberta Catizone | p. 17

Learning domain theories

Stephen G. Pulman and Maria Liakata | p. 29

Recent developments in temporal information extraction

Inderjeet Mani | p. 45

Annotation-based finite state processing in a large-scale NLP architecture

Branimir K. Boguraev | p. 61

II. Lexical semantics and lexical knowledge acquisition

Acquiring lexical paraphrases from a single corpus

Oren Glickman and Ido Dagan | p. 81

Multi-word collocation extraction by syntactic composition of collocation bigrams

Violeta Seretan, Luka Nerima and Eric Wehrli | p. 91

Combining independent modules in lexical multiple-choice problems

Peter D. Turney, Michael L. Littman, Jeffrey Bigham and Victor Shnayder | p. 101

Roget’s thesaurus and semantic similarity

Mario Jarmasz and Stan Szpakowicz | p. 111

Clustering WordNet word senses

Eneko Agirre and Oier Lopez de Lacalle | p. 121

Inducing hyperlinking rules in text collections

Roberto Basili, Maria Teresa Pazienza and Fabio Massimo Zanzotto | p. 131

Near-synonym choice in natural language generation

Diana Zaiu Inkpen and Graeme Hirst | p. 141

III. Tagging, parsing and syntax

Fast and accurate part-of-speech tagging: The SVM approach revisited

Jesús Giménez and Lluís Màrquez | p. 153

Part-of-speech tagging with minimal lexicalization

Virginia Savova and Leon Peshkin | p. 163

Accurate annotation: An efficiency metric

António Branco and João Silva | p. 173

Structured parameter estimation for LFG-DOP

Mary Hearne and Khalil Sima’an | p. 183

Parsing without grammar — Using complete trees instead

Sandra Kübler | p. 193

Phrase recognition by filtering and ranking with perceptrons

Xavier Carreras and Lluís Màrquez | p. 205

Cascaded finite-state partial parsing: A larger-first approach

Sebastian van Delden and Fernando Gomez | p. 217

A constraint-based bottom-up counterpart to definite clause grammars

Henning Christiansen | p. 227

IV. Information extraction

Using parallel texts to improve recall in botany

Mary McGee Wood, Susannah J. Lydon, Valentin Tablan, Diana Maynard and Hamish Cunningham | p. 237

Marking atomic events in sets of related texts

Elena Filatova and Vasileios Hatzivassiloglou | p. 247

Semantically driven approach for scenario recognition in the IE system FRET

Svetla Boytcheva, Milena Yankova and Albena Strupchanska | p. 257

A framework for named entity recognition in the open domain

Richard J. Evans | p. 267

V. Text summarisation and document processing

Latent semantic analysis and the construction of coherent extracts

Tristan Miller | p. 277

Facilitating email thread access by extractive summary generation

Ani Nenkova and Amit Bagga | p. 287

Towards deeper understanding of the latent semantic analysis performance

Preslav Nakov, Elena Valchanova and Galia Angelova | p. 297

Automatic linking of similar texts across languages

Bruno Pouliquen, Ralf Steinberger and Camelia Ignat | p. 307

VI. Other NLP topics

Verb phrase ellipsis detection using machine learning techniques

Leif Arda Nielsen | p. 317

HPSG-based annotation scheme for corpora development and parsing evaluation

Kiril Iv. Simov | p. 327

Arabic morpho-syntax for text-to-speech

Allan Ramsay and Hanady Mansour | p. 337

Guessing morphological classes of unknown German nouns

Preslav Nakov, Yury Bonev, Galia Angelova, Evelyn Gius and Walther von Hahn | p. 347

Building sense tagged corpora with volunteer contributions over the Web

Rada Mihalcea and Timothy Chklovski | p. 357

Reducing false positives by expert combination in automatic keyword indexing

Anette Hulth | p. 367

Socrates: A question answering prototype for Bulgarian

Hristo T. Tanev | p. 377

Unsupervised natural language disambiguation using non-ambiguous words

Rada Mihalcea | p. 387

List of Contributors | p. 397

Subjects

Linguistics

Theoretical linguistics

Natural language processing

Main BIC Subject

CF: Linguistics

Main BISAC Subject

LAN009000: LANGUAGE ARTS & DISCIPLINES / Linguistics / General

ONIX Metadata

U.S. Library of Congress Control Number: 2004062362