Natural Language Processing for Online Applications

Text retrieval, extraction and categorization

Second revised edition

| Thomson Corporation
| Thomson Corporation
ISBN 9789027249920 | EUR 105.00 | USD 158.00
ISBN 9789027249937 | EUR 33.00 | USD 49.95
ISBN 9789027292445 | EUR 105.00/33.00* | USD 158.00/49.95*
This text covers the technologies of document retrieval, information extraction, and text categorization in a way that highlights their commonalities, both in general principles and in practical concerns. It assumes some mathematical background on the part of the reader, but the chapters typically begin with a non-mathematical account of the key issues. Current research topics are covered only to the extent that they inform current applications; detailed coverage of longer-term research and more theoretical treatments should be sought elsewhere, and the many pointers at the ends of the chapters let the reader explore the literature. The book does, however, maintain a strong emphasis on evaluation in every chapter, in terms of both methodology and the results of controlled experimentation.

This title replaces Natural Language Processing for Online Applications: Text retrieval, extraction and categorization (2002)

[Natural Language Processing, 5]  2007.  x, 232 pp.
Publishing status: Available

Cited by 64 other publications

No author info given
2011. In Data Mining, pp. 510 ff.
No author info given
2019. In Data Mining, pp. 607 ff.
Aboalnaser, Sara A.
2019. In 2019 12th International Conference on Developments in eSystems Engineering (DeSE), pp. 290 ff.
Ansari, Md Tarique Jamal & Naseem Ahmad Khan
2021. Worldwide COVID-19 Vaccines Sentiment Analysis Through Twitter Content. Electronic Journal of General Medicine 18:6 pp. em329 ff.
Anzalone, Salvatore M., Yuichiro Yoshikawa, Hiroshi Ishiguro, Emanuele Menegatti, Enrico Pagello & Rosario Sorbello
2012. In Simulation, Modeling, and Programming for Autonomous Robots [Lecture Notes in Computer Science, 7628], pp. 4 ff.
Anzalone, Salvatore Maria, Y. Yoshikawa, Hiroshi Ishiguro, Emanuele Menegatti, Enrico Pagello & Rosario Sorbello
2013. In Intelligent Autonomous Systems 12 [Advances in Intelligent Systems and Computing, 194], pp. 383 ff.
Arora, Chetan, Mehrdad Sabetzadeh, Shiva Nejati & Lionel Briand
2019. An Active Learning Approach for Improving the Accuracy of Automated Domain Model Extraction. ACM Transactions on Software Engineering and Methodology 28:1 pp. 1 ff.
Ashley, Kevin D. & Stefanie Brüninghaus
2009. Automatically classifying case texts and predicting outcomes. Artificial Intelligence and Law 17:2 pp. 125 ff.
Banchs, Rafael E. & Carlos G. Rodríguez Penagos
2013. In Emerging Applications of Natural Language Processing, pp. 230 ff.
Banchs, Rafael E. & Carlos G. Rodríguez Penagos
2013. In Small and Medium Enterprises, pp. 1945 ff.
Baraibar-Diez, Elisa, Manuel Luna, María D. Odriozola & Ignacio Llorente
2020. Mapping Social Impact: A Bibliometric Analysis. Sustainability 12:22 pp. 9389 ff.
Blackburn, Timothy D., Thomas A. Mazzuchi & Shahram Sarkani
2011. Overcoming Inherent Limits to Pharmaceutical Manufacturing Quality Performance with QbD (Quality by Design). Journal of Pharmaceutical Innovation 6:2 pp. 69 ff.
Bobicev, Victoria, Marina Sokolova, Khaled El Emam, Yasser Jafer, Brian Dewar, Elizabeth Jonker & Stan Matwin
2013. Can Anonymous Posters on Medical Forums be Reidentified? Journal of Medical Internet Research 15:10 pp. e215 ff.
Bonino, Dario, Alberto Ciaramella & Fulvio Corno
2010. Review of the state-of-the-art in patent information and forthcoming evolutions in intelligent patent informatics. World Patent Information 32:1 pp. 30 ff.
Cahill, Maria, Soohyung Joo & Kathleen Campana
2018. Language investigations of children's information sources: A research agenda. Proceedings of the Association for Information Science and Technology 55:1 pp. 56 ff.
Cahill, Maria, Soohyung Joo & Kathleen Campana
2020. Analysis of language use in public library storytimes. Journal of Librarianship and Information Science 52:2 pp. 476 ff.
Canan Pembe, F. & Tunga Güngör
2009. Structure‐preserving and query‐biased document summarisation for web searching. Online Information Review 33:4 pp. 696 ff.
Carvalho, Joao P., Fernando Batista & Luisa Coheur
2012. In 2012 IEEE International Conference on Fuzzy Systems, pp. 1 ff.
Chantar, Hamouda, Majdi Mafarja, Hamad Alsawalqah, Ali Asghar Heidari, Ibrahim Aljarah & Hossam Faris
2020. Feature selection using binary grey wolf optimizer with elite-based crossover for Arabic text classification. Neural Computing and Applications 32:16 pp. 12201 ff.
Cheng, Li & Alei Liang
2013. In Proceedings of 2013 3rd International Conference on Computer Science and Network Technology, pp. 174 ff.
Chukharev-Hudilainen, Evgeny & Aysel Saricaoglu
2016. Causal discourse analyzer: improving automated feedback on academic ESL writing. Computer Assisted Language Learning 29:3 pp. 494 ff.
Cohen, K. Bretonnel & Lawrence Hunter
2008. Getting Started in Text Mining. PLoS Computational Biology 4:1 pp. e20 ff.
Csányi, Gergely & Tamás Orosz
2021. Comparison of data augmentation methods for legal document classification. Acta Technica Jaurinensis 15:1 pp. 15 ff.
Daniel, Gwendal & Jordi Cabot
2021. In 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pp. 324 ff.
Daniel, Gwendal, Jordi Cabot, Laurent Deruelle & Mustapha Derras
2019. In Advanced Information Systems Engineering [Lecture Notes in Computer Science, 11483], pp. 177 ff.
Daniel, Gwendal, Jordi Cabot, Laurent Deruelle & Mustapha Derras
2020. Xatkit: A Multimodal Low-Code Chatbot Development Framework. IEEE Access 8 pp. 15332 ff.
Farrell, Treasa & Nick Rushby
2016. Assessment and learning technologies: An overview. British Journal of Educational Technology 47:1 pp. 106 ff.
Gardoň, Andrej & Aleš Horák
2011. In Text, Speech and Dialogue [Lecture Notes in Computer Science, 6836], pp. 323 ff.
Geist, Anton
2009. Using Citation Analysis Techniques for Computer-Assisted Legal Research in Continental Jurisdictions. SSRN Electronic Journal.
Gibert, Marcin
2015. In Computational Collective Intelligence [Lecture Notes in Computer Science, 9330], pp. 648 ff.
Huijnen, Pim, Fons Laan, Maarten de Rijke & Toine Pieters
2014. In Social Informatics [Lecture Notes in Computer Science, 8359], pp. 71 ff.
Itahriouan, Zakaria, Nisserine El Bahri, Samir Brahim Belhaouari, Hajji Tarik & Mohamed Ouazzani Jamil
2021. In Artificial Intelligence and Industrial Applications [Lecture Notes in Networks and Systems, 144], pp. 110 ff.
Kang, Jingjing, Tao Liu, He Hu & Xiaoyong Du
2011. In 2011 Sixth Annual Chinagrid Conference, pp. 60 ff.
Kejriwal, Mayank, Daniel Gilley, Pedro Szekely & Jill Crisman
2018. In Companion of the The Web Conference 2018 on The Web Conference 2018 - WWW '18, pp. 147 ff.
Krallinger, Martin, Obdulia Rabal, Anália Lourenço, Julen Oyarzabal & Alfonso Valencia
2017. Information Retrieval and Text Mining Technologies for Chemistry. Chemical Reviews 117:12 pp. 7673 ff.
Kucuk, Dilek & Adnan Yazici
2008. In 2008 23rd International Symposium on Computer and Information Sciences, pp. 1 ff.
Kusumadewi, Sri, Chanifah Indah Ratnasari & Linda Rosita
2015. In 2015 International Conference on Science and Technology (TICST), pp. 292 ff.
Küçük, Dilek & Adnan Yazıcı
2011. Exploiting information extraction techniques for automatic semantic video indexing with an application to Turkish news videos. Knowledge-Based Systems 24:6 pp. 844 ff.
Lai, Kaitao, Natalie Twine, Aidan O’Brien, Yi Guo & Denis Bauer
2019. In Encyclopedia of Bioinformatics and Computational Biology, pp. 272 ff.
Liszka, Kathy J., Chien-Chung Chan & Chandra Shekar
2012. In Social Network Mining, Analysis, and Research Trends, pp. 101 ff.
Liszka, Kathy J., Chien-Chung Chan & Chandra Shekar
2013. In Data Mining, pp. 1407 ff.
Lunn, Stephanie, Jia Zhu & Monique Ross
2020. In 2020 IEEE Frontiers in Education Conference (FIE), pp. 1 ff.
More, Joaquim, David Baneres, Jordi Conesa & Montse Junyent
2014. In 2014 International Conference on Intelligent Networking and Collaborative Systems, pp. 480 ff.
Oleshchuk, Vladimir & Vitaly Klyuev
2009. In 2009 IEEE International Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, pp. 561 ff.
O’Shea, James, Zuhair Bandar & Keeley Crockett
2011. In Intelligence-Based Systems Engineering [Intelligent Systems Reference Library, 10], pp. 201 ff.
Pérez-Soler, Sara, Gwendal Daniel, Jordi Cabot, Esther Guerra & Juan de Lara
2020. In Enterprise, Business-Process and Information Systems Modeling [Lecture Notes in Business Information Processing, 387], pp. 257 ff.
Rebelo, Francisco, Carlos Soares & Rosaldo J. F. Rossetti
2015. In 2015 IEEE First International Smart Cities Conference (ISC2), pp. 1 ff.
Seki, Kazuhiro & Javed Mostafa
2008. Gene ontology annotation as text categorization: An empirical study. Information Processing & Management 44:5 pp. 1754 ff.
Shin, Teo Yon, Yuan Zihong, Ng Wee Siong, Zhang Yangfan & Valerie Phangt
2017. In 2017 International Conference on Asian Language Processing (IALP), pp. 99 ff.
Soni, Mukesh, S. Gomathi & Yagna Bhupendra Kumar Adhyaru
2020. In 2020 7th International Conference on Smart Structures and Systems (ICSSS), pp. 1 ff.
Stanković, Ranka, Cvetana Krstev, Ivan Obradović & Olivera Kitanović
2015. In Semantic Keyword-Based Search on Structured Data Sources [Lecture Notes in Computer Science, 9398], pp. 167 ff.
Stanković, Ranka, Cvetana Krstev, Ivan Obradović & Olivera Kitanović
2017. In Transactions on Computational Collective Intelligence XXVI [Lecture Notes in Computer Science, 10190], pp. 162 ff.
Sulieman, Lina, David Gilmore, Christi French, Robert M. Cronin, Gretchen Purcell Jackson, Matthew Russell & Daniel Fabbri
2017. Classifying patient portal messages using Convolutional Neural Networks. Journal of Biomedical Informatics 74 pp. 59 ff.
Sánchez-Cervantes, José Luis, Giner Alor-Hernández, Mario Andrés Paredes-Valverde, Lisbeth Rodríguez-Mazahua & Rafael Valencia-García
2021. NaLa-Search: A multimodal, interaction-based architecture for faceted search on linked open data. Journal of Information Science 47:6 pp. 753 ff.
Takemiya, Makoto, Kei Majima, Mitsuaki Tsukamoto & Yukiyasu Kamitani
2016. BrainLiner: A Neuroinformatics Platform for Sharing Time-Aligned Brain-Behavior Data. Frontiers in Neuroinformatics 10.
Talukder, Md Ashraful Islam, Sheikh Abujar, Abu Kaisar Mohammad Masum, Sharmin Akter & Syed Akhter Hossain
2020. In 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1 ff.
Thessen, Anne E., Cynthia Sims Parr & Luis M. Rocha
2014. Knowledge Extraction and Semantic Annotation of Text from the Encyclopedia of Life. PLoS ONE 9:3 pp. e89550 ff.
Tomašev, Nenad
2017. Extracting the patterns of truthfulness from political information systems in Serbia. Information Systems Frontiers 19:1 pp. 109 ff.
Vollero, Agostino, Domenico Sardanelli & Alfonso Siano
2021. Exploring the role of the Amazon effect on customer expectations: An analysis of user‐generated content in consumer electronics retailing. Journal of Consumer Behaviour.
Vollero, Agostino, Alfonso Siano & Domenico Sardanelli
2020. In Advances in Digital Marketing and eCommerce [Springer Proceedings in Business and Economics], pp. 188 ff.
Yeshambel, Tilahun, Josiane Mothe & Yaregal Assabie
2022. Amharic Adhoc Information Retrieval System Based on Morphological Features. Applied Sciences 12:3 pp. 1294 ff.
Yoon, Sunmoo, Noémie Elhadad & Suzanne Bakken
2013. A Practical Approach for Content Mining of Tweets. American Journal of Preventive Medicine 45:1 pp. 122 ff.
Zhang, Lishan & Kurt VanLehn
2017. Adaptively selecting biology questions generated from a semantic network. Interactive Learning Environments 25:7 pp. 828 ff.
Zhao, Qianqian, Kai Chen, Tongxin Li, Yi Yang & XiaoFeng Wang
2018. Detecting telecommunication fraud by understanding the contents of a call. Cybersecurity 1:1.

This list is based on CrossRef data as of 28 July 2022. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.



For errata please go to

Subjects & Metadata
BIC Subject: UYQL – Natural language & machine translation
BISAC Subject: COM042000 – COMPUTERS / Natural Language Processing
U.S. Library of Congress Control Number: 2007010559