ISBN-13: 9780857291974 / English / Hardcover / 2011 / 297 pp.
This text reviews the issues involved in handling and processing digital documents. Examining the full range of a document's lifetime, the book covers acquisition, representation, security, pre-processing, layout analysis, understanding, analysis of single components, information extraction, filing, indexing and retrieval.

Features: provides a list of acronyms and a glossary of technical terms; contains appendices covering key concepts in machine learning and providing a case study on building an intelligent system for digital document and library management; discusses issues of security and legal aspects of digital documents; examines core issues of document image analysis, and image processing techniques of particular relevance to digitized documents; reviews the resources available for natural language processing, in addition to techniques of linguistic analysis for content handling; investigates methods for extracting and retrieving data/information from a document.