This book is the result of a group of researchers from different disciplines asking themselves one question: what does it take to develop a computer interface that listens, talks, and can answer questions in a domain? First, obviously, it takes specialized modules for speech recognition and synthesis, human interaction management (dialogue, input fusion, and multimodal output fusion), basic question understanding, and answer finding. While each of these modules is researched as an independent subfield, this book describes the development of state-of-the-art modules and their integration into a...
The digital age has had a profound effect on our cultural heritage and on the academic research that studies it. Staggering numbers of objects, many of them textual in nature, are being digitised to make them more readily accessible to experts and laypersons alike. Besides a vast potential for more effective and efficient preservation, management, and presentation, digitisation offers opportunities to work with cultural heritage data in ways that were never before feasible or even imagined.
To explore and exploit these possibilities, an interdisciplinary approach is needed, bringing together...
The impact of computer systems that can understand natural language will be tremendous. To develop this capability, we need to be able to analyze large amounts of text automatically and efficiently. Manually devised rules cannot provide the coverage needed to handle the complex structure of natural language, necessitating systems that learn automatically from examples. To handle the flexibility of natural language, it has become standard practice to use statistical models, which assign probabilities, for example, to the different meanings of a word or the plausibility of grammatical...
Current language technology is dominated by approaches that either enumerate large sets of rules or rely on large amounts of manually labelled data. Creating either is time-consuming and expensive, which is commonly thought to be the reason why automated natural language understanding has still not made its way into "real-life" applications.
This book sets an ambitious goal: to shift the development of language processing systems to a far more automated setting than in previous work. A new approach is defined: what if computers analysed large samples of language...
Information extraction (IE) and text summarization (TS) are powerful technologies for finding relevant pieces of information in text and presenting them to the user in condensed form. The ongoing information explosion makes IE and TS critical for functioning successfully within the information society.
These technologies face particular challenges due to the inherent multi-source nature of the information explosion. The technologies must now handle not isolated texts or individual narratives, but rather large-scale repositories and streams---in general, in multiple...
Collaboratively Constructed Language Resources (CCLRs) such as Wikipedia, Wiktionary, Linked Open Data, and various resources developed using crowdsourcing techniques such as Games with a Purpose and Mechanical Turk have substantially contributed to the research in natural language processing (NLP). Various NLP tasks utilize such resources to substitute for or supplement conventional lexical semantic resources and linguistically annotated corpora. These resources also provide an extensive body of texts from which valuable knowledge is mined. There are an increasing number of community...
Research in Natural Language Processing (NLP) has rapidly advanced in recent years, resulting in exciting algorithms for sophisticated processing of text and speech in various languages. Much of this work focuses on English; in this book we address another group of interesting and challenging languages for NLP research: the Semitic languages. The Semitic group of languages includes Arabic (206 million native speakers), Amharic (27 million), Hebrew (7 million), Tigrinya (6.7 million), Syriac (1 million), and Maltese (419 thousand). Semitic languages exhibit unique morphological processes,...