The description, automatic identification and further processing of web genres is a novel field of research in computational linguistics, NLP and related areas such as text-technology, digital humanities and web mining. One of the driving forces behind this research is the idea of genre-enabled search engines which enable users to additionally specify web genres that the documents to be retrieved should comply with (e.g., personal homepage, weblog, scientific article etc.). This book offers a thorough foundation of this upcoming field of research on web genres and document types in web-based...
The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and...
A Frequency Dictionary of Russian is an invaluable tool for all learners of Russian, providing a list of the 5,000 most frequently used words in the language and the 300 most frequent multiword constructions. The dictionary is based on data from a 150-million-word internet corpus taken from more than 75,000 webpages and covering a range of text types from news and journalistic articles, research papers, administrative texts and fiction. All entries in the rank frequency list feature the English equivalent, a sample sentence with English translation, a part of speech indication, indication of...
Here is the first comprehensive resource on the use of comparable corpora in multilingual Natural Language Processing, which goes beyond such techniques as machine translation and terminology mining to utilize non-parallel texts in the same domain.