This book describes new methodological and technological approaches to corpus building and presents recent research based on the "Norwegian Newspaper Corpus", a large monitor corpus of contemporary Norwegian compiled through daily harvesting of web newspapers. The book gives an overview of the corpus and its system architecture, and presents tools used for tasks such as text harvesting, annotation, topic classification, and the extraction and frequency profiling of new words and phrases. Among the innovative technologies is Corpuscle, a corpus query engine and management system...