ISBN-13: 9783659444159 / English / Paperback / 2013 / 164 pp.
Blog classification is the task of assigning blogs to pre-defined categories. It can be approached using either the textual content of the blogs or their surrounding features. This study focuses on textual content, using the blog title and posts for classification. Categorised blogs are maintained in a blog directory that serves users searching for information online. Such online blog directories rely on human indexers to categorise blog pages, and manual classification of blogs tends to be labour intensive and time consuming. In related fields such as text mining and web mining, various supervised, semi-supervised and unsupervised classification methods have been proposed. These studies use a Bag-of-Words representation of text documents, indexed with a term weighting scheme, which does not capture semantic relatedness. We devised a novel framework for automatic topic-based blog classification, denoted the Terms-to-Concepts-to-Category framework.
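
The limitation of Bag-of-Words with term weighting can be illustrated with a minimal sketch. It is not the book's Terms-to-Concepts-to-Category framework. It assumes a plain TF-IDF weighting and cosine similarity, and the sample posts and function names (`tfidf_vectors`, `cosine`) are hypothetical, chosen only for illustration. Two posts about the same topic that use different vocabulary ("car" versus "automobile") share no terms, so their similarity is zero.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical blog post snippets (illustrative only).
posts = [
    "cheap car repair tips",
    "affordable automobile maintenance advice",  # same topic, different words
    "cheap car insurance tips",                  # overlapping words
]
vecs = tfidf_vectors(posts)
sim_synonyms = cosine(vecs[0], vecs[1])  # no shared terms
sim_shared = cosine(vecs[0], vecs[2])    # shares "cheap", "car", "tips"
print(sim_synonyms, sim_shared)
```

Here the first two posts are semantically related yet score zero, while the first and third score higher purely because of word overlap. Mapping terms to shared concepts before classification is one way to address this gap.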