ISBN-13: 9786202072410 / English / Paperback / 2017 / 72 pp.
In recent years, with the massive expansion of the information society, the Web has become a valuable source of information for almost every domain of knowledge. This has led many researchers to consider the Web a legitimate repository for Information Retrieval (IR) and knowledge acquisition tasks. The Web holds a massive amount of information on almost every domain and, owing to its high redundancy, can serve as a valid knowledge source for similarity computation. As a result, text mining systems face a huge number of attributes. Knowledge discovery in databases systems require input texts to be represented as a set of attributes before they can process them. This text-to-representation method is known as text or document indexing, and the attributes are called indexes. Indexing is a critical task in text mining because it has to represent the information in the text with minimal loss of semantics for later use.
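
To illustrate what document indexing can look like in practice, here is a minimal Python sketch that turns each text into a set of attributes (terms) with their frequencies. It is an assumption for illustration only, not the method described in the book: the function name `index_document`, the tokenization rule, the stopword list, and the sample sentences are all hypothetical.

```python
import re
from collections import Counter


def index_document(text, stopwords=frozenset()):
    """Map a text to a term -> frequency index (a simple bag-of-words model).

    Hypothetical helper: lowercases the text, splits it into alphanumeric
    tokens, and drops any token found in `stopwords`.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


# Made-up sample documents and stopword list, for illustration only.
docs = [
    "The web is a source of information.",
    "Text mining needs a compact representation of the text.",
]
STOPWORDS = frozenset({"the", "is", "a", "of"})

# One index per document: the attributes are the remaining terms.
indexes = [index_document(d, STOPWORDS) for d in docs]

# A shared vocabulary lets every document be written as a fixed-length vector.
vocabulary = sorted(set().union(*indexes))
vectors = [[idx[term] for term in vocabulary] for idx in indexes]
```

Even in this tiny example the number of attributes grows with every new distinct term, and word order is discarded, which hints at why the choice of indexing scheme matters for preserving semantics.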