ISBN-13: 9783846548943 / English / Paperback / 2011 / 256 pp.
Measuring similarity is an important first step in numerous natural language processing tasks. This book investigates two aspects of similarity: attributional similarity and relational similarity. It presents numerous methods for measuring both types of similarity using the text data available on the World Wide Web. In addition to describing theoretical work on similarity measures, the book demonstrates two application tasks: person name disambiguation and name alias detection. The book can serve as introductory reading for students interested in conducting research in Web data mining or Web-based statistical natural language processing. It should also be useful as a comprehensive reference for practitioners in Web data mining, because it summarizes numerous similarity measures proposed for measuring attributional and relational similarity between word pairs on the Web.