ISBN-13: 9786200244659 / English
Digital advancements have led modern applications to generate large volumes of data. Clustering algorithms are used to categorize the data in these large datasets accurately. This book presents a literature review of various traditional clustering algorithms and compares them from a theoretical perspective. It also surveys applications of clustering techniques to (I) web log data, (II) image data, and (III) biological data. A major drawback of traditional clustering algorithms is that they become computationally expensive when the input dataset is very large. To address this problem, the book also provides a comprehensive study of recent MapReduce-based clustering algorithms, which extend their traditional counterparts using the MapReduce programming paradigm. The book is mainly suited to researchers interested in pattern discovery from large datasets using MapReduce clustering, and it will help them carry out data clustering in a distributed environment. More importantly, the issues and open areas discussed in this book will help researchers identify future research directions.