ISBN-13: 9783659788475 / English / Paperback / 2015 / 84 pp.
Hadoop, the Apache Software Foundation's open-source, Java-based implementation of the Map/Reduce framework, is a distributed computing framework designed for data-intensive distributed applications. It provides the tools for processing vast amounts of data using Map/Reduce and additionally implements a distributed file system similar to Google's file system. It can be used to process vast amounts of data in parallel on large clusters in a reliable and fault-tolerant fashion. Many programmers have long used Java for processing data. In this book we compare and analyze the performance of Hadoop against Java, Hadoop against Hadoop Optimize, and Hadoop Optimize against Java in terms of several performance criteria, such as processing (CPU utilization), storage, and efficiency when they process data. Our experimental results show an improvement in execution time when using the optimized Map/Reduce algorithm. Comparing Hadoop and Java, Hadoop is better when we have a multi-node cluster and the data size is large. However, when we have a single node and a small data size, even Java can perform better.