ISBN-13: 9783330001404 / English / Paperback / 2016 / 328 pp.
Modern industrial, government, and academic organizations are collecting massive amounts of data at an unprecedented scale and pace. The ability to perform timely and cost-effective analytical processing of such large datasets in order to extract deep insights is now a key ingredient for success. Existing database systems are adapting to the new status quo, while large-scale dataflow systems like MapReduce are becoming popular for executing analytical workloads on Big Data. To automatically ensure good and robust performance on such systems, a novel dynamic optimization approach has been developed that works across different tuning scenarios and systems. The solution is based on (i) collecting monitoring information in order to learn the run-time behavior of workloads, (ii) deploying appropriate models to predict the impact of hypothetical tuning choices on workload behavior, and (iii) using efficient search strategies to find tuning choices that give good workload performance. Its dynamic nature enables this solution to overcome the new challenges posed by Big Data, and also makes it applicable to both MapReduce and database systems.