List of Figures ix
List of Tables xi
Preface xiii
Acknowledgments xv
Acronyms xvii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Assumptions 3
1.3 For Whom is This Book? 4
1.4 Book Structure 4
Chapter 2 Evolution of IT Architectures and Paradigms 7
2.1 Evolution of IT Architectures 7
2.1.1 Monolith 7
2.1.2 Service Oriented Architecture 9
2.1.3 Microservices 12
2.2 Actors and Agents 15
2.2.1 Actors 15
2.2.2 Agents 17
2.3 From ACID to BASE, CAP, and NoSQL - The Database (R)evolution 22
2.4 The Cloud 24
2.5 From Distributed Sensor Networks to the Internet of Things and Cyber-Physical Systems 27
2.6 The Rise of Big Data 28
Chapter 3 Sources of Data 31
3.1 The Internet 32
3.1.1 The Semantic Web 32
3.1.2 Linked Data 35
3.1.3 Knowledge Graphs 36
3.1.4 Social Media 38
3.1.5 Web Mining 38
3.2 Scientific Data 40
3.2.1 Biomedical Data 40
3.2.2 Physics and Astrophysics Data 41
3.2.3 Environmental Sciences 44
3.3 Industrial Data 45
3.3.1 Smart Factories 45
3.3.2 SmartGrid 47
3.3.3 Aviation 47
3.4 Internet of Things 48
Chapter 4 Big Data Tasks 51
4.1 Recommender Systems 51
4.2 Search 52
4.3 Ad-tech and RTB Algorithms 55
4.4 Cross-Device Graph Generation 57
4.5 Forecasting and Prediction Systems 58
4.6 Social Media Big Data 59
4.7 Anomaly and Fraud Detection 61
4.8 New Drug Discovery 63
4.9 Smart Grid Control and Monitoring 64
4.10 IoT and Big Data Applications 65
Chapter 5 Cloud Computing 67
5.1 Cloud Enabled Architectures 67
5.1.1 Cloud Management Platforms 67
5.1.2 Efficient Cloud Computing 73
5.1.3 Distributed Storage Systems 75
5.2 Agents and the Cloud 82
5.2.1 Multi-agent Versus Cloud Paradigms 83
5.2.2 Agents in the Cloud 83
Chapter 6 Big Data Architectures 87
6.1 Big Data Computation Models 87
6.1.1 MapReduce 87
6.1.2 Directed Acyclic Graph Models 89
6.1.3 All-Pairs 92
6.1.4 Very Large Bitmap Operations 93
6.1.5 Message Passing Interface 94
6.1.6 Graphical Processing Unit Computing 95
6.2 Publish-Subscribe Systems 97
6.3 Stream Processing 99
6.3.1 Information Flow Processing Concepts 99
6.3.2 Stream Processing Systems 101
6.4 Higher Level Big Data Architectures 110
6.4.1 Spark 110
6.4.2 Lambda 112
6.4.3 Multi-Agent View of the Lambda Architecture 113
6.4.4 Questioning the Lambda 115
6.5 Industry and Other Approaches 116
6.6 Actor and Agent-Based Big Data Architectures 118
Chapter 7 Big Data Analytics, Mining, and Machine Learning 121
7.1 To SQL or Not to SQL 122
7.1.1 SQL Hadoop Interfaces 123
7.1.2 From Shark to SparkSQL 125
7.2 Big Data Mining and Machine Learning 128
7.2.1 Graph Mining 133
7.2.2 Agent Based Machine Learning and Data Mining 134
Chapter 8 Physically Distributed Systems - Mobile Cloud, Internet of Things, Edge Computing 137
8.1 Mobile Cloud 138
8.2 Edge and Fog Computing 145
8.2.1 Business Case: Mobile Context Aware Recommender System 147
8.3 Internet of Things 148
8.3.1 IoT Fundamentals 148
8.3.2 IoT and the Cloud 151
8.3.3 MAS in IoT 156
Chapter 9 Summary 159
Bibliography 161
Index 179
DOMINIK RYŻKO is an Assistant Professor at the Institute of Computer Science at Warsaw University of Technology. His research interests include Big Data and Distributed Artificial Intelligence. He is widely published, serves on program committees at international conferences, and is Vice President of artificial intelligence and analytics at Adform, a global ad-tech platform provider. He also spent three years at Allegro Group as the Chief Data Scientist, where he oversaw Data Science activities, design and methodology of experiments, and model building.