


Acknowledgements
Introduction
1 From Big Data to Deep Learning
1.1 Introduction
1.2 Examples of the Use of Big Data and Deep Learning
1.3 Big Data and Deep Learning for Companies and Organizations
1.3.1 Big Data in Finance
1.3.1.1 Google Trends
1.3.1.2 Google Trends and Stock Prices
1.3.1.3 The quantmod Package for Financial Analysis
1.3.1.4 Google Trends in R
1.3.1.5 Matching Data from quantmod and Google Trends
1.3.2 Big Data and Deep Learning in Insurance
1.3.3 Big Data and Deep Learning in Industry
1.3.4 Big Data and Deep Learning in Scientific Research and Education
1.3.4.1 Big Data in Physics and Astrophysics
1.3.4.2 Big Data in Climatology and Earth Sciences
1.3.4.3 Big Data in Education
1.4 Big Data and Deep Learning for Individuals
1.4.1 Big Data and Deep Learning in Healthcare
1.4.1.1 Connected Health and Telemedicine
1.4.1.2 Geolocation and Health
1.4.1.3 The Google Flu Trends
1.4.1.4 Research in Health and Medicine
1.4.2 Big Data and Deep Learning for Drivers
1.4.3 Big Data and Deep Learning for Citizens
1.4.4 Big Data and Deep Learning in the Police
1.5 Risks in Data Processing
1.5.1 Insufficient Quantity of Training Data
1.5.2 Poor Data Quality
1.5.3 Non-Representative Samples
1.5.4 Missing Values in the Data
1.5.5 Spurious Correlations
1.5.6 Overfitting
1.5.7 Lack of Explainability of Models
1.6 Protection of Personal Data
1.6.1 The Need for Data Protection
1.6.2 Data Anonymization
1.6.3 The General Data Protection Regulation
1.7 Open Data
Notes
2 Processing of Large Volumes of Data
2.1 Issues
2.2 The Search for a Parsimonious Model
2.3 Algorithmic Complexity
2.4 Parallel Computing
2.5 Distributed Computing
2.5.1 MapReduce
2.5.2 Hadoop
2.5.3 Computing Tools for Distributed Computing
2.5.4 Column-Oriented Databases
2.5.5 Distributed Architecture and "Analytics"
2.5.6 Spark
2.6 Computer Resources
2.6.1 Minimum Resources
2.6.2 Graphics Processing Units (GPU) and Tensor Processing Units (TPU)
2.6.3 Solutions in the Cloud
2.7 R and Python Software
2.8 Quantum Computing
Notes
3 Reminders of Machine Learning
3.1 General
3.2 The Optimization Algorithms
3.3 Complexity Reduction and Penalized Regression
3.4 Ensemble Methods
3.4.1 Bagging
3.4.2 Random Forests
3.4.3 Extra-Trees
3.4.4 Boosting
3.4.5 Gradient Boosting Methods
3.4.6 Synthesis of the Ensemble Methods
3.5 Support Vector Machines
3.6 Recommendation Systems
Notes
4 Natural Language Processing
4.1 From Lexical Statistics to Natural Language Processing
4.2 Uses of Text Mining and Natural Language Processing
4.3 The Operations of Textual Analysis
4.3.1 Textual Data Collection
4.3.2 Identification of the Language
4.3.3 Tokenization
4.3.4 Part-of-Speech Tagging
4.3.5 Named Entity Recognition
4.3.6 Coreference Resolution
4.3.7 Lemmatization
4.3.8 Stemming
4.3.9 Simplifications
4.3.10 Removal of Stop Words
4.4 Vector Representation and Word Embedding
4.4.1 Vector Representation
4.4.2 Analysis on the Document-Term Matrix
4.4.3 TF-IDF Weighting
4.4.4 Latent Semantic Analysis
4.4.5 Latent Dirichlet Allocation
4.4.6 Word Frequency Analysis
4.4.7 Word2Vec Embedding
4.4.8 GloVe Embedding
4.4.9 FastText Embedding
4.5 Sentiment Analysis
Notes
5 Social Network Analysis
5.1 Social Networks
5.2 Characteristics of Graphs
5.3 Characterization of Social Networks
5.4 Measures of Influence in a Graph
5.5 Graphs with R
5.6 Community Detection
5.6.1 The Modularity of a Graph
5.6.2 Community Detection by Divisive Hierarchical Clustering
5.6.3 Community Detection by Agglomerative Hierarchical Clustering
5.6.4 Other Methods
5.6.5 Community Detection with R
5.7 Research and Analysis on Social Networks
5.8 The Business Model of Social Networks
5.9 Digital Advertising
5.10 Social Network Analysis with R
5.10.1 Collecting Tweets
5.10.2 Formatting the Corpus
5.10.3 Stemming and Lemmatization
5.10.4 Example
5.10.5 Clustering of Terms and Documents
5.10.6 Opinion Scoring
5.10.7 Graph of Terms with Their Connotation
Notes
6 Handwriting Recognition
6.1 Data
6.2 Issues
6.3 Data Processing
6.4 Linear and Quadratic Discriminant Analysis
6.5 Multinomial Logistic Regression
6.6 Random Forests
6.7 Extra-Trees
6.8 Gradient Boosting
6.9 Support Vector Machines
6.10 Single Hidden Layer Perceptron
6.11 H2O Neural Network
6.12 Synthesis of "Classical" Methods
Notes
7 Deep Learning
7.1 The Principles of Deep Learning
7.2 Overview of Deep Neural Networks
7.3 Recall on Neural Networks and Their Training
7.4 Difficulties of Gradient Backpropagation
7.5 The Structure of a Convolutional Neural Network
7.6 The Convolution Mechanism
7.7 The Convolution Parameters
7.8 Batch Normalization
7.9 Pooling
7.10 Dilated Convolution
7.11 Dropout and DropConnect
7.12 The Architecture of a Convolutional Neural Network
7.13 Principles of Deep Network Learning for Computer Vision
7.14 Adaptive Learning Algorithms
7.15 Progress in Image Recognition
7.16 Recurrent Neural Networks
7.17 Capsule Networks
7.18 Autoencoders
7.19 Generative Models
7.19.1 Generative Adversarial Networks
7.19.2 Variational Autoencoders
7.20 Other Applications of Deep Learning
7.20.1 Object Detection
7.20.2 Autonomous Vehicles
7.20.3 Analysis of Brain Activity
7.20.4 Analysis of the Style of a Pictorial Work
7.20.5 Go and Chess Games
7.20.6 Other Games
Notes
8 Deep Learning for Computer Vision
8.1 Deep Learning Libraries
8.2 MXNet
8.2.1 General Information about MXNet
8.2.2 Creating a Convolutional Network with MXNet
8.2.3 Model Management with MXNet
8.2.4 CIFAR-10 Image Recognition with MXNet
8.3 Keras and TensorFlow
8.3.1 General Information about Keras
8.3.2 Application of Keras to the MNIST Database
8.3.3 Application of Pre-Trained Models
8.3.4 Explain the Prediction of a Computer Vision Model
8.3.5 Application of Keras to CIFAR-10 Images
8.3.6 Classifying Cats and Dogs
8.4 Configuring a Machine's GPU for Deep Learning
8.4.1 Checking the Compatibility of the Graphics Card
8.4.2 NVIDIA Driver Installation
8.4.3 Installation of Microsoft Visual Studio
8.4.4 NVIDIA CUDA Toolkit Installation
8.4.5 Installation of cuDNN
8.5 Computing in the Cloud
8.6 PyTorch
8.6.1 The Python PyTorch Package
8.6.2 The R torch Package
Notes
9 Deep Learning for Natural Language Processing
9.1 Neural Network Methods for Text Analysis
9.2 Text Generation Using a Recurrent Neural Network LSTM
9.3 Text Classification Using a LSTM or GRU Recurrent Neural Network
9.4 Text Classification Using a H2O Model
9.5 Application of Convolutional Neural Networks
9.6 Spam Detection Using a Recurrent Neural Network LSTM
9.7 Transformer Models, BERT, and Its Successors
Notes
10 Artificial Intelligence
10.1 The Beginnings of Artificial Intelligence
10.2 Human Intelligence and Artificial Intelligence
10.3 The Different Forms of Artificial Intelligence
10.4 Ethical and Societal Issues of Artificial Intelligence
10.5 Fears and Hopes of Artificial Intelligence
10.6 Some Dates of Artificial Intelligence
Notes
Conclusion
Note
Annotated Bibliography
On Big Data and High Dimensional Statistics
On Deep Learning
On Artificial Intelligence
On the Use of R and Python in Data Science and on Big Data
Index
Stéphane Tufféry, PhD, is Associate Professor at the University of Rennes 1, France, where he teaches courses in data mining, deep learning, and big data methods. He also lectures at the Institute of Actuaries in Paris and has published several books on data mining, deep learning, and big data in English and French.





