"This is a very useful book in the domain of deep learning and the author has done a great job of bringing all the paradigms and libraries together to illustrate how they work for real big data. I am glad to have this book on my shelf." (Anna Bartkowiak, ISCB News, Vol. 68, December, 2019)
Preface
1 Introduction to R
2 Linear Algebra
2.1 Linear Algebra with R
2.1.1 Introduction
2.1.2 Matrix Notation
3 Introduction to Machine Learning and Deep Learning
3.1 Training, Validation and Test Data
3.2 Bias and Variance
3.3 Underfitting and Overfitting
3.3.1 Bayes Error
3.4 Maximum Likelihood Estimation
3.5 Quantifying Loss
3.5.1 The Cross-Entropy Loss
3.5.2 Negative Log-Likelihood
3.5.3 Entropy
3.5.4 Cross-Entropy
3.5.5 Kullback-Leibler Divergence
3.5.6 Summarizing the Measurement of Loss
4 Introduction to Neural Networks
4.1 Types of Neural Network Architectures
4.1.1 Feedforward Neural Networks (FFNNs)
4.1.2 Convolutional Neural Networks (Convnets)
4.1.3 Recurrent Neural Networks (RNNs)
4.2 Forward Propagation
4.2.1 Notations
4.2.2 Input Matrix
4.2.3 Bias matrix
4.2.4 Weight matrix for Layer-1
4.2.5 Activation function at Layer-1
4.2.6 Weights matrix of Layer-2
4.2.7 Activation function at Layer-2
4.3 Activation Functions
4.3.1 Sigmoid
4.3.2 Hyperbolic tangent (tanh)
4.3.3 Rectified Linear Unit (ReLU)
4.3.4 leakyReLU
4.3.5 Softmax
4.4 Derivatives of Activation Functions
4.4.1 Derivative of the Sigmoid
4.4.2 Derivative of the tanh
4.4.3 Derivative of the ReLU
CONTENTS
4.4.4 Derivative of the lReLU
4.4.5 Derivative of the Softmax
4.5 Loss Functions
4.6 Derivative of the Cost Function
4.6.1 Derivative of Cross Entropy Loss with Sigmoid
4.6.2 Derivative of Cross Entropy Loss with Softmax
4.7 Back Propagation
4.7.1 Backpropagate to the output layer
4.7.2 Backpropagate to the second hidden layer
4.7.3 Backpropagate to the first hidden layer
4.7.4 Vectorization of backprop equations
4.8 Writing a Simple Neural Network Application
4.8.1 Image Classification using Sigmoid Activation Neural Network
4.8.2 Importance of Normalization
5 Deep Neural Networks
5.1 Writing a Deep Neural Network (DNN) algorithm
5.2 Implementing a DNN using Keras
6 Regularization and Hyperparameter Tuning
6.1 Initialization
6.1.1 Zero initialization
6.1.2 Random initialization
6.1.3 Xavier initialization
6.1.4 He initialization
6.2 Gradient Descent
6.2.1 Gradient Descent or Batch Gradient Descent
6.2.2 Stochastic Gradient Descent
6.2.3 Mini Batch Gradient Descent
6.3 Dealing with NaNs
6.3.1 Hyperparameters and Weight Initialization
6.3.2 Normalization
6.3.3 Using different Activation functions
6.3.4 Use of NanGuardMode, DebugMode, or MonitorMode
6.4.5 RMSProp (Root Mean Square Propagation) with Momentum Optimization Update
6.4.6 Adam Optimization (Adaptive Moment Estimation) with Momentum Update
6.4.7 Vanishing Gradient and Numerical stability
6.5 Gradient Checking
6.6 Second order methods
6.7 Per-parameter adaptive learning rate methods
6.8 Annealing the learning rate
6.9 Regularization
6.9.1 Dropout Regularization
6.9.2 ℓ2 Regularization
6.9.3 Combining dropout and ℓ2 regularization?
6.10 Hyperparameter optimization
6.11 Evaluation
6.12 Using Keras
6.12.1 Adjust epochs
6.12.2 Add batch normalization
6.12.3 Add dropout
6.12.4 Add weight regularization
6.12.5 Adjust learning rate
6.12.6 Prediction
7 Convolutional Neural Networks
8 Sequence Models
Bibliography
Abhijit Ghatak is a Data Scientist and holds an M.E. in Engineering and an M.S. in Data Science from Stevens Institute of Technology, USA. He began his career as a submarine engineer officer in the Indian Navy, where he worked on various data-intensive projects involving submarine operations and construction. He has since worked in academia, at technology companies, and as a research scientist in the area of Internet of Things (IoT) and pattern recognition for the European Union (EU). He has published several papers in the areas of engineering and machine learning and is currently a consultant in machine learning and deep learning. His research interests include IoT, stream analytics and the design of deep learning systems.
Deep Learning with R introduces deep learning and neural networks using the R programming language. The book builds an understanding of the underlying theoretical and mathematical constructs and enables the reader to create applications in computer vision, natural language processing and transfer learning.
The book starts with an introduction to machine learning and moves on to describe the basic architecture, different activation functions, forward propagation, cross-entropy loss and backward propagation of a simple neural network. It then develops code segments for constructing deep neural networks. It discusses in detail the initialization of network parameters, optimization techniques, and some of the common issues surrounding neural networks, such as dealing with NaNs and the vanishing/exploding gradient problem. Advanced variants of multilayer perceptrons, namely convolutional neural networks and sequence models, are explained and then applied to different use cases. The book makes extensive use of the Keras and TensorFlow frameworks.