Chapter Goal: To understand what R is, why to use R, and the role of statistics in data mining and data science
No of pages: 15
Sub - Topics:
1. What is R?
2. High Level and Low Level Language
3. What is Statistics?
4. What is Data Science?
5. What is Data Mining?
6. What is Text Mining?
7. Three Types of Analytics
8. Big Data
9. Why R?
10. Conclusion
Chapter 2: Getting Started
Chapter Goal: To set up the computer for R Programming
No of pages: 15
Sub - Topics:
1. What are R and RStudio?
2. Installation of R and RStudio
3. Integrated Development Environment
4. RStudio – The IDE for R
5. Conclusion
Chapter 3: Basic Syntax
Chapter Goal: To learn R programming basics
No of pages: 30
Sub - Topics:
1. Writing in R Console
2. Using Code Editor
3. Variables and Data Types
4. Vectors
5. Lists
6. Data Frame
7. Logical Statements
8. Loops
9. Functions
10. Conclusion
Chapter 4: Descriptive Statistics
Chapter Goal: To learn Descriptive Statistics in R
No of pages: 20
Sub - Topics:
1. Reading Data Files
2. Mean, Median, Min, Max, …
3. Percentile, Standard Deviations
4. The summary() and str() functions
5. Distributions
6. Conclusion
Chapter 5: Data Visualizations
Chapter Goal: To learn Data Visualizations in R
No of pages: 20
Sub - Topics:
1. What is Data Visualization?
2. Bar Chart, Histogram
3. Line Chart, Pie Chart
4. Scatterplot and Box Plot
5. Scatterplot Matrix
6. Decision Trees
7. Conclusion
Chapter 6: Inferential Statistics and Regressions
Chapter Goal: To learn inferential statistics and regressions in R
No of pages: 20
Sub - Topics:
1. Correlations
2. T Test, Chi Square, ANOVA
3. Non-Parametric Tests
4. Linear Regressions
5. Multiple Linear Regressions
Eric Goh is a data scientist, software engineer, adjunct faculty member, and entrepreneur with years of experience in multiple industries. His varied career includes data science, data and text mining, natural language processing, machine learning, intelligent system development, and engineering product design. Eric Goh has led teams on various industrial projects, including the advanced product code classification system project, which automates Singapore Customs’ trade facilitation process, and Nanyang Technological University's data science projects, where he developed his own DSTK data science software. He has years of experience in C#, Java, C/C++, SPSS Statistics and Modeler, SAS Enterprise Miner, R, Python, Excel, and Excel VBA. He won the Tan Kah Kee Young Inventors' Merit Award and had a shortlisted entry in the TelR Data Mining Challenge. Eric Goh founded the SVBook website to offer affordable books, courses, and software in data science and programming.
He holds a Master of Technology degree from the National University of Singapore, an Executive MBA degree from U21Global (currently GlobalNxt) and IGNOU, a Graduate Diploma in Mechatronics from A*STAR SIMTech (a national research institute located in Nanyang Technological University), and a Coursera Specialization Certificate in Business Statistics and Analysis from Rice University. He earned a Bachelor of Science degree in Computing from the University of Portsmouth after National Service. He is also an AIIM Certified Business Process Management Master (BPMM), a GSTF Certified Big Data Science Analyst (CBDSA), and an IES Certified Lecturer.
Gain the R programming fundamentals needed to apply statistics to data exploration and analysis in data science and data mining. This book covers topics ranging from R syntax basics, descriptive statistics, and data visualizations to inferential statistics and regressions. After learning R’s syntax, you will work through data visualizations such as histograms and boxplots, descriptive statistics, and inferential statistics such as t-tests, chi-square tests, ANOVA, non-parametric tests, and linear regressions.
Learn R for Applied Statistics is a timely skills-migration book that equips you with R programming fundamentals and introduces you to applied statistics for data exploration.
You will:
Discover R, statistics, data science, data mining, and big data
Master the fundamentals of R programming, including variables and arithmetic, vectors, lists, data frames, conditional statements, loops, and functions
Work with descriptive statistics
Create data visualizations, including bar charts, line charts, scatter plots, boxplots, and histograms
Use inferential statistics including t-tests, chi-square tests, ANOVA, non-parametric tests, linear regressions, and multiple linear regressions