ISBN-13: 9783319877884 / English / Paperback / 2019 / 367 pp.
1 Notation and conventions 9
1.0.1 Background Information........................................................................ 10
1.1 Acknowledgements ............................................. 11
I Describing Datasets
2 First Tools for Looking at Data 13
2.1 Datasets ..................................................... 13
2.2 What’s Happening? - Plotting Data ............................ 15
2.2.1 Bar Charts ................................................. 16
2.2.2 Histograms................................................................................................... 16
2.2.3 How to Make Histograms...................................................................... 17
2.2.4 Conditional Histograms.......................................................................... 19
2.3 Summarizing 1D Data .......................................... 19
2.3.1 The Mean ................................................... 20
2.3.2 Standard Deviation ......................................... 22
2.3.3 Computing Mean and Standard Deviation Online ............... 26
2.3.4 Variance......................................................................................................... 26
2.3.5 The Median.................................................................................................. 27
2.3.6 Interquartile Range ........................................ 29
2.3.7 Using Summaries Sensibly ................................... 30
2.4 Plots and Summaries............................................................................................. 31
2.4.1 Some Properties of Histograms .............................. 31
2.4.2 Standard Coordinates and Normal Data ....................... 34
2.4.3 Box Plots....................................................................................................... 38
2.5 Whose is bigger? Investigating Australian Pizzas ............. 39
2.6 You should ................................................... 43
2.6.1 remember these definitions:................................................................. 43
2.6.2 remember these terms............................................................................ 43
2.6.3 remember these facts:............................................................................. 43
2.6.4 be able to...................................................................................................... 43
3 Looking at Relationships 47
3.1 Plotting 2D Data ............................................. 47
3.1.1 Categorical Data, Counts and Charts
3.1.2 Series ..................................................... 51
3.1.3 Scatter Plots for Spatial Data ............................. 53
3.1.4 Exposing Relationships with Scatter Plots..................................... 54
3.2 Correlation.................................................................................................................. 57
3.2.1 The Correlation Coefficient................................................................... 60
3.2.2 Using Correlation to Predict ............................... 64
3.2.3 Confusion caused by correlation ............................ 68
3.4 You should.................................................................................................................. 72
3.4.1 remember these definitions:................................................................. 72
3.4.2 remember these terms............................................................................ 72
3.4.3 remember these facts: ...................................... 72
3.4.4 use these procedures: ...................................... 72
3.4.5 be able to: ................................................ 72
II Probability 78
4 Basic ideas in probability 79
4.1 Experiments, Outcomes and Probability....................................................... 79
4.1.1 Outcomes and Probability ................................... 79
4.2 Events ....................................................... 81
4.2.1 Computing Event Probabilities by Counting Outcomes ......... 83
4.2.2 The Probability of Events...................................................................... 87
4.2.3 Computing Probabilities by Reasoning about Sets...................... 89
4.3 Independence............................................................................................................ 92
4.3.1 Example: Airline Overbooking............................................................ 96
4.4 Conditional Probability ...................................... 99
4.4.1 Evaluating Conditional Probabilities ....................... 100
4.4.2 Detecting Rare Events is Hard......................................................... 104
4.4.3 Conditional Probability and Various Forms of Independence .. 106
4.4.4 The Prosecutor’s Fallacy ................................... 108
4.4.5 Example: The Monty Hall Problem ............................ 110
4.5 Extra Worked Examples ........................................ 112
4.5.1 Outcomes and Probability ................................... 112
4.5.2 Events ..................................................... 114
4.5.3 Independence........................................................................................... 115
4.5.4 Conditional Probability......................................................................... 117
4.6 You should............................................................................................................... 121
4.6.1 remember these definitions:.............................................................. 121
4.6.2 remember these terms ....................................... 121
4.6.3 remember and use these facts ............................... 121
4.6.4 remember these points: ..................................... 121
4.6.5 be able to.................................................................................................... 121
5 Random Variables and Expectations 128
5.1 Random Variables ............................................. 128
5.1.1 Joint and Conditional Probability for Random Variables ..... 131
5.1.2 Just a Little Continuous Probability ....................... 134
5.2 Expectations and Expected Values ............................. 137
5.2.1 Expected Values...................................................................................... 138
5.2.2 Mean, Variance and Covariance....................................................... 141
5.2.3 Expectations and Statistics ................................ 145
5.3 The Weak Law of Large Numbers ................................ 145
5.3.1 IID Samples ................................................ 145
5.3.2 Two Inequalities ........................................... 146
5.3.3 Proving the Inequalities ................................... 147
5.3.4 The Weak Law of Large Numbers.................................................. 149
5.4 Using the Weak Law of Large Numbers 151
5.4.1 Should you accept a bet? ................................... 151
5.4.2 Odds, Expectations and Bookmaking — a Cultural Diversion ... 152
5.4.3 Ending a Game Early ........................................ 154
5.4.4 Making a Decision with Decision Trees and Expectations ..... 154
5.4.5 Utility .................................................... 156
5.5 You should ................................................... 159
5.5.1 remember these definitions: ................................ 159
5.5.2 remember these terms ....................................... 159
5.5.3 use and remember these facts ............................... 159
5.5.4 be able to ................................................. 160
6 Useful Probability Distributions 167
6.1 Discrete Distributions ....................................... 167
6.1.1 The Discrete Uniform Distribution................................................. 167
6.1.2 Bernoulli Random Variables ................................. 168
6.1.3 The Geometric Distribution ................................. 168
6.1.4 The Binomial Probability Distribution ...................... 169
6.1.5 Multinomial probabilities .................................. 171
6.1.6 The Poisson Distribution..................................................................... 172
6.2 Continuous Distributions ..................................... 174
6.2.1 The Continuous Uniform Distribution ........................ 174
6.2.2 The Beta Distribution........................................................................... 174
6.2.3 The Gamma Distribution..................................................................... 176
6.2.4 The Exponential Distribution............................................................ 176
6.3 The Normal Distribution ...................................... 178
6.3.1 The Standard Normal Distribution ........................... 178
6.3.2 The Normal Distribution .................................... 179
6.3.3 Properties of The Normal Distribution......................................... 180
6.4 Approximating Binomials with Large N 182
6.4.1 Large N .................................................... 183
6.4.2 Getting Normal ............................................. 185
6.4.3 Using a Normal Approximation to the Binomial Distribution 187
6.5 You should ................................................... 188
6.5.1 remember these definitions: ................................ 188
6.5.2 remember these terms: ...................................... 188
6.5.3 remember these facts: ...................................... 188
6.5.4 remember these points: ..................................... 188
III Inference
7 Samples and Populations 197
7.1 The Sample Mean................................................................................................. 197
7.1.1 The Sample Mean is an Estimate of the Population Mean . . 197
7.1.2 The Variance of the Sample Mean ............................ 198
7.1.3 When The Urn Model Works ................................... 201
7.1.4 Distributions are Like Populations................................................. 202
7.2 Confidence Intervals............................................................................................ 203
7.2.1 Constructing Confidence Intervals.................................................. 203
7.2.2 Estimating the Variance of the Sample Mean ................. 204
7.2.3 The Probability Distribution of the Sample Mean ............ 206
7.2.4 Confidence Intervals for Population Means .................. 208
7.2.5 Standard Error Estimates from Simulation................................. 212
7.3 You should............................................................................................................... 216
7.3.1 remember these definitions:.............................................................. 216
7.3.2 remember these terms......................................................................... 216
7.3.3 remember these facts:........................................................................... 216
7.3.4 use these procedures ....................................... 216
7.3.5 be able to ................................................. 216
8 The Significance of Evidence 221
8.1 Significance ................................................. 222
8.1.1 Evaluating Significance .................................... 223
8.1.2 P-values ................................................... 225
8.2 Comparing the Mean of Two Populations.................................................. 230
8.2.1 Assuming Known Population Standard Deviations................... 231
8.2.2 Assuming Same, Unknown Population Standard Deviation . 233
8.2.3 Assuming Different, Unknown Population Standard Deviation .. 235
8.3 Other Useful Tests of Significance ........................... 237
8.3.1 F-tests and Standard Deviations ............................ 237
8.3.2 χ2 Tests of Model Fit............................................................................ 239
8.4 Dangerous Behavior............................................................................................. 244
8.5 You should............................................................................................................... 246
8.5.1 remember these definitions: ................................ 246
8.5.2 remember these terms ....................................... 246
8.5.3 remember these facts: ...................................... 246
8.5.4 use these procedures: ...................................... 246
8.5.5 be able to: ................................................ 246
9 Experiments 251
9.1 A Simple Experiment: The Effect of a Treatment ............... 251
9.1.1 Randomized Balanced Experiments............................................... 252
9.1.2 Decomposing Error in Predictions ........................... 253
9.1.3 Estimating the Noise Variance .............................. 253
9.1.4 The ANOVA Table.................................................................................. 255
9.1.5 Unbalanced Experiments.................................................................... 257
9.1.6 Significant Differences.......................................................................... 259
9.2 Two Factor Experiments.................................................................................... 261
9.2.1
9.2.2 Interaction Between Effects................................................................ 265
9.2.3 The Effects of a Treatment................................................................. 266
9.2.4 Setting up an ANOVA Table .................................. 267
9.3 You should ................................................... 272
9.3.1 remember these definitions:.............................................................. 272
9.3.2 remember these terms ....................................... 272
9.3.3 remember these facts: ...................................... 272
9.3.4 use these procedures ....................................... 272
9.3.5 be able to.................................................................................................... 272
9.3.6 Two-Way Experiments.......................................................................... 274
10 Inferring Probability Models from Data 275
10.1 Estimating Model Parameters with Maximum Likelihood ......... 275
10.1.1 The Maximum Likelihood Principle............................................... 277
10.1.2 Binomial, Geometric and Multinomial Distributions................ 278
10.1.3 Poisson and Normal Distributions................................................... 281
10.1.4 Confidence Intervals for Model Parameters ................. 286
10.1.5 Cautions about Maximum Likelihood ......................... 288
10.2 Incorporating Priors with Bayesian Inference ................ 289
10.2.1 Conjugacy ................................................. 292
10.2.2 MAP Inference......................................................................................... 294
10.2.3 Cautions about Bayesian Inference................................................. 296
10.3 Bayesian Inference for Normal Distributions............................................ 296
10.3.1 Example: Measuring Depth of a Borehole................................... 296
10.3.2 Normal Prior and Normal Likelihood Yield Normal Posterior 297
10.3.3 Filtering ................................................. 300
10.4 You should .................................................. 303
10.4.1 remember these definitions:.............................................................. 303
10.4.2 remember these terms......................................................................... 303
10.4.3 remember these facts:........................................................................... 304
10.4.4 use these procedures............................................................................. 304
10.4.5 be able to ................................................ 304
IV Tools 312
11 Extracting Important Relationships in High Dimensions 313
11.1 Summaries and Simple Plots .................................. 313
11.1.1 The Mean .................................................. 314
11.1.2 Stem Plots and Scatterplot Matrices ....................... 315
11.1.3 Covariance.................................................................................................. 317
11.1.4 The Covariance Matrix......................................................................... 319
11.2 Using Mean and Covariance to Understand High Dimensional Data 321
11.2.1 Mean and Covariance under Affine Transformations .......... 322
11.2.2 Eigenvectors and Diagonalization .......................... 324
11.2.3 Diagonalizing Covariance by Rotating Blobs ................ 326
11.2.4 Approximating Blobs ....................................... 327
11.2.5 Example: Transforming the Height-Weight Blob .............. 329
11.3 Principal Components Analysis ............................... 332
11.3.1 Example: Representing Colors with Principal Components .... 334
11.3.2 Example: Representing Faces with Principal Components ..... 335
11.4 Multi-Dimensional Scaling ................................... 335
11.4.1 Choosing Low D Points using High D Distances .............. 338
11.4.2 Factoring a Dot-Product Matrix ............................ 339
11.4.3 Example: Mapping with Multidimensional Scaling ............ 341
11.5 Example: Understanding Height and Weight
11.6 You should .................................................. 345
11.6.1 remember these definitions: ............................... 345
11.6.2 remember these terms: ..................................... 345
11.6.3 remember these facts: ..................................... 345
11.6.4 use these procedures: ..................................... 345
11.6.5 be able to: ............................................... 345
12 Learning to Classify 349
12.1 Classification: The Big Ideas ............................... 349
12.1.1 The Error Rate ............................................ 350
12.1.2 Overfitting ............................................... 350
12.1.3 Cross-Validation .......................................... 351
12.1.4 Is the Classifier Working Well? ........................... 351
12.2 Classifying with Nearest Neighbors .......................... 353
12.3 Classifying with Naive Bayes ................................ 355
12.3.1 Missing Data .............................................. 357
12.4 The Support Vector Machine .................................. 358
12.4.1 Choosing a Classifier with the Hinge Loss ................. 359
12.4.2 Finding a Minimum: General Points ......................... 360
12.4.3 Finding a Minimum: Stochastic Gradient Descent ............ 361
12.4.4 Example: Training an SVM with Stochastic Gradient Descent . 363
12.4.5 Multi-Class Classification with SVMs ...................... 366
12.5 Classifying with Random Forests ............................. 367
12.5.1 Building a Decision Tree..................................................................... 367
12.5.2 Choosing a Split with Information Gain........................................ 370
12.5.3 Forests......................................................................................................... 373
12.5.4 Building and Evaluating a Decision Forest ................. 374
12.5.5 Classifying Data Items with a Decision Forest ............. 375
12.6 You should .................................................. 378
12.6.1 remember these definitions: ............................... 378
12.6.2 remember these terms ...................................... 378
12.6.3 remember these facts:........................................................................... 379
12.6.4 use these procedures............................................................................. 379
12.6.5 be able to.................................................................................................... 379
13 Clustering: Models of High Dimensional Data 385
13.1 The Curse of Dimension
13.1.2 Minor Banes of Dimension .................................. 386
13.2 The Multivariate Normal Distribution ........................ 387
13.2.1 Affine Transformations and Gaussians ...................... 387
13.2.2 Plotting a 2D Gaussian: Covariance Ellipses ............... 388
13.3 Agglomerative and Divisive Clustering ....................... 389
13.3.1 Clustering and Distance....................................................................... 391
13.4 The K-Means Algorithm and Variants .......................... 392
13.4.1 How to choose K ........................................... 395
13.4.2 Soft Assignment....................................................................................... 397
13.4.3 General Comments on K-Means....................................................... 400
13.4.4 K-Medoids ................................................. 400
13.5 Application Example: Clustering Documents........................................... 401
13.5.1 A Topic Model ............................................. 402
13.6 Describing Repetition with Vector Quantization .............. 403
13.6.1 Vector Quantization ....................................... 404
13.6.2 Example: Groceries in Portugal....................................................... 406
13.6.3 Efficient Clustering and Hierarchical K Means.......................... 409
13.6.4 Example: Activity from Accelerometer Data............................... 409
13.7 You should............................................................................................................... 413
13.7.1 remember these definitions:.............................................................. 413
13.7.2 remember these terms ...................................... 413
13.7.3 remember these facts: ..................................... 413
13.7.4 use these procedures ...................................... 413
14 Regression 417
14.1 Overview .................................................... 417
14.1.1 Regression to Make Predictions ............................ 417
14.1.2 Regression to Spot Trends ................................. 419
14.2 Linear Regression and Least Squares ......................... 421
14.2.1 Linear Regression ......................................... 421
14.2.2 Choosing β ................................................ 422
14.2.3 Solving the Least Squares Problem ......................... 423
14.2.4 Residuals ................................................. 424
14.2.5 R-squared ................................................. 424
14.3 Producing Good Linear Regressions ........................... 427
14.3.1 Transforming Variables .................................... 428
14.3.2 Problem Data Points have Significant Impact ............... 431
14.3.3 Functions of One Explanatory Variable ..................... 433
14.3.4 Regularizing Linear Regressions ........................... 435
14.4 Exploiting Your Neighbors
14.4.1 Using your Neighbors to Predict More than a Number ........ 441
14.4.2 Example: Filling Large Holes with Whole Images ............ 441
14.5 You should .................................................. 444
14.5.1 remember these definitions: ............................... 444
14.5.2 remember these terms: ..................................... 444
14.5.3 remember these facts: ..................................... 444
14.5.4 remember these procedures: ................................ 444
15 Markov Chains and Hidden Markov Models 454
15.1 Markov Chains ............................................... 454
15.1.1 Transition Probability Matrices........................................................ 457
15.1.2 Stationary Distributions .................................. 459
15.1.3 Example: Markov Chain Models of Text ...................... 462
15.2 Estimating Properties of Markov Chains.................................................... 465
15.2.1 Simulation ................................................ 465
15.2.2 Simulation Results as Random Variables .................... 467
15.2.3 Simulating Markov Chains.................................................................. 469
15.3 Example: Ranking the Web by Simulating a Markov Chain................ 472
15.4 Hidden Markov Models and Dynamic Programming ................ 473
15.4.1 Hidden Markov Models ...................................... 474
15.4.2 Picturing Inference with a Trellis ........................ 474
15.4.3 Dynamic Programming for HMM’s: Formalities ................ 478
15.4.4 Example: Simple Communication Errors ...................... 478
15.5 You should .................................................. 481
15.5.1 remember these definitions:.............................................................. 481
15.5.2 remember these terms......................................................................... 481
15.5.3 remember these facts:........................................................................... 481
15.5.4 be able to.................................................................................................... 481
V Some Mathematical Background 484
16 Resources 485
16.1 Useful Material about Matrices .............................. 485
16.1.1 The Singular Value Decomposition .......................... 486
16.1.2 Approximating A Symmetric Matrix............................................... 487
16.2 Some Special Functions..................................................................................... 489
16.3 Finding Nearest Neighbors ................................... 490
16.4 Entropy and Information Gain ................................ 493
David Alexander Forsyth is Fulton Watson Copp Chair in Computer Science at the University of Illinois at Urbana-Champaign, where he is a leading researcher in computer vision.
A Fellow of the ACM (2014) and IEEE (2009), Forsyth has also been recognized with the IEEE Computer Society’s Technical Achievement Award (2005), the Marr Prize, and a prize for best paper in cognitive computer vision (ECCV 2002). Many of his former students are famous in their own right as academics or industry leaders.
He is the co-author, with Jean Ponce, of Computer Vision: A Modern Approach (2002; 2011), a leading textbook on the topic that has been published in four languages. Among a variety of odd hobbies, he is