Foundations of Statistics with R
Preface
0.1
Further reading
0.2
Installing R and RStudio
1
Data in R
1.1
Arithmetic and Variable Assignment
1.2
Vectors
1.2.1
Creating vectors
1.3
Indexing Vectors
1.4
Data Types
1.4.1
Missing data
1.5
Data Frames
1.6
Reading data from files
1.7
Packages
1.8
Errors and Warnings
1.9
Useful Idioms
Vignette: Data science communities
Exercises
2
Probability
2.1
Probability Basics
2.2
Simulations
2.2.1
Simulation with sample
2.2.2
Using simulation to compute probabilities
2.2.3
Using replicate to repeat experiments
2.3
Conditional Probability and Independence
2.3.1
Independent Events
2.3.2
Simulating Conditional Probability
2.4
Counting Arguments
Exercises
3
Random Variables
3.1
Discrete random variables
3.2
Expected value
3.3
Binomial and geometric random variables
3.3.1
Binomial
3.3.2
Geometric
3.4
Continuous random variables
3.4.1
Expected value of a continuous random variable
3.5
Functions of a random variable
3.6
Variance and standard deviation
3.7
Normal random variables
3.7.1
Computations with normal random variables
3.7.2
Normal approximation to the binomial
3.8
Other special random variables
3.8.1
Poisson and exponential random variables
3.8.2
Uniform random variables
3.8.3
Negative binomial
3.8.4
Hypergeometric
3.9
Independent random variables
3.10
Summary
Vignette: An R Markdown Primer
Exercises
4
Simulation of Random Variables
4.1
Estimating probabilities of rvs via simulation
4.2
Estimating discrete distributions
4.3
Estimating continuous distributions
4.4
Theorems about transformations of random variables
4.5
The Central Limit Theorem
4.6
Sampling Distributions
4.6.1
Linear combination of normal rv’s
4.6.2
Chi-squared
4.6.3
The \(t\) distribution
4.6.4
The F distribution
4.6.5
Summary
4.7
Point Estimators
4.7.1
Properties of Point Estimators
4.7.2
Variance of Unbiased Estimators
Exercises
5
Data Manipulation
5.1
Data frames and tibbles
5.2
dplyr verbs
5.3
dplyr pipelines
5.3.1
Group by and summarize
5.4
The power of dplyr
5.5
Working with character strings
5.6
The structure of data
5.6.1
Tidy data: pivoting
5.6.2
Using join to merge data frames
5.7
The apply family
Vignette: dplyr Murder Mystery
Vignette: Data and gender
Exercises
6
Data Visualization with ggplot
6.1
Scatterplots and Boxplots
6.1.1
Scatterplots
6.1.2
Boxplots
6.2
Visualizing Distributions
6.2.1
Histograms
6.2.2
QQ plots
6.2.3
Plotting pmfs
6.2.4
Plotting functions
6.3
Long Data
Vignette: Choropleth Maps
Exercises
7
Inference on the Mean
7.1
Notation and Theory
7.2
Confidence intervals for the mean
7.3
Hypothesis Tests of the Mean
7.4
One-sided Confidence Intervals and Hypothesis Tests
7.5
Simulations
7.5.1
Symmetric, light tailed
7.5.2
Skew
7.5.3
Heavy tails and outliers
7.5.4
Summary
7.6
Two sample hypothesis tests of \(\mu_1 = \mu_2\)
7.7
Type II errors and power
7.7.1
Effect size
Vignette: A Permutation Test
Exercises
8
Rank Based Tests
8.1
One sample Wilcoxon Signed Rank Test
8.2
Two Sample Wilcoxon Rank Sum test
8.2.1
Example
8.2.2
Ordinal Data
8.2.3
Discussion
8.3
Simulations
8.4
Robustness, power, and sample size
8.5
Consistency
8.6
Summary
Vignette: ROC Curves and the Wilcoxon Rank Sum Statistic
Exercises
9
Tabular Data
9.1
Tables and plots
9.2
Inference on a proportion
9.2.1
Exact binomial test
9.2.2
One sample test of proportions
9.3
\(\chi^2\) tests
9.3.1
Given probabilities
9.4
\(\chi^2\) goodness of fit
9.4.1
Simulations
9.5
\(\chi^2\) test of independence
9.5.1
Two sample test for equality of proportions
9.6
Exact and Monte Carlo methods
Vignette: Tables
Exercises
10
Simple Linear Regression
10.1
Fitting a line
10.1.1
An introductory example
10.1.2
Least squares
10.2
Regression and predictions
10.2.1
Predictions
10.2.2
Confidence intervals
10.3
Correlation
10.4
Geometry of regression
10.5
Inference
10.5.1
The summary command
10.5.2
Inference on the slope
10.6
Residual analysis
10.6.1
Linearity
10.6.2
Heteroscedasticity
10.6.3
Outliers and leverage
10.7
Simulations
10.7.1
Residuals
10.7.2
Prediction intervals
Vignette: Simple Logistic Regression
Exercises
11
Analysis of Variance and Comparison of Multiple Groups
11.1
Setup
11.2
ANOVA
11.3
Examples
11.3.1
Humanization
11.3.2
Mice on THC
11.3.3
Chimpanzees (unequal variance)
11.4
Examining the hypotheses
11.5
Pairwise t-tests and FWER
Vignette: Reproducibility
Exercises
12
Multiple Regression
12.1
Two explanatory variables
12.2
Categorical Variables
12.3
Variable Selection
Vignette: Data in Other Formats
Exercises
References