Math/Stat 3850: Foundations of Statistical Analysis

General Information
Help for this class is available (besides Dr. Clair's office hours):
 Math Help Center, Ritter 111: Hours.
 Linux Lab, Ritter 117: Hours. Help with R questions

News
This course is over.

Online Content
 March 20: syllabus changes

Zoom recording (syllabus changes)
 March 23: xbar distribution

Zoom (class starts at minute 12; 37 minutes of material in total),
Slides, R Code
 March 25 (live): central limit theorem

Zoom (35 min + Q and A),
Slides,
R code
 March 25 (part 2): t distribution

Zoom (13 minutes),
Slides,
R Code
 March 27 (live): Confidence Intervals

Zoom,
Slides,
R Code.
 March 27 (part 2): Replicating CIs

Zoom (11 minutes),
R Code
 March 30 (live): Hypothesis Testing

Zoom,
Slides,
Code.
Additional Zoom help on Ch 4 questions 8 and 9 (8 minutes). NOTE! I ran the command sd <- 2 when I meant sigma <- 2. Don't do what I did, because you will mess up the sd function!
 March 30 (part 2): Arbuthnot's Hypothesis Test

Zoom (11 minutes),
R Code
 April 1 (live): One-sample t-test

Zoom (50 minutes),
Slides,
R code
normtempquestions.Rmd (html version): Body temperature questions, and answers.
 April 3 (live): Paired t-test

Zoom (Class from minute 8 to minute 53),
Slides,
R code
 April 3 (part 2): Paired t-test example

Video (8 minutes),
R code
 April 6 : Exam 2 review

Zoom,
Slides,
R code
 April 15 (live): Two-sample t-test

Zoom,
Slides,
R code
 April 15 (recorded): Exam 2 discussion

Zoom (14 minutes)
 April 17 (live): Two-sample testing

Zoom,
Slides,
R code
 April 17 (part 2): Wilcoxon rank-sum test

Zoom (11 minutes),
R code
 April 20 : Type I and Type II errors

Zoom,
Slides,
R code
 April 20 (part 2): A two-sample test example

Zoom (10 minutes),
R code
 April 22 : Power

Zoom,
Slides,
R code
 April 22 (part 2): Power battle: t-test vs. Wilcoxon

Zoom (14 minutes),
R code
 April 24: Regression

Zoom,
Slides,
R code
 April 24 (part 2): Regression example: Miss America

Zoom (9 minutes),
R code
 April 27: Correlation and linear regression

Zoom,
Slides,
R code
 April 27 (part 2): Predictions with regression

Zoom (9 minutes),
R code
 April 29: Testing for correlation

Zoom,
Slides,
R code
 April 29 (part 2): Hypothesis testing example

Video (14 minutes),
R code
 May 1: Analysis of Variance

Zoom,
Slides,
R code
 May 4: Exam Review

Zoom,
Slides,
R code
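
A short illustration of the March 30 note about sd, assuming the slip was an assignment to the name sd (that is, sd <- 2). The vector x is made up for the example. In current versions of R a call such as sd(x) should still find the function, because a call looks for a function and skips the number, but the bare name sd now refers to the number, which is confusing and can trip up other code. Removing the stray variable with rm() puts things back.

```r
# Made-up data, purely for illustration
x <- c(1.8, 2.1, 2.4, 1.9)

sd <- 2                      # the slip: a number named sd now sits in the workspace
masked <- is.function(sd)    # FALSE: the bare name sd now refers to the number 2
s_masked <- sd(x)            # the call still finds the sd function from the stats package

rm(sd)                       # remove the stray variable
restored <- is.function(sd)  # TRUE: the name sd refers to the function again

sigma <- 2                   # the assignment that was intended
```

If you are not sure whether you have overwritten a name, ls() lists the variables in your workspace, and anything there called sd can be removed with rm(sd).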

Resources
Software and References
References
Class Examples
Exams and Quizzes

Data
 hotdogs.csv: Results of a laboratory analysis of calories and sodium content of major hot dog brands. (Source: SOCR via Consumer Reports, June 1986, pp. 366-367.)
 Missouri Lottery Pick 3
 Pregnancy length, Kieler et al., 1995
 Mercury levels in commercial fish (Source: FDA)
 worldcupscores2014.txt: Goals scored in all World Cup matches, Brazil 2014.
 usreactors.csv (Source: USNRC)

movieLensData (Source: GroupLens Research)

 time_series_19-covid-Confirmed.csv: Coronavirus confirmed cases. (Source: Johns Hopkins CSSE data repository)
 Kieran Healy's OECD death by violent assault data.
 cereal.csv (Source: Kaggle, 80 Cereals)
 michelson_light_speed.csv: Michelson's 1879 and 1882 measurements of the speed of light, given as amounts over 299,000 km/s. (Source: The Data and Story Library)
 normtemp.csv: Body temperatures and heart rate. (Source: What's Normal?, also available as normtemp in the UsingR package)
 incognitube.csv: Length (in seconds) of a random sample of YouTube videos with fewer than 100 views.
 missamerica.csv: Miss America winners, 1921-2002. (Source: PBS.org)
 gosset_corn.csv: Gosset's 1908 data on the results of seeding pairs of plots with regular and kiln-dried seeds. (Source: The Data and Story Library)
 candytuft.csv: From Darwin, C. R. 1876. The effects of cross and self fertilisation in the vegetable kingdom. The data are from Table XXXII and record Iberis umbellata heights. Flowers were planted in matched pairs in the same pot. A height of 0 signifies that the plant died.
 balance.csv: Aging and balance.
(Source: Teasdale, Bard, LaRue, and Fleury. 1993. On the cognitive penetrability of posture control. Experimental Aging Research).
 flint.csv: Data on lead in tapwater from Flint, Michigan in 2015. (Source: Flint Water Study)
 pulitzer.csv: Pulitzer prizes and newspaper circulation. (Source: fivethirtyeight.com, April 15, 2014.)
 pearsonfatherson.csv (Source: Karl Pearson, c. 1900).
 draft1970.rda:
Draft order for birthdates, 1970 Vietnam draft.
(Source: The Data and Story Library, see also this page)
 urination.csv: Duration of urination for mammals. (Source: Law of Urination: all mammals empty their bladders over the same duration (Table 1). Yang, Pham, Choo, Hu 2014)
 bechdel.csv: Films from 1970-2013 and whether they pass the Bechdel test. (Source: fivethirtyeight)
 scrabble.csv

Homework Assignments

Due Friday, January 17
Read Chapter 1.
Do Chapter 1 Exercises #2, 4, 7, 11, 12.
Gain 3500XP in DataCamp "Introduction to R". The important chapters are 1, 2, 4, and 5.

Due Wednesday, January 29
Read Chapter 2.
Do Chapter 2 Exercises #1, 3, 4, 5, 6, 8, 9, 10, 11, 14, 15, 16, 19, 21, 22.

Due Friday, February 7
Read Chapter 3.1-3.3.
Do Chapter 3 Exercises #1-11, 32, 33.

Due Monday, 2/17
Read Chapter 3
Do Chapter 3 Exercises #12, 14-17, 23-27, 29, 30.
Gain 3000XP in DataCamp "Foundations of Probability in R". The important sections are 1, 2, 4.

Due Friday, Feb 28
Read Chapter 5.
Do Chapter 5 Exercises #2, 5-14, 16, 17, 18, 23, 24, 25.
Gain 2500XP in DataCamp "Data Manipulation in R with dplyr".

Due Friday, March 6
Read Chapter 6.
Do Chapter 6 Exercises #1, 2, 3, 5a, 7, 8, 9, 10, 11, 12.
Gain 3000XP in DataCamp "Data Visualization with ggplot2 (Part 1)", Ch 1-4.

Due Monday, March 30
Read 4.3-4.6.
Do Chapter 4 Exercises #8, 9.
Read 7-7.1.
Do Chapter 7 Exercises #1, 2, 3, 4.

Due Monday, April 6
Read Chapter 7-7.5.
Do Chapter 7 Exercises #5-10, 14, 17*
* Here is a rewritten version of problem 17 that gives some steps.
Gain 700XP in Inference for Numerical Data in R in DataCamp. Only work on section 2 of this course.

Due Friday, April 24
Read Chapter 7.4-7.6, Chapter 8
Do Chapter 7 Exercises #11, 12, 13
Do Chapter 8 Exercises #1, 2, 4, 6, 8, 11, 12
Gain 700XP in Inference for Numerical Data in R in DataCamp. Only work on section 3 of this course.

Due Friday, May 1
Read Chapter 10
Do Chapter 10 Exercises #1-10.
Gain 3000XP in Correlation and Regression in DataCamp. This chapter is long, so feel free to skip to the good parts.
