Core idea: The world is uncertain.
The world is full of variation. Even when nothing “mystical” is happening, numbers don’t line up perfectly.
So how do we ever decide what is true?
Statistics is the tool we use to reason about that variation and draw conclusions from imperfect data.
It is not about proving things absolutely true. It is about saying:
“Given what I saw, how confident am I about what is happening?”
Example
Imagine you want to know if a coin is fair. You toss it 10 times, get 7 heads.
Heads = 1, Tails = 0
set.seed(2025)
# Toss a fair coin 10 times
sample1 <- rbinom(10, 1, 0.5)
mean(sample1) # proportion of heads
[1] 0.7
# Toss a fair coin 1000 times
sample2 <- rbinom(1000, 1, 0.5)
mean(sample2)
[1] 0.504
Does that mean the coin is unfair?
Not necessarily. Chance alone could easily produce 7 heads in 10 tosses of a fair coin.
But if you tossed it 1,000 times and got 700 heads, now the story is different.
Statistics gives us a way to make that judgment systematically.
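One way to make that judgment concrete (a sketch, not part of the original example) is to ask how often a truly fair coin would give a result at least this extreme. The binomial tail probability, via R's `pbinom`, answers exactly that:

```r
# If the coin is fair, how likely is at least 7 heads in 10 tosses?
pbinom(6, size = 10, prob = 0.5, lower.tail = FALSE)
# [1] 0.171875  -> about 17%: easily explained by chance

# And at least 700 heads in 1000 tosses?
pbinom(699, size = 1000, prob = 0.5, lower.tail = FALSE)
# vanishingly small: chance alone is no longer a plausible explanation
```

With 10 tosses, 7 heads happens about one time in six for a fair coin; with 1,000 tosses, 700 heads essentially never does. That contrast is the judgment statistics formalizes.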
Every experiment reflects both the quantity we care about and random variation around it. Mathematics solves for certainty. Statistics quantifies uncertainty so we can make decisions despite it.
Variation is everywhere. If we don’t use statistics, we risk mistaking random noise for a real effect.
That’s why statistics exists: not to give “exact truth,” but to help us make good decisions under uncertainty.
There is a bigger problem. We almost never have access to everything we care about. Instead, we only see a slice of the bigger world.
This brings us to the heart of statistical thinking: We want to learn about a population (the big world) using just a sample (the slice we observed).
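As a small sketch of that idea (the “population” here is simulated purely for illustration), we can compute the truth we would like to know from a whole population, then look at the small slice we would actually observe:

```r
set.seed(2025)
# A simulated population: one million coin flips (the "big world")
population <- rbinom(1e6, 1, 0.5)
mean(population)   # the population proportion: essentially 0.5

# The slice we actually observe: a random sample of 10 flips
observed <- sample(population, 10)
mean(observed)     # the sample proportion: free to stray well away from 0.5
```

In real problems we never get to compute `mean(population)`; statistical inference is about learning it from `mean(observed)` alone.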