2  Statistical Thinking

  1. Why does statistics exist?

Core idea: The world is uncertain.

The world is full of variation. Even when nothing “mystical” is happening, numbers don’t line up perfectly.

So: what is true?

Statistics is the tool we use to:

It is not about proving things absolutely true. It is about saying:

“Given what I saw, how confident am I about what is happening?”

Example

Imagine you want to know whether a coin is fair. You toss it 10 times and get 7 heads.

Heads = 1, Tails = 0

set.seed(2025)  # fix the random seed so the simulated tosses are reproducible

# Toss a fair coin 10 times
sample1 <- rbinom(10, 1, 0.5)
mean(sample1)  # proportion of heads
[1] 0.7

# Toss a fair coin 1000 times
sample2 <- rbinom(1000, 1, 0.5)
mean(sample2)  # proportion of heads
[1] 0.504

Does that mean the coin is unfair?

Not necessarily. Chance alone could easily give you 7 heads in 10 tosses of a fair coin: a fair coin produces 7 or more heads in about 17% of such experiments.

But if you tossed it 1,000 times and got 700 heads, the story would be very different: a fair coin almost never produces a result that lopsided over so many tosses.

Statistics gives us a way to make that judgment systematically.
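One way to make that judgment concrete is to ask how likely each result would be if the coin really were fair. The sketch below is only an illustration, not a full hypothesis test: it uses base R's pbinom() to compute the probability of seeing at least that many heads under a fair coin. The variable names p_small and p_large are chosen here for illustration.

```r
# Probability of at least 7 heads in 10 tosses of a fair coin.
# pbinom(q, ..., lower.tail = FALSE) returns P(X > q), so q = 6 gives P(X >= 7).
p_small <- pbinom(6, size = 10, prob = 0.5, lower.tail = FALSE)
p_small

# Probability of at least 700 heads in 1000 tosses of a fair coin.
p_large <- pbinom(699, size = 1000, prob = 0.5, lower.tail = FALSE)
p_large
```

The first probability is about 0.17, so 7 heads in 10 tosses is unremarkable. The second is vanishingly small, which is why 700 heads in 1,000 tosses would be strong evidence against a fair coin.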

Every experiment reflects:

Mathematics solves for certainty. Statistics quantifies uncertainty so we can make decisions despite it.
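To illustrate what "quantifying uncertainty" can look like, the sketch below compares 95% confidence intervals for the proportion of heads in the two coin scenarios from the example. It assumes the exact binomial interval returned by base R's binom.test(); other interval methods exist and would give slightly different endpoints. The names ci_small and ci_large are illustrative.

```r
# 95% confidence interval for the proportion of heads: 7 heads in 10 tosses
ci_small <- binom.test(7, 10)$conf.int
ci_small

# 95% confidence interval for the proportion of heads: 700 heads in 1000 tosses
ci_large <- binom.test(700, 1000)$conf.int
ci_large

# Width of each interval: more data gives a narrower interval
diff(ci_small)
diff(ci_large)
```

With 10 tosses the interval is wide and includes 0.5, so a fair coin remains plausible. With 1,000 tosses the interval is narrow and lies entirely above 0.5.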

Variation is everywhere. If we don’t use statistics, we risk:

That’s why statistics exists: not to give “exact truth,” but to help us make good decisions under uncertainty.

There is a bigger problem. We almost never have access to everything we care about. Instead, we only see a slice of the bigger world.

This brings us to the heart of statistical thinking: We want to learn about a population (the big world) using just a sample (the slice we observed).
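The population and sample idea can be sketched with a small simulation. Everything in it is hypothetical: the population is an invented set of 100,000 normally distributed values (mean 170, standard deviation 10), and the sample size of 50 is arbitrary. In practice we would not be able to see the whole population, which is the point of the comparison.

```r
set.seed(1)  # reproducible simulation

# Hypothetical population: 100,000 values we normally could not observe in full
population <- rnorm(100000, mean = 170, sd = 10)
pop_mean <- mean(population)

# The slice we actually observe: a random sample of 50 values
sample_small <- sample(population, size = 50)
sample_mean <- mean(sample_small)

# Compare the population mean with the estimate from the sample
c(population = pop_mean, sample = sample_mean)
```

The sample mean will usually be close to the population mean but not identical. That gap is the uncertainty statistics is designed to measure.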