Per/Cent/Age

Your statistics are 98.6% wrong

Bad Stats

Ki
8 min read · Jul 4, 2023


The chance of a person answering a statistical question correctly is only 12.3%. Everyone agrees 98.6% of statistics are made up on the spot. No really, you suck at statistics, so do I, and so does your statistics professor. Human minds simply don’t deal with statistics. But, how bad is it?

Marilyn vos Savant

Before we talk at all about statistics, we need to talk about the smartest woman in the world, Marilyn vos Savant.

Back in the early 90s, she ignited the world of statistics by focusing on a mind-bender called the Monty Hall Problem. Take a moment to read that article. Multiple professors of statistics viciously attacked her, listing all the ways she was wrong.

When I encountered this problem, I presented it to my team (biologists, programmers, mathematicians, and other scientists) and we could see the fun of the question, and how it was tricking most of us. What did we do? Wrote a single line of code to confirm our guesses. Yup, one line of code solved it instantly. The very second I confirmed the results I also was able to model why we were being tricked. In fact, back then I wrote up a clear simple overview so everyone could understand it instantly.
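I no longer have that original line, so here is a rough sketch of the same kind of check. It assumes the standard rules of the puzzle: three doors, one car, a host who knows where the car is, always opens a goat door you did not pick, and always offers the switch. The names `play` and `win_rate` are made up for this illustration.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    car = rng.randrange(3)    # door hiding the car
    pick = rng.randrange(3)   # contestant's first pick
    # The host opens a door that is neither the pick nor the car.
    opened = rng.choice([d for d in range(3) if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in range(3) if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the win rate for always switching or always staying."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("switch:", win_rate(True))   # lands near 2/3
print("stay:  ", win_rate(False))  # lands near 1/3
```

Switching wins whenever the first pick was wrong, and the first pick is wrong two times out of three. The simulation just lets you watch that happen instead of arguing about it.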

‘Remember everything is right until it’s wrong. You’ll know when it’s wrong.’ — Ernest Hemingway

He was wrong. But what interests me here is: Why did all these professors get it wrong? (And why they were so vicious towards her is also a question, #misogyny?) They could have confirmed it in one line of code.

Don’t drive on the 4th of July

Recently a friend who worries about things warned me not to drive on the 4th of July, because of how dangerous it is.

I sometimes worry about all my friends that worry about things. I mostly live on a small island, surrounded by sharks, and have to have the ‘shark’ conversation a lot. Remember, there is always a shark in the water when you get into the ocean, and also when you don’t.

But is it even really worth it to worry about sharks, or driving on the 4th?

How do we measure this?

The National Safety Council is dedicated to answering some of these questions. For the most part, we can agree cars are getting way safer. Check out the pretty graphs on their site. The numbers are massively low at this point. But still, how do we compare this?

The National Highway Traffic Safety Administration posts something intended to scare us:

‘On July 4th, America celebrates the birth of our nation. Around the holiday, sadly, we often see an increase in impaired driving crashes. From 2017 to 2021, there were 1,460 drivers killed in motor vehicle traffic crashes over the Fourth of July holiday period — 38% of the drivers killed were drunk.’

Ok, but how do we decode this? Were those 1,460 killed on just the 4th, added together across the five years from 2017 to 2021? Or on each 4th, in each year?

What does ‘often see’ mean? So sometimes you don’t? We have a lot of work to do here to decode all of this.
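To see how much the reading matters, here is a small back-of-the-envelope sketch using only the two numbers in the quote. It assumes one holiday period per calendar year and that both 2017 and 2021 are counted, which the quote does not actually confirm. The function names are hypothetical.

```python
DRIVERS_KILLED = 1460   # from the NHTSA quote, 2017 to 2021
DRUNK_SHARE = 0.38      # from the same quote

def holiday_periods(first_year, last_year):
    """Count holiday periods, assuming one per year, both ends included."""
    return last_year - first_year + 1

def killed_per_period(total, first_year, last_year):
    """Average drivers killed per holiday period under that assumption."""
    return total / holiday_periods(first_year, last_year)

periods = holiday_periods(2017, 2021)
print(periods)                                     # 5
print(killed_per_period(DRIVERS_KILLED, 2017, 2021))  # 292.0
print(round(DRIVERS_KILLED * DRUNK_SHARE))         # 555 drunk drivers over all periods
```

Even this tidy 292 is per "holiday period", not per 4th of July, and the quote never says how many days a holiday period covers. So we still can't put it next to a per-day number.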

Not only do people not understand statistics, 99.9999934% of reporters can’t present statistical information in a meaningful way. Why did I add a 34 at the end of that number? If I had written 99.9999999 you would know it was silly, but if I make the numbers look a little random, the number suddenly appears ‘possibly’ real. But no, they, like all these stats, are pure unfiltered malarkey.

We need to delve into more data here as well, such as that 38% number. What if all or almost all the drivers killed themselves? Drove into a tree, or off a cliff? Should the rest of us worry about driving on the roads then? Sure, they are dying, but are we safe? Safe enough?

Next up, we need to know how much worse (or better) this is than a baseline, and we have to decide what even that baseline means. Hold on… why did I say ‘better?’ Well, it is possible that even if deaths from car crashes are worse on the 4th, they might be better than other days of the week, or other holidays (and it is, or is it? Damn it, it’s not that clear).

We have to start somewhere though. This is algebraic in nature. Let’s start with some sort of metric. The numbers are all over the place. Yes, even a question as simple as ‘How many people die in automobile accidents each day on average’ is a tricky question to get answered.

All said, after reading dozens of reports, stats, and declarative sentences from organisations that are meant to study or know this stuff, the number of people that die each day in a car-related incident is about 100–120, or between 40K-50K a year. We would think we could be more accurate with these numbers, but no, we can’t. There are just too many ways to look at the data.
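A quick conversion shows how loosely those two ranges fit together. This is only arithmetic on the figures above, assuming a plain 365-day year. The helper names are invented for the example.

```python
DAYS_PER_YEAR = 365  # assumption: ignore leap years

def daily_to_yearly(deaths_per_day):
    """Scale a per-day figure up to a year."""
    return deaths_per_day * DAYS_PER_YEAR

def yearly_to_daily(deaths_per_year):
    """Scale a per-year figure down to a day."""
    return deaths_per_year / DAYS_PER_YEAR

print(daily_to_yearly(100), daily_to_yearly(120))                  # 36500 43800
print(round(yearly_to_daily(40_000)), round(yearly_to_daily(50_000)))  # 110 137
```

So 100 to 120 a day is roughly 36.5K to 43.8K a year, while 40K to 50K a year is roughly 110 to 137 a day. The ranges overlap, but they are not the same range, which is about what you would expect when the sources can't agree.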

If you search around you will find some saying the deadliest day is the 4th of July, and others saying it is New Year's Eve. Some say it is the day before, and some say it is other days.

Let’s review a statistical statement from the California Traffic Safety Quick Stats site:

‘In 2021, 52.9% of all drivers killed in motor vehicle crashes, who were tested, tested positive for legal and/ or illegal drugs, a decrease of approximately 5.5% from 2020.’

Most people will remember this quote as saying about half of the people killed were on some drug. But this is not accurate at all. Or said another way, we don’t know bloody shite here really.

The key phrase is ‘who were tested.’ Let’s confirm this:

This means someone would have had to ask a mortician's office to specifically check for drugs. This is not standard procedure. Why waste money on the dead?
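Here is a rough sketch of why ‘who were tested’ matters so much. The 52.9% comes from the quote. The fraction of killed drivers who were actually tested is not given anywhere in it, so the values below are made-up placeholders, and `share_bounds` is a hypothetical helper. The idea: the untested drivers could all be clean, or all be positive, and we can't tell which.

```python
def share_bounds(positive_among_tested, tested_fraction):
    """Lowest and highest possible positive share among ALL killed drivers."""
    known_positive = positive_among_tested * tested_fraction
    untested = 1.0 - tested_fraction
    # Low end: every untested driver was clean.
    # High end: every untested driver would have tested positive.
    return known_positive, known_positive + untested

# tested_fraction values are hypothetical; the quote does not state one.
for tested_fraction in (0.25, 0.50, 0.75, 1.00):
    low, high = share_bounds(0.529, tested_fraction)
    print(f"tested {tested_fraction:.0%}: between {low:.1%} and {high:.1%}")
```

Only when every single killed driver is tested do the two ends meet at 52.9%. With half of them tested, the honest answer is somewhere between about a quarter and about three quarters, which is not much of an answer.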

Assuming a test is done, they also mention ‘legal drugs.’ What does this mean? Does this include all drugs? Are there magical drugs that are deemed ‘probably will cause you to have a car accident’? Do we include mood modifiers? Birth control? Heart medication? These are all drugs. Do we test more for drugs on holidays, because we are also testing for alcohol? Are they counting drugs in others, people that didn’t have accidents? Or in people just walking on the sidewalk, riding a bike, etc.?

So, let’s talk about just alcohol. How long would you guess alcohol can be tested for in a dead body? Most would assume a long time, it’s alcohol after all. But the body continues to ‘digest’ even when dead. The answer is only 48 hours. Often bodies are not tested due to a queue, and toxicity tests can take 6 weeks to get back.

Next up, is the dead person the driver, or the passenger? And, were they in a car causing the accident, or the one being hit? Were they in a car at all? Or was a car simply involved?

Perhaps it’s the good drivers that are getting killed, and drunk/high drivers are ok. This might seem silly, but there is some data to support this. Does this suggest we should all be a little intoxicated if we drive to protect ourselves? (this is not my advice, but it is a line of reasoning we must review, and when we do, the results are not what one might expect). Knowing this reality, I will force my body to instantly relax if I believe I’m about to be impacted. I once did this just before being hit at full speed by a racehorse. Those viewing me noted I became like a rag doll just BEFORE the impact.

It was a horrible impact, but I did not break a single bone. The rider and the horse each broke bones. Leaderboard: Me 1, horse 0, rider 0.

So let’s come back, do we know anything from that original ‘52.9%’ statistic? Again, not even slightly.

This site claims the 4th of July is the worst:

‘During the last 10 years, it showed that Independence Day have the highest accident statistics’

Are they saying the 4th is the worst, or that the 4th is the one most other sites talk about? At least we might be able to support the latter.

Where does this number 98.6% come from?

I’ve used this exact number to represent ‘numeric failure’ since I was 9.86 years old. It starts with Dr. Dan, or Danny as I like to call him (Doctor Daniel Gabriel Fahrenheit). He was trying to lock in a scale for temperature, very roughly (his method was odd, and his own secret), and he liked thirds. Most people looking at temperature will note that two very big events are when things freeze and when things boil. And since humans are 100% absorbed with themselves, the third big event is their own body temperature. So he roughly assumed humans were around 90 or 100 (depending on how you want to think of his scale), that 1/3 of this would be frozen, and that 30x6 was boiling. None of this is correct of course, but it was close… enough at the time. Enough to confirm when a human might die. So, give him points here. That was his goal after all. Not making the best measurement system, but making one that was useful.

Once we were able to lock temperature to something more important and stable, we could reset his scale against what we observe of humans (who have a huge normal and healthy range really). This number placed on this 100-based scale is 98.6°F, or simply 37°C. Even if he wanted it to be 90, or 100.
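The conversion itself is a one-liner, and it shows where the suspicious decimal comes from. The formula is the standard Celsius to Fahrenheit one. Reading ‘37’ as ‘anywhere from 36.5 to 37.5’ is my own assumption about a rounded number, not a medical claim.

```python
def c_to_f(celsius):
    """Standard Celsius to Fahrenheit conversion."""
    return celsius * 9 / 5 + 32

print(round(c_to_f(37), 1))                            # 98.6
# If 37 is a rounded figure, the same round trip covers a wide band:
print(round(c_to_f(36.5), 1), round(c_to_f(37.5), 1))  # 97.7 99.5
```

A round ‘37’ goes in, and a precise-looking ‘98.6’ comes out. The extra digit comes from the unit conversion, not from a better measurement.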

So should I worry?

………………… No!

Transport accidents are serious, and should be considered. Just not worried about.

Them: Tomorrow is the 4th, so driving will be especially dangerous because of all the drunks on the road. Better to avoid that.

Me: I assume I’m always driving with drunks.

I’ve already done what I need to do to ensure my safety… enough:

  • I eat healthily every single day
  • I work out every single day
  • I don’t take drugs, smoke, or drink coffee or alcohol, ever
  • I don’t stress about things I can’t control or that are statistically unlikely
  • I drive larger than average vehicles (vans/trucks) because mass/density wins.


Ki

‘Being offended makes people feel important... I want people to feel important.’ - I'm not looking for followers, these articles are for my personal peers.