Reinventing the Wheel
A series of articles deconstructing Data Science methods and building them from first principles using Python.
“Not because it is easy, but because we don’t know how to do it”
A generation of people grew up convinced that the Stone Age looked something like The Flintstones: cave dwellers commuting in foot-powered cars, dinosaurs serving as household appliances, and wheels everywhere. Crude, heavy, but everywhere. It is a surprisingly durable image. It is also almost entirely wrong. For a start, despite the name, most prehistoric humans did not actually live in caves; caves were occasional shelter, not real estate. Human beings and dinosaurs missed each other by about 65 million years, which is the kind of scheduling conflict that tends to be permanent. And the wheel, that symbol of prehistoric obviousness, turns out to be a surprisingly recent and rare invention, one that several advanced civilisations never developed at all.
Which means that when our parents, teachers, and senior colleagues told us “not to reinvent the wheel”, they were, without knowing it, telling us not to reinvent something that took humanity an embarrassingly long time to invent in the first place.
Here is a challenge. Try to describe, in precise terms, how you would build a functional wheel from raw materials. Not order one, not assemble a flat-pack version, but design and build one that works. The axle must rotate freely. The wheel must bear load. The materials must be available and durable. If that turns out to be harder than expected, you are in good company. The wheel emerged independently in very few places in human history, required a specific combination of materials, terrain and social organisation, and took millennia to spread.
A physicist would call this a boundary condition problem: the wheel is not intrinsically useful; it is useful given the right constraints. Far from being the emblem of simplicity the idiom implies, it is one of the most consequential and quietly complex innovations ever made.
The same dynamic applies to the tools we use every day in data science. We reach for libraries, models and methods that work (and they do work) without necessarily understanding what they are doing, where they can fail, or what their outputs actually mean. That gap rarely matters, until it does. And it matters more now than it ever has. The technologies reshaping industries, such as large language models, recommendation systems, and automated decision tools, are built on the same statistical foundations covered in this project. Understanding those foundations does not make you an AI researcher, but it gives you something increasingly rare: the ability to ask the right questions about the outputs these systems produce, the confidence intervals nobody reports, and the assumptions buried in the methodology.
This series is a record of building those tools from scratch: probability, statistics, hypothesis testing, measurement and uncertainty, implemented in Python and tested against real and simulated data. The goal is not to replace existing tools but to understand them well enough to use them deliberately and to recognise when something has gone wrong. The topics are not abstract: they appear in the odds quoted by a bookmaker, the football statistics cited by a pundit, and the headlines generated from a survey.
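To give a taste of what "from scratch" means in practice, here is a minimal sketch in the spirit of the series: estimating a probability by simulation and attaching a confidence interval computed from first principles, using nothing but the standard library. The function name, the example event (rolling at least one six in four throws), and the choice of a normal approximation are illustrative assumptions, not a preview of any specific article.

```python
import math
import random

def estimate_probability(trial, n_trials=10_000, seed=42):
    """Estimate P(event) by simulation, with a 95% confidence interval
    built from first principles (normal approximation, illustrative only)."""
    rng = random.Random(seed)
    successes = sum(trial(rng) for _ in range(n_trials))
    p_hat = successes / n_trials
    # Standard error of a sample proportion: sqrt(p(1 - p) / n)
    se = math.sqrt(p_hat * (1 - p_hat) / n_trials)
    # 1.96 is the 97.5th percentile of the standard normal distribution
    return p_hat, (p_hat - 1.96 * se, p_hat + 1.96 * se)

# Example event: at least one six in four dice throws.
# The analytic answer is 1 - (5/6)**4, roughly 0.5177, so the
# simulated estimate can be checked against a known truth.
p, (lo, hi) = estimate_probability(
    lambda rng: any(rng.randint(1, 6) == 6 for _ in range(4))
)
```

The point is not the ten lines of code but the habit they encourage: every number comes with an uncertainty, and every simulated answer is checked against something known.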
As for the wheel: Fred Flintstone, to his credit, never claimed to understand it. He just put his feet down and pushed. The real thing took millennia, specific terrain, the right animals, and a combination of conditions most civilisations simply did not have. It only became useful once the conditions were right and only worth building once you understood why.
Statistical tools and scientific methods are no different. And neither, it turns out, is the instinct to take something apart just to understand what makes it go.