Class 6 · CBSE AI · Strand A — Inside the Black Box

Overfitting — when AI memorises instead of learns

Why an AI that aces practice questions can still fail in the real world.

What this concept actually says

  • Overfitting is when a model learns the training data too closely, memorising its noise and quirks instead of the underlying pattern
  • An overfit model scores high on training data but poorly on new, unseen data
  • Overfitting is like memorising answers without understanding the subject

An analogy your child will recognise

Rote learning for exams

If you memorise 'The French Revolution began in 1789 because of economic crisis and social inequality' perfectly but don't understand what 'economic crisis' means, you'll answer that question flawlessly and fail any question that asks you to explain or apply the idea. Overfitting is AI doing exactly this — perfect recall, zero generalisation.

Auto-rickshaw route memorisation

An auto driver who has only driven one specific route every day memorises every pothole and shortcut. Put him on a new route and he's lost. An overfit model is this driver — expert on the training route, helpless on any new road.

Common misconceptions to watch for

  • 100% training accuracy means a great model — it is actually a red flag for overfitting in most cases.
  • Overfitting is rare in practice — it is one of the most common problems practitioners face, especially with small or narrow datasets.

Key facts in one breath

  • The gap between training accuracy and test accuracy is the clearest signal of overfitting — the larger the gap, the worse the overfit.
  • Overfitting is more likely with small datasets, very complex models, or very long training.
  • Techniques to fight overfitting include: using more data, adding dropout (randomly switching off neurons during training), and early stopping.
  • Regularisation is a mathematical technique that penalises overly complex models during training to discourage memorisation.
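For readers comfortable with a little code, the train-versus-test gap from the facts above can be seen in a tiny experiment. This is our own minimal sketch in Python with NumPy (not part of the Dhee curriculum): a very flexible model, a degree-9 polynomial, fits 10 noisy training points almost perfectly, while a simple straight-line model does not — but on fresh data drawn from the same rule, the flexible model's advantage disappears.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_line(n):
    # True rule is y = 2x; the noise plays the role of "quirks" in the data.
    x = rng.uniform(-1, 1, n)
    y = 2 * x + rng.normal(0, 0.3, n)
    return x, y

x_train, y_train = noisy_line(10)    # small training set: overfitting-prone
x_test, y_test = noisy_line(100)     # unseen data from the same rule

def errors(degree):
    # Fit a polynomial of the given flexibility to the training points,
    # then measure mean squared error on both training and test data.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

simple_train, simple_test = errors(1)    # simple model: a straight line
complex_train, complex_test = errors(9)  # flexible model: degree-9 curve

# The flexible model "aces" training (near-zero error) yet its gap to
# test error is far larger -- the signature of overfitting.
print(f"degree 1: train {simple_train:.3f}, test {simple_test:.3f}")
print(f"degree 9: train {complex_train:.3f}, test {complex_test:.3f}")
```

The degree-9 polynomial has enough capacity to thread through every noisy training point, which is exactly the "memorising answers" behaviour the analogies describe; the straight line learns less from training but generalises better.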

How Dhee teaches this — the 3-stage Socratic loop

Every Dhee session for this concept follows three stages. We share the questions Dhee actually asks, so you can hear what a session sounds like.

Stage 1 — Surface

A student memorises every answer from last year's question paper word for word. In the exam, slightly different questions are asked. What happens — and why?

Rote answer

"Overfitting is when the model learns too much from training data."

Understood

"The student stored the exact words, not the idea behind them. When the question changes even a little, there's no match in memory and they're stuck. An overfit AI does exactly this — it stored the specific training examples so precisely that any variation throws it completely."

Stage 2 — Reasoning

An AI trained to detect spam email gets 100% accuracy on training emails but only 60% on new emails. What do you think went wrong during training?

Follow-up Dhee may use: If memorising causes overfitting, what would the opposite problem be — learning too little? What might that look like?

Stage 3 — Application

You're training an AI to classify if a school essay is well-written. After training, it gives top marks to any essay that uses the word 'furthermore' — because most good essays in its training data happened to use it. Is this overfitting? Explain your reasoning.

Misconception Dhee watches for: Thinking overfitting only happens when accuracy is low — an overfit model has very high training accuracy; the problem shows up on new data.

Want your child to actually understand this?

Spark turns this concept into a 15-minute spoken session — asking, listening, and probing — so your child builds the idea themselves.

Frequently asked questions

What is overfitting — when AI memorises instead of learns — explained for kids? +

Overfitting is when an AI memorises its training data — noise, quirks and all — instead of learning the idea behind it. Like a student who memorises last year's answers word for word, it aces practice questions but gets stuck the moment a question changes even slightly.

What's the most common mistake children make about this concept? +

Believing that 100% training accuracy means a great model. In reality, a perfect training score is usually a red flag for overfitting — the real test is how the model performs on new, unseen data.

How does Dhee teach this in a Class 6 session? +

Dhee opens with a question — for example: "A student memorises every answer from last year's question paper word for word. In the exam, slightly different questions are asked. What happens — and why?" — listens to your child's answer, then probes the reasoning behind it. The session ends when the child can apply the idea to a brand-new situation, not just recall it.