Class 7 · CBSE AI · Strand B — Python for AI

Your first machine learning model with scikit-learn — Class 7

Create → fit → predict: how scikit-learn trains your first ML model in a few lines. For Class 7.

What this concept actually says

  • scikit-learn provides a consistent interface: create model → fit on training data → predict on new data
  • The train/test split is the foundational discipline that prevents you from fooling yourself about how good your model is
  • A model is a function: it takes features (input columns) and predicts a target (output column)
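
To make the three steps concrete, here is a minimal sketch. The student data is invented purely for illustration (hours studied per week and attendance percentage as features, pass/fail as the target), and the choice of a Decision Tree is an assumption; any scikit-learn classifier would follow the same create, fit, predict pattern.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: each row is one student.
# Features (input columns): [hours studied per week, attendance %]
X = [[1, 60], [2, 65], [3, 70], [6, 85], [7, 90], [8, 95]]
# Target (output column): 0 = fail, 1 = pass
y = [0, 0, 0, 1, 1, 1]

model = DecisionTreeClassifier(random_state=0)  # create the model
model.fit(X, y)                                 # fit: learn a rule from the examples

# Two new students the model has never seen
new_students = [[2, 62], [7, 92]]
predictions = model.predict(new_students)       # predict their outcomes
print(predictions)  # [0 1]: first student predicted to fail, second to pass
```

With only six rows this is far too small to be a real model, but it shows the shape of the idea: features go in, a predicted target comes out.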

An analogy your child will recognise

UPSC exam preparation

A student who practises only last year's exam papers and is then tested on those same papers will probably score close to 100%. That proves nothing about whether they can handle a new paper. The train/test split is like keeping one practice paper sealed until test day, so the score on it shows what the student can really do.

Teaching a child to recognise mangoes

You show a child 80 mangoes of different sizes and colours and they learn what a mango looks like. Then you test them on 20 new mangoes they have never seen. That test result is the real measure of whether they learned the concept or just memorised your specific examples.

Common misconceptions to watch for

  • More complex models are always better — in reality, a simple model that generalises well beats a complex model that only memorises training data
  • Once you have trained a model, the work is done — in reality, evaluation, iteration, and monitoring are where most of the real ML work lives

Key facts in one breath

  • scikit-learn's consistent API — fit(), predict(), score() — works across nearly all its algorithms, making it easy to swap models
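  • train_test_split() should be called before any step that learns from the data (such as scaling or feature selection), so that information from the test set cannot leak into training, a problem known as 'data leakage'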
  • A Decision Tree is one of the most interpretable models — you can visualise exactly what decision it makes at each node
  • The default test size in train_test_split is 25%; 80/20 and 70/30 splits are also common conventions
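
The facts about splitting and the shared API can be checked in a few lines. This is a sketch under assumptions: it uses the iris dataset bundled with scikit-learn (150 flowers), and the second model, KNeighborsClassifier, is chosen only to show that swapping models leaves the rest of the code unchanged. The random_state values are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # 150 flowers, 4 features each

# No test_size given: scikit-learn holds back 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
default_sizes = (len(X_train), len(X_test))    # (112, 38)

# An explicit 80/20 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
explicit_sizes = (len(X_train), len(X_test))   # (120, 30)
print(default_sizes, explicit_sizes)

# The same fit() and score() calls work for two different algorithms.
scores = {}
for model in (DecisionTreeClassifier(random_state=0), KNeighborsClassifier()):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)
print(scores)
```

The accuracy numbers will depend on which rows land in the test set, so they are worth comparing across a few different random_state values rather than trusting a single run.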

How Dhee Learning teaches this — the 3-stage question loop

Every Dhee Learning session for this concept follows three stages. We share the questions Dhee actually asks, so you can hear what a session sounds like.

Stage 1 — Surface

You have trained a model to predict whether a student will pass. You test it on the same data you trained it on and it gets 98% accuracy. Your friend says 'amazing!' Why should you be suspicious?

Rote answer

"Because it might be overfitting"

Understood

"The model has seen all those examples before — it may have just memorised the answers rather than learned a pattern. The only fair test is data it has never seen, which is why we hold back a test set before training begins."
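
A child can see the reason for suspicion by scoring one model twice: once on the data it trained on and once on data it never saw. The sketch below assumes the bundled iris dataset and a Decision Tree, rather than the student pass/fail data in the question, because iris ships with scikit-learn.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold back 20% of the flowers before any training happens.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

train_accuracy = model.score(X_train, y_train)  # examples the model has already seen
test_accuracy = model.score(X_test, y_test)     # examples held back until now
print(f"train: {train_accuracy:.2f}  test: {test_accuracy:.2f}")
```

An unrestricted Decision Tree can usually fit its own training rows almost perfectly, so the training score flatters the model. The test score is the one that answers the friend's "amazing!".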

Stage 2 — Reasoning

In scikit-learn, you call model.fit(X_train, y_train) and then model.predict(X_test). What exactly is happening at each step — what is the model 'doing' mathematically in simple terms?

Follow-up Dhee may use: If you add more features to X_train, will the model always get better? Why or why not?
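
One way to look inside fit() is to print the rules a small Decision Tree has learned. This is a sketch assuming the iris dataset, with the tree limited to depth 2 (an arbitrary choice) so the output stays short enough to read aloud.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Keep the tree shallow so its rules fit on a few lines.
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(iris.data, iris.target)  # fit: choose the questions that best separate the classes

# export_text turns the fitted tree into readable if/else rules.
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```

Each line of the output is a question about one feature, such as whether a measurement is below some threshold. A tree this shallow has room for only a few questions, so some features may never appear in the rules, which is a useful starting point for the follow-up about adding more features.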

Stage 3 — Application

Using a simple dataset (e.g. iris or a student grades CSV), write a complete scikit-learn pipeline: load data, split it 80/20, train a Decision Tree, and print predictions for the test set. What is one thing that surprised you about the output?

Misconception Dhee watches for: Treating the model as a black box that magically produces answers, without any mental model of what 'fitting' means

Want your child to actually understand this?

Dhee turns this concept into a 15-minute spoken session — asking, listening, and probing — so your child builds the idea themselves.

Frequently asked questions

What is a first scikit-learn model, explained for kids?

Create → fit → predict: how scikit-learn trains your first ML model in a few lines. For Class 7.

What's the most common mistake children make about this concept?

Believing that more complex models are always better. In reality, a simple model that generalises well beats a complex model that only memorises training data.

How does Dhee Learning teach this in a Class 7 session?

Dhee opens with a question — for example: "You have trained a model to predict whether a student will pass. You test it on the same data you trained it on and it gets 98% accuracy. Your friend says 'amazing!' Why should you be suspicious?" — listens to your child's answer, then probes the reasoning behind it. The session ends when the child can apply the idea to a brand-new situation, not just recall it.