Class 7 · CBSE AI · Strand C — NLP, Vision, and LLMs Deep-Dive

What is text classification? Building a topic classifier — Class 7

How AI sorts text into categories — one of the oldest and most useful NLP tasks. For Class 7.

What this concept actually says

  • Topic classification assigns a text to one or more predefined categories based on its content
  • The choice of categories is a design decision with real consequences — it shapes what the system can and cannot represent
  • Multi-label classification allows a text to belong to more than one topic simultaneously
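
The difference between forcing one label and allowing several can be sketched in a few lines of Python. This is a minimal illustration, not how production classifiers work: the keyword lists, the function names, and the sample sentence are all invented for this example.

```python
# Hypothetical keyword lists, invented for this illustration only.
TOPIC_KEYWORDS = {
    "Sports": {"cricket", "match", "player", "team"},
    "Politics": {"election", "minister", "politician", "party"},
    "Technology": {"app", "software", "phone", "website"},
    "Entertainment": {"film", "song", "actor", "concert"},
}


def topic_scores(text):
    """Count how many keywords of each topic appear in the text."""
    words = set(text.lower().split())
    return {topic: len(words & keywords)
            for topic, keywords in TOPIC_KEYWORDS.items()}


def single_label(text):
    """Force exactly one topic: the one with the most keyword hits."""
    scores = topic_scores(text)
    return max(scores, key=scores.get)


def multi_label(text, min_hits=1):
    """Keep every topic that has at least min_hits keyword hits."""
    scores = topic_scores(text)
    return [topic for topic, hits in scores.items() if hits >= min_hits]


article = "cricket player turned politician launches a sports app"
print(single_label(article))  # Sports
print(multi_label(article))   # ['Sports', 'Politics', 'Technology']
```

The single-label version has to drop 'Politics' and 'Technology', while the multi-label version keeps all three. Changing the keyword lists changes the answers, which is the sense in which the choice of categories is a design decision.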

An analogy your child will recognise

Post office sorting

A post office sorter puts each letter into one city's bag. But what if a letter is addressed to someone who has two homes — one in Chennai, one in Delhi? You have to pick one bag, or create a new rule for 'dual destination' letters. Topic classifiers face exactly this problem with multi-topic content.

Mela stall organisation

At a mela, stalls are organised by type: food, games, crafts. But a stall selling handmade food toys (part craft, part food) doesn't fit neatly anywhere. The person running the mela has to decide where to place it. A topic classifier makes the same kind of placement decision, and it always involves some loss of nuance.

Common misconceptions to watch for

  • Misconception: more categories always means a better classifier. In practice, fine-grained categories need considerably more labelled data and increase the chance of confusion between similar classes.
  • Misconception: a topic classifier 'understands' what an article is about. In practice, it recognises patterns of words associated with labels in its training data, which is not the same as comprehension.
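
The gap between word patterns and comprehension can be shown with a toy example. The sketch below assumes a deliberately crude rule (two or more sports words means 'Sports'); the word list and both sentences are made up for illustration.

```python
# Hypothetical list of words that often appear in sports articles.
SPORTS_WORDS = {"match", "score", "team", "win", "goal"}


def looks_like_sports(text):
    """Label a text 'Sports' if it contains at least two sports words."""
    words = set(text.lower().split())
    return len(words & SPORTS_WORDS) >= 2


real = "the team hopes to win the match tonight"
metaphor = "the minister says his team will win the election"

print(looks_like_sports(real))      # True
print(looks_like_sports(metaphor))  # True, although the text is about politics
```

The second sentence is about an election, but it shares words with sports writing, so the rule labels it 'Sports'. Trained classifiers are far more capable than this rule, yet they can fail in a similar way when familiar words appear in an unfamiliar context.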

Key facts in one breath

  • Topic classification is one of the oldest NLP tasks, with early systems dating to the 1960s using simple keyword matching.
  • Modern classifiers fine-tune large pre-trained models on domain-specific labelled data rather than building from scratch.
  • Zero-shot classifiers can assign topics they were never explicitly trained on, by leveraging embedding similarity to topic descriptions.
  • The choice of label taxonomy (the set of categories) is a sociotechnical decision — it encodes assumptions about how the world should be organised.
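
The zero-shot idea, comparing a text with a description of each topic, can be sketched as follows. One assumption is worth stating loudly: real zero-shot classifiers compare learned embeddings, whereas this sketch substitutes simple word counts so that it runs with the Python standard library alone. The topic descriptions and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical topic descriptions. No labelled example articles are used.
TOPIC_DESCRIPTIONS = {
    "Sports": "games matches players teams cricket football score",
    "Weather": "rain temperature monsoon forecast storm heat",
    "Health": "doctor hospital medicine disease vaccine exercise",
}


def bag_of_words(text):
    """Stand-in for an embedding: count how often each word appears."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def zero_shot_topic(text):
    """Pick the topic whose description is most similar to the text."""
    text_vector = bag_of_words(text)
    scores = {topic: cosine_similarity(text_vector, bag_of_words(description))
              for topic, description in TOPIC_DESCRIPTIONS.items()}
    return max(scores, key=scores.get), scores


best, scores = zero_shot_topic("heavy rain and storm warning in the monsoon forecast")
print(best)  # Weather
```

Adding a new topic only requires writing a new description, not collecting new training examples. With word counts the match depends on exact shared words; embeddings are used in practice because they can also match texts that use different words with similar meaning.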

How Dhee Learning teaches this — the 3-stage question loop

Every Dhee Learning session for this concept follows three stages. We share the questions Dhee actually asks, so you can hear what a session sounds like.

Stage 1 — Surface

A news app wants to tag every article automatically as 'Sports', 'Politics', 'Technology', or 'Entertainment'. What happens when an article is about a cricket player who becomes a politician and launches a sports app?

Rote answer

"A topic classifier puts text into categories."

Understood

"That article fits all three categories at once, which breaks a system that forces one label. You'd either need multiple labels per article, or you'd have to accept that whichever single label you pick, you're losing important information."

Stage 2 — Reasoning

Two classifiers are trained on the same news dataset. Classifier A has 5 topic categories; Classifier B has 50. What are the trade-offs — when would you prefer A, and when would you prefer B?

Follow-up Dhee may use: What if two of the 50 categories are nearly identical — like 'Cricket' and 'IPL'? What problem does that create for the model?
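
One way to see why near-identical categories cause trouble is to measure how much their typical vocabulary overlaps. The sketch below is an illustration under stated assumptions: the keyword sets, the 0.4 threshold, and the function names are all hypothetical, and Jaccard overlap is just one simple measure chosen for the example.

```python
from itertools import combinations

# Hypothetical keyword sets for three of the categories.
CATEGORY_KEYWORDS = {
    "Cricket": {"bat", "ball", "wicket", "over", "innings", "umpire"},
    "IPL": {"bat", "ball", "wicket", "over", "auction", "franchise"},
    "Politics": {"election", "vote", "minister", "party", "parliament", "bill"},
}


def jaccard(a, b):
    """Shared words divided by all words used by either category."""
    return len(a & b) / len(a | b)


def confusable_pairs(categories, threshold=0.4):
    """List category pairs whose vocabularies overlap heavily."""
    pairs = []
    for (name_a, words_a), (name_b, words_b) in combinations(categories.items(), 2):
        overlap = jaccard(words_a, words_b)
        if overlap >= threshold:
            pairs.append((name_a, name_b, round(overlap, 2)))
    return pairs


print(confusable_pairs(CATEGORY_KEYWORDS))  # [('Cricket', 'IPL', 0.5)]
```

'Cricket' and 'IPL' share half of their combined vocabulary, so many articles would look equally at home in either category. That makes it hard for a model, and for the people labelling the training data, to tell the two apart.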

Stage 3 — Application

You're building a classifier to route student questions to the right subject teacher in a school chatbot. List the three hardest design decisions you face before collecting any data.

Misconception Dhee watches for: Assuming the category list is obvious and fixed — in practice, defining categories is where most real-world classification projects spend the most time.

Want your child to actually understand this?

Dhee turns this concept into a 15-minute spoken session — asking, listening, and probing — so your child builds the idea themselves.

Frequently asked questions

What is building a topic classifier? (Explained for kids)

A topic classifier is how AI sorts text into predefined categories. It is one of the oldest and most useful NLP tasks, and this explanation is pitched at Class 7.

What's the most common mistake children make about this concept?

Believing that more categories always means a better classifier. In practice, fine-grained categories need considerably more labelled data and increase the chance of confusion between similar classes.

How does Dhee Learning teach this in a Class 7 session?

Dhee opens with a question — for example: "A news app wants to tag every article automatically as 'Sports', 'Politics', 'Technology', or 'Entertainment'. What happens when an article is about a cricket player who becomes a politician and launches a sports app?" — listens to your child's answer, then probes the reasoning behind it. The session ends when the child can apply the idea to a brand-new situation, not just recall it.