Class 7 · CBSE AI · Strand A — Systems Thinking
How content moderation AI works — false positives and negatives
Moderation systems must balance two errors: blocking legitimate content and letting harmful content through. This is the hard trade-off. For Class 7.
School discipline system
A school that suspends every student who uses a word on its 'banned list' — without any human judgement — will punish students discussing literature, science terms, or reporting bullying they experienced. The rule is a blunt instrument where context requires nuance. Content moderation AI faces exactly this problem at millions of posts per hour.
Customs officer at an airport
A customs officer uses rules to flag suspicious bags. If the rules are too strict, every family with a home-cooked lunch gets stopped. If too loose, prohibited items walk through. The officer uses judgement to balance the two errors — and has an escalation path to a senior officer for hard cases. Content moderation needs the same layered structure: rules, judgement, escalation.
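The layered structure of rules, judgement, and escalation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `route`, the score thresholds, and the idea that the AI returns a harm score between 0 and 1 are all hypothetical, not a description of any real platform.

```python
def route(ai_score, rule_flagged, block_above=0.9, allow_below=0.3):
    """Decide what happens to one post.

    ai_score     -- hypothetical harm score from the AI, from 0.0 (safe) to 1.0 (harmful)
    rule_flagged -- True if a simple rule (for example a banned-word list) matched
    """
    # Layer 1 (rules) plus layer 2 (judgement): nothing flagged and the AI is
    # confident the post is safe, so it goes through.
    if not rule_flagged and ai_score < allow_below:
        return "allow"
    # Layer 2 (judgement): the AI is confident the post is harmful.
    if ai_score >= block_above:
        return "block"
    # Layer 3 (escalation): the rule and the AI disagree, or the AI is unsure,
    # so a human reviewer decides.
    return "human_review"


print(route(0.10, rule_flagged=False))  # allow
print(route(0.95, rule_flagged=True))   # block
print(route(0.10, rule_flagged=True))   # human_review
```

In this sketch, a post that matches the banned list but scores low (such as a student reporting bullying) is sent to a person instead of being blocked automatically, which is the escalation path the customs officer has.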
Every Dhee Learning session for this concept follows three stages. We share the questions Dhee actually asks, so you can hear what a session sounds like.
Stage 1 — Surface
If you were in charge of moderating comments on a news website, and you had to use an AI to help — what are two types of mistakes the AI could make, and which mistake would you worry about more?
Rote answer
"It might block good comments or allow bad ones."
Understood
"A false positive blocks legitimate comments — maybe a medical discussion that uses clinical terms the AI flags as inappropriate. A false negative allows genuinely harmful content through. Both are costly: false positives silence people; false negatives cause harm. Which is worse depends on the platform's values and context — there's no universal answer, and choosing a threshold is itself a values decision."
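The point that choosing a threshold is a values decision can be made concrete with a small Python sketch. The scores and labels below are invented toy data, and `count_errors` is a hypothetical helper written for this illustration. It assumes the AI gives each comment a harm score and blocks anything at or above the threshold.

```python
# Hypothetical toy data: (harm score from the AI, is the comment truly harmful?).
# These numbers are invented for illustration only.
comments = [
    (0.95, True),
    (0.80, True),
    (0.70, False),  # for example, a medical discussion using clinical terms
    (0.55, True),
    (0.40, False),
    (0.20, False),
]


def count_errors(scored, threshold):
    """Block every comment scoring at or above the threshold, then count both mistakes."""
    # False positive: a harmless comment that was blocked.
    false_positives = sum(1 for score, harmful in scored
                          if score >= threshold and not harmful)
    # False negative: a harmful comment that was allowed through.
    false_negatives = sum(1 for score, harmful in scored
                          if score < threshold and harmful)
    return false_positives, false_negatives


for threshold in (0.3, 0.6, 0.9):
    fp, fn = count_errors(comments, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data, the strict threshold of 0.3 gives 2 false positives and 0 false negatives, 0.6 gives 1 of each, and the loose threshold of 0.9 gives 0 false positives and 2 false negatives. Lowering one kind of error raises the other, so someone has to decide which mistake matters more.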
Stage 2 — Reasoning
An AI content moderator is trained on text from English-language social media to detect hate speech. It is then deployed on a platform used heavily by people writing in Hinglish (Hindi + English mix) and regional languages. What goes wrong and why?
Follow-up Dhee may use: Who has the power to fix this problem, and who bears the harm while the problem persists?
Stage 3 — Application
You are designing a content moderation system for a children's learning platform in India used in three languages. Design the system architecture: specify the AI component, the human review component, the appeals process, and identify two specific edge cases where the system is most likely to fail and what you would do about each.
Misconception Dhee watches for: The child designs a system with no appeals process. Appeals are essential in any content moderation system because both AI and human reviewers regularly make errors.
Dhee turns this concept into a 15-minute spoken session — asking, listening, and probing — so your child builds the idea themselves.
Common misconception: a more accurate AI will eventually solve the content moderation problem. In reality, context, culture, and language complexity create a permanent floor of difficulty.
Dhee opens with a question — for example: "If you were in charge of moderating comments on a news website, and you had to use an AI to help — what are two types of mistakes the AI could make, and which mistake would you worry about more?" — listens to your child's answer, then probes the reasoning behind it. The session ends when the child can apply the idea to a brand-new situation, not just recall it.