Moderation Certification Course

Types of moderation rules

This lesson explains what moderation rules are and how they work to protect online communities. It covers AI-powered text, image, and video rules, semantic filters, blocklists, regex filters, and domain controls. By combining these tools, platforms can detect harmful content, reduce false positives, and create safer, more engaging spaces for users.

What Are Moderation Rules?

Within a policy, rules define how the system detects harmful content and what action to take when it finds something. Stream supports a variety of rule types, each designed for different kinds of risks. By combining these rules, you can build a moderation framework tailored to your platform’s needs.

AI-Powered Rules

AI Text

Stream’s AI text rules analyze text content for harmful language. Unlike keyword filters, they use context, tone, intent, and prior messages to tell the difference between harmless banter and real harassment.

They can detect subtle issues that blocklists often miss, such as grooming attempts, self-harm signals, or coded hate speech. The options for defining harms are virtually endless, since they can be written in plain language to reflect your community’s standards.

Once a harm is detected, the system applies an action. You can choose from flag, block, shadowblock, bounce and flag, bounce and block, or use a custom severity scale to fine-tune how the system responds.

We will dive into how to set these harms and what the actions mean in a later lesson.

AI Image Rules

Stream’s AI image rules analyze uploaded images for unsafe or inappropriate visuals. They go beyond simple metadata checks, scanning the content itself for issues like nudity, alcohol, or graphic violence.

This makes them essential for platforms with user-uploaded media, such as avatars, profile pictures, images shared in messages, memes, or community galleries, where harmful images could undermine trust and safety.

You can protect against the following harms:

  • Explicit
  • Non-Explicit Nudity of Intimate Parts and Kissing
  • Swimwear or Underwear
  • Violence
  • Visually Disturbing
  • Drugs & Tobacco
  • Alcohol
  • Rude Gestures
  • Gambling
  • Hate Symbols
  • Personally Identifiable Information
  • QR Code

When a harmful image is detected, the system can take actions such as flag, block, shadowblock, or no action.

You can additionally set a confidence score, which we will dive into later in the course.
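
To make the role of a confidence score concrete before we get there, here is a minimal, hypothetical sketch of thresholding per-label scores. The detection shape, the labels-as-keys layout, and the threshold values are all assumptions for illustration, not Stream’s API.

```typescript
// Hypothetical detector output: one confidence score (0-100) per harm label.
// This models the thresholding idea only; it is not Stream's actual API.
interface ImageDetection {
  label: string;      // e.g. "Explicit", "Hate Symbols"
  confidence: number; // 0-100; higher means the model is more certain
}

type Action = "flag" | "block" | "shadowblock" | "no_action";

// Illustrative per-label thresholds -- tune these to your own risk tolerance.
const imageRules: Record<string, { minConfidence: number; action: Action }> = {
  "Explicit": { minConfidence: 60, action: "block" },
  "Hate Symbols": { minConfidence: 50, action: "block" },
  "Alcohol": { minConfidence: 80, action: "flag" },
};

function decideImageAction(detections: ImageDetection[]): Action {
  for (const d of detections) {
    const rule = imageRules[d.label];
    // Only act when the model's confidence clears the configured bar.
    if (rule && d.confidence >= rule.minConfidence) return rule.action;
  }
  return "no_action";
}

// A "Hate Symbols" hit at 72 clears its threshold of 50 -> "block".
console.log(decideImageAction([{ label: "Hate Symbols", confidence: 72 }]));
```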

In addition to visual detection, Stream offers OCR (Optical Character Recognition) as an add-on feature. OCR lets the system extract and analyze text embedded inside images (for example, offensive words hidden in memes, screenshots, or user-generated graphics) and apply your moderation rules to that text as well.
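
Conceptually, OCR moderation is a two-step pipeline: extract the embedded text, then run it through the same rules as ordinary messages. Here is a rough sketch of that flow using the open-source tesseract.js library; the `moderateText` stub is a hypothetical stand-in for your real text-rule check.

```typescript
import Tesseract from "tesseract.js";

// Hypothetical stand-in for your real AI text / blocklist check.
async function moderateText(text: string): Promise<"ok" | "flag"> {
  const banned = ["fr33 m0n3y", "buy followers"]; // illustrative terms only
  const lower = text.toLowerCase();
  return banned.some((term) => lower.includes(term)) ? "flag" : "ok";
}

async function moderateImageText(imagePath: string): Promise<"ok" | "flag"> {
  // Step 1: pull any embedded text out of the image.
  const { data } = await Tesseract.recognize(imagePath, "eng");

  // Step 2: apply the same moderation rules used for ordinary messages.
  const verdict = await moderateText(data.text);
  console.log(`OCR text: "${data.text.trim()}" -> ${verdict}`);
  return verdict;
}
```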

AI Video Rules

Stream’s AI video rules analyze video uploads for unsafe or inappropriate visuals. Like image moderation, the system scans the actual video frames, not just the metadata, to detect harms.

This capability is especially important for platforms that rely on short-form or user-generated video content, where harmful material can spread quickly and undermine community trust.

You can protect against harms such as:

  • Explicit sexual content
  • Non-explicit nudity (intimate parts, kissing, suggestive activity)
  • Violence and assault
  • Visually disturbing content (blood, gore, graphic injury)
  • Drugs, alcohol, or tobacco use
  • Rude gestures
  • Gambling
  • Hate symbols

Each detection includes a confidence score, just as with the AI Image rules.

When harmful video content is detected, the system can take actions such as flag, block, shadowblock, or no action, depending on the policies you’ve defined.
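
One way to picture frame-level scanning: each sampled frame yields labeled detections, and the video-level verdict keeps the strongest hit per label. The sketch below is a hypothetical aggregation for illustration, not Stream’s internals.

```typescript
// Hypothetical per-frame detection produced by sampling the video.
interface FrameDetection {
  timestampSec: number; // where in the video the frame was sampled
  label: string;        // e.g. "Violence and assault"
  confidence: number;   // 0-100
}

// Keep the highest-confidence detection per label across all frames.
function aggregateFrames(frames: FrameDetection[]): Map<string, FrameDetection> {
  const best = new Map<string, FrameDetection>();
  for (const f of frames) {
    const seen = best.get(f.label);
    if (!seen || f.confidence > seen.confidence) best.set(f.label, f);
  }
  return best;
}

const hits = aggregateFrames([
  { timestampSec: 3.2, label: "Violence and assault", confidence: 41 },
  { timestampSec: 7.8, label: "Violence and assault", confidence: 88 },
]);
// "Violence and assault" peaks at 88 near 7.8s -> apply your configured action.
console.log(hits.get("Violence and assault"));
```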

AI Semantic Filters

Semantic filters allow you to go beyond keywords or blocklists by writing plain language rules that the AI interprets for meaning. Instead of matching exact terms, the filter looks at context and intent, enabling detection of subtle harms that traditional filters miss.

For example:

  • “I’m going to kill you” would be flagged as a violent threat, while “This movie kills me” would not.
  • “Let’s meet up after school” might be flagged as grooming if said in a risky context, while “Let’s meet after work” would not.

This contextual awareness helps reduce false positives while catching emerging risks like slang, sarcasm, or coded language.

When a violation is detected, the system can take actions such as flag, block, shadowblock, bounce and flag, or bounce and block.
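
Conceptually, a semantic filter is just a plain-language definition paired with an action; the AI interprets the definition rather than matching strings. The rule shape below is hypothetical, for illustration only.

```typescript
// Illustrative shape for a plain-language semantic rule -- not Stream's API.
interface SemanticRule {
  name: string;
  definition: string; // written in plain English; the model judges intent
  action: "flag" | "block" | "shadowblock" | "bounce_and_flag" | "bounce_and_block";
}

const violentThreats: SemanticRule = {
  name: "violent-threats",
  definition:
    "Messages expressing genuine intent to physically harm another person. " +
    "Figurative uses such as 'this movie kills me' are not violations.",
  action: "bounce_and_block",
};
```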

Blocklists & Regex Filters

Blocklists and regex filters let you catch specific patterns or terms that AI might miss, especially useful for compliance, spam enforcement, or evasion detection.

Blocklist Rules

Blocklists are simple, precise filters for exact words or phrases. Stream includes a pre-built list (profanity_en_2020_v1) with over a thousand common profanities, but you can easily create custom lists.

To set one up:

  1. Go to the Blocklists & Regex Filters section in your policy and click Add New.
  2. Choose the Word filter type.
  3. Enter your list items (e.g., banned terms).
  4. Select the action to take when content matches: Flag, Block, ShadowBlock, Mask and Flag, Bounce and Flag, Bounce and Block.
  5. Save to activate the blocklist.

Notably, these filters use exact, case-insensitive matching. For example, adding dogs, house, and woman will match occurrences like “Dogs,” “house,” or “woman,” even when surrounded by punctuation, but will not catch substrings (e.g., "house" in "lighthouse").
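
That behavior amounts to whole-word, case-insensitive comparison. Here is a small approximation of the idea in code; it mimics the documented behavior, not Stream’s actual matcher.

```typescript
// Approximate the documented behavior: exact, case-insensitive, whole-word
// matches that ignore surrounding punctuation but never match substrings.
function matchesBlocklist(message: string, blocklist: string[]): boolean {
  // Split on anything that is not a letter, digit, or apostrophe.
  const words = message.toLowerCase().split(/[^a-z0-9']+/);
  const banned = new Set(blocklist.map((w) => w.toLowerCase()));
  return words.some((w) => banned.has(w));
}

const list = ["dogs", "house", "woman"];
console.log(matchesBlocklist("Dogs are great!", list));    // true  ("Dogs" matches)
console.log(matchesBlocklist("See the lighthouse.", list)); // false (substring only)
```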

Regex Filters

Regex filters allow pattern-based detection for more flexibility. They are useful for catching variations, obfuscated content (like “fr33 m0n3y”), or spammy structures such as URLs and phone numbers; a few illustrative patterns follow the tips below.

Regex tips to keep in mind:

  • Prioritize clarity: complex, cryptic expressions are harder to manage.
  • Use anchors (^, $) for performance optimization and accuracy.
  • Beware of overly broad patterns; balance specificity and generality to reduce false positives.
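
The patterns below illustrate the kinds of expressions described above. Treat them as starting points to adapt and test against your own traffic, not production-ready filters.

```typescript
// Illustrative regex filters -- adapt and test before relying on them.

// Leetspeak variants of "free money", e.g. "fr33 m0n3y" or "Free M0ney".
const freeMoney = /\bfr[e3][e3]\s+m[o0]n[e3]y\b/i;

// Bare http/https URLs anywhere in a message (no anchors on purpose).
const url = /https?:\/\/[^\s]+/i;

// North American style phone numbers such as 555-123-4567 or (555) 123 4567.
const phone = /\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}/;

console.log(freeMoney.test("claim your fr33 m0n3y now"));      // true
console.log(url.test("visit https://spamdomain.com/promo"));   // true
console.log(phone.test("call me at (555) 123-4567"));          // true
```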

Domain Blocklists & Allowlists

In addition to word and regex filters, Stream lets you control links by creating domain allowlists and domain blocklists. These rules ensure that only safe domains are allowed in your community, while harmful or unwanted ones are blocked.

Domain Blocklists

A domain blocklist is a list of websites that should never appear on your platform. If a message contains a URL from a blocked domain, the system applies your configured action (e.g., block or flag).

  • Best for:
    • Known spam or phishing domains
    • Adult or gambling websites
    • Competitor or disallowed commercial links
  • Example: Blocking spamdomain.com means that https://spamdomain.com/promo will automatically be flagged.

Domain Allowlists

A domain allowlist works the opposite way. Instead of blocking specific domains, you create a list of approved websites. Any link outside of the allowlist will be blocked or flagged automatically.

  • Best for:
    • Restricting content sharing to a trusted set of domains (e.g., youtube.com, yourcompany.com)
    • Preventing all other external links by default
  • Example: Allowlisting youtube.com and vimeo.com means users can share those links, but a link to any other site will be flagged.
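
Both list types reduce to the same mechanical check: extract the host from each link in a message and compare it against a list, with the allowlist simply inverting the test. Here is a minimal sketch using the standard URL API; it illustrates the concept, not Stream’s implementation.

```typescript
// Pull every http(s) link out of a message and return its hostname.
function extractHosts(message: string): string[] {
  const urls = message.match(/https?:\/\/[^\s]+/gi) ?? [];
  return urls.map((u) => new URL(u).hostname.replace(/^www\./, ""));
}

// Blocklist mode: any link whose host is on the list is a violation.
function violatesBlocklist(message: string, blocked: string[]): boolean {
  return extractHosts(message).some((h) => blocked.includes(h));
}

// Allowlist mode: any link whose host is NOT on the list is a violation.
function violatesAllowlist(message: string, allowed: string[]): boolean {
  return extractHosts(message).some((h) => !allowed.includes(h));
}

console.log(violatesBlocklist("see https://spamdomain.com/promo", ["spamdomain.com"])); // true
console.log(violatesAllowlist("watch https://youtube.com/w/abc", ["youtube.com", "vimeo.com"])); // false
```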

Combining Rules for Stronger Policies

A strong policy usually mixes these rule types:

  • AI Text, Image, and Video for nuanced detection across all content formats.
  • Semantic Filters to catch intent and reduce false positives.
  • Blocklists for hard “do not allow” terms.
  • Regex Filters for spammy or evasive patterns.
  • Domain Blocklists & Allowlists to control which links users can share.

This layered approach ensures broad coverage without overwhelming moderators with noise.

Moderation rules are the building blocks of policies. By combining AI detection (across text, images, video, and semantics) with blocklists, regex filters, and domain controls, you create a system that balances speed, context, and precision.

Next, we’ll look at best practices for organizing policies and rules so they scale effectively as your community grows.