Rules

Moderation rules allow you to define automated actions that trigger when certain conditions are met during content moderation. Rules can be scoped to specific configs or applied app-wide. They support conditions based on harm labels, severity levels, and other criteria, with configurable actions and cooldown periods.

Upsert Rule

Create or update a moderation rule. This endpoint uses POST /api/v2/moderation/moderation_rule and allows you to define the rule name, conditions, actions, and other configuration in a single request.

await client.moderation.upsertModerationRule({
  name: "ban_on_severe_harassment",
  config_keys: ["chat:messaging"],
  action: {
    type: "ban",
    ban: { timeout: 1440, reason: "Severe harassment" },
  },
  conditions: [{ label: "HARASSMENT", severity: "critical" }],
  enabled: true,
  description: "Auto-ban users for critical harassment",
});

Request Parameters

| key | required | type | description |
| --- | --- | --- | --- |
| name | true | string | Unique name for the rule. |
| config_keys | false | array | List of config keys this rule applies to. If empty, applies to all configs. |
| action | false | object | The action to take when conditions are met. |
| conditions | false | array | List of conditions that trigger the rule (label, severity, etc.). |
| groups | false | array | Condition groups for complex logic. |
| enabled | false | boolean | Whether the rule is active. |
| description | false | string | Human-readable description of the rule. |
| cooldown_period | false | string | Cooldown period between rule triggers (e.g., "1h", "24h"). |
| team | false | string | Team identifier for multi-tenancy. |

Response

| key | type | description |
| --- | --- | --- |
| rule | object | The created or updated moderation rule. |

Get Rule

Retrieve a single moderation rule by its ID. This endpoint uses GET /api/v2/moderation/moderation_rule/{id} and returns the full rule object including its conditions, actions, and metadata.

await client.moderation.getModerationRule({ id: "rule_id" });

Request Parameters

| key | required | type | description |
| --- | --- | --- | --- |
| id | true | string | The unique identifier of the moderation rule. |

Response

| key | type | description |
| --- | --- | --- |
| rule | object | The moderation rule. |

Delete Rule

Delete a moderation rule by its ID. This endpoint uses DELETE /api/v2/moderation/moderation_rule/{id} and permanently removes the rule from the system.

await client.moderation.deleteModerationRule({ id: "rule_id" });

Request Parameters

| key | required | type | description |
| --- | --- | --- | --- |
| id | true | string | The unique identifier of the moderation rule to delete. |

Response

| key | type | description |
| --- | --- | --- |
| duration | string | Request duration. |

Query Rules

Query moderation rules with optional filters, sorting, and pagination. This endpoint uses POST /api/v2/moderation/moderation_rules and supports standard query patterns for listing and searching rules.

await client.moderation.queryModerationRules({
  filter: { enabled: true },
  sort: [{ field: "created_at", direction: -1 }],
  limit: 10,
});

Request Parameters

| key | required | type | description |
| --- | --- | --- | --- |
| filter | false | object | Filter conditions. |
| sort | false | array | Sort parameters. |
| limit | false | number | Maximum rules to return. |
| next | false | string | Cursor for pagination. |

Response

| key | type | description |
| --- | --- | --- |
| rules | array | List of moderation rules. |
| next | string | Next cursor for pagination. |
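
Because the response carries a next cursor, listing every rule takes a loop that feeds each returned cursor into the following request. The sketch below assumes the client resolves to an object with rules and next as described above, and that a missing or empty next marks the last page. The RuleQueryClient interface, the trimmed-down ModerationRule shape, and the fetchAllRules helper are hypothetical names used for illustration, not part of the SDK.

```typescript
// Hypothetical minimal shapes; the real SDK types may differ.
interface ModerationRule {
  id: string;
  name: string;
  enabled: boolean;
}

interface QueryRulesRequest {
  filter?: Record<string, unknown>;
  limit?: number;
  next?: string;
}

interface QueryRulesResponse {
  rules: ModerationRule[];
  next?: string;
}

// Anything exposing a queryModerationRules method with this shape will do.
interface RuleQueryClient {
  queryModerationRules(req: QueryRulesRequest): Promise<QueryRulesResponse>;
}

// Collects all pages by passing each returned cursor into the next request.
// Assumption: an absent or empty `next` marks the final page.
async function fetchAllRules(
  client: RuleQueryClient,
  filter: Record<string, unknown> = {},
  pageSize = 10,
): Promise<ModerationRule[]> {
  const all: ModerationRule[] = [];
  let cursor: string | undefined = undefined;
  do {
    const page: QueryRulesResponse = await client.queryModerationRules({
      filter,
      limit: pageSize,
      next: cursor,
    });
    all.push(...page.rules);
    cursor = page.next || undefined;
  } while (cursor);
  return all;
}
```

Something like `await fetchAllRules(client.moderation, { enabled: true })` would then return every enabled rule, provided client.moderation matches the assumed interface.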

Rule Actions Reference

The following table lists all available action types that can be assigned to a moderation rule. Each action defines what happens when the rule conditions are met.

| action type | description |
| --- | --- |
| flag | Flag the content for manual review. |
| remove | Automatically remove the content. |
| bounce | Bounce the message back to the sender (chat only). |
| ban | Ban the content creator. |
| shadow_block | Shadow-block the content. |
| custom | Trigger a custom webhook. |

Global vs Config-Specific Rules

Rules can be scoped to apply either globally across your entire application or only to specific moderation configurations.

Global Rules

Global rules apply to all content in your application that reaches the moderation system, regardless of which moderation configuration handled it.

{
  "name": "Global Spam Detection",
  "description": "Applies to all configurations",
  "team": "moderation",
  "config_keys": [], // Empty array = global rule
  "id": "global-spam-detection",
  "rule_type": "user",
  "enabled": true,
  "cooldown_period": "24h",
  "conditions": [
    // ... conditions
  ],
  "logic": "AND",
  "action": {
    // ... action definition
  }
}

Config-Specific Rules

Config-specific rules only apply to the moderation configurations you specify. List the configuration keys in the config_keys array.

For example, to restrict a rule to chat, set config_keys to ["chat:messaging", "chat:support"]. Counters are then incremented only for messages sent in those channel types.

{
  "name": "Chat-Only Rule",
  "description": "Only applies to chat configurations",
  "team": "moderation",
  "config_keys": ["chat:messaging", "chat:support"], // Specific configs
  "id": "chat-only-rule",
  "rule_type": "user",
  "enabled": true,
  "cooldown_period": "12h",
  "conditions": [
    // ... conditions
  ],
  "logic": "AND",
  "action": {
    // ... action definition
  }
}

Use Cases:

  • Global Rules: Account-level violations, severe content policies, cross-platform spam detection
  • Config-Specific Rules: Channel-specific rules, different content standards per product area
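
The scoping behavior can be expressed as a small predicate. This is a minimal sketch of the described semantics, not the service's implementation: ScopedRule and ruleAppliesTo are hypothetical names, "feeds:default" is an invented config key, and treating a missing config_keys field like an empty array is an assumption.

```typescript
// Hypothetical subset of a rule; only the field needed for scoping.
interface ScopedRule {
  name: string;
  config_keys?: string[];
}

// Returns true when the rule should be evaluated for content under configKey.
// An empty (or, by assumption, missing) config_keys array means a global rule.
function ruleAppliesTo(rule: ScopedRule, configKey: string): boolean {
  const keys = rule.config_keys ?? [];
  return keys.length === 0 || keys.includes(configKey);
}

const globalRule: ScopedRule = { name: "Global Spam Detection", config_keys: [] };
const chatRule: ScopedRule = {
  name: "Chat-Only Rule",
  config_keys: ["chat:messaging", "chat:support"],
};

console.log(ruleAppliesTo(globalRule, "feeds:default")); // true
console.log(ruleAppliesTo(chatRule, "chat:support")); // true
console.log(ruleAppliesTo(chatRule, "feeds:default")); // false
```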

Basic Rule Structure

Every rule has three main parts:

  1. Conditions: What behavior to watch for
  2. Threshold: How many violations before taking action
  3. Action: What to do when the threshold is reached

Example: Complete Rule Structure

{
  "name": "Spam Detection",
  "description": "Detects and bans users for spam behavior",
  "team": "moderation",
  "config_keys": ["chat:messaging", "chat:support"],
  "id": "spam-detection",
  "rule_type": "user",
  "enabled": true,
  "cooldown_period": "24h",
  "conditions": [
    {
      "type": "text_rule",
      "text_rule_params": {
        "threshold": 5,
        "time_window": "1h",
        "llm_harm_labels": {
          "SCAM": "Fraudulent content, phishing attempts, or deceptive practices",
          "PLATFORM_BYPASS": "Content that attempts to circumvent platform moderation systems"
        }
      }
    },
    {
      "type": "content_count_rule",
      "content_count_rule_params": {
        "threshold": 50,
        "time_window": "1h"
      }
    }
  ],
  "logic": "OR",
  "action": {
    "type": "ban_user",
    "ban_options": {
      "duration": 3600,
      "reason": "Spam behavior detected",
      "shadow_ban": false,
      "ip_ban": false
    }
  }
}

This rule:

  • Watches for content labeled SCAM or PLATFORM_BYPASS
  • Triggers when a user posts 5+ such messages within 1 hour or 50+ messages of any kind within 1 hour (the two conditions are combined with OR)
  • Bans the user for 1 hour when triggered
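
To make the interaction between thresholds, time windows, and the logic field concrete, here is a rough sketch of how such a rule could be evaluated. It is an illustration under stated assumptions, not the service's algorithm: CountCondition, countInWindow, and ruleTriggers are hypothetical names, windows are given in milliseconds, the window boundary is treated as inclusive, and a rule without conditions never triggers.

```typescript
type Logic = "AND" | "OR";

// Hypothetical simplified condition: a threshold over a window in milliseconds.
interface CountCondition {
  threshold: number;
  windowMs: number;
}

// Counts timestamps (ms since epoch) that fall inside the window ending at `now`.
function countInWindow(timestamps: number[], windowMs: number, now: number): number {
  return timestamps.filter((t) => t <= now && now - t <= windowMs).length;
}

// Evaluates each condition against its own event list and combines the results.
function ruleTriggers(
  conditions: CountCondition[],
  eventsPerCondition: number[][],
  logic: Logic,
  now: number,
): boolean {
  if (conditions.length === 0) return false; // assumption: no conditions, no trigger
  const results = conditions.map(
    (c, i) => countInWindow(eventsPerCondition[i] ?? [], c.windowMs, now) >= c.threshold,
  );
  return logic === "AND" ? results.every(Boolean) : results.some(Boolean);
}

const HOUR = 60 * 60 * 1000;
const now = 10 * HOUR;
// Five flagged messages and twelve messages overall, all within the last hour.
const flagged = [1, 2, 3, 4, 5].map((m) => now - m * 60 * 1000);
const allMessages = Array.from({ length: 12 }, (_, i) => now - i * 60 * 1000);
const conditions: CountCondition[] = [
  { threshold: 5, windowMs: HOUR },
  { threshold: 50, windowMs: HOUR },
];

console.log(ruleTriggers(conditions, [flagged, allMessages], "OR", now)); // true
console.log(ruleTriggers(conditions, [flagged, allMessages], "AND", now)); // false
```

With OR, reaching the flagged-message threshold alone is enough; with AND, the same user would also need to reach the 50-message threshold.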

Key Differences

| Aspect | User-Type Rules | Content-Type Rules |
| --- | --- | --- |
| Evaluation Timing | Track over time, trigger when threshold reached | Evaluate immediately per content piece |
| Threshold | Required (e.g., 3 violations in 24h) | Not applicable (immediate evaluation) |
| Time Window | Required (e.g., "24h", "7d") | Not applicable |
| Use Case | Pattern detection, repeated violations | Immediate content filtering |
| Actions | User actions (ban_user, flag user) | Content actions (flag content, block_content) |
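
The threshold and time-window requirements in this comparison lend themselves to a pre-flight check before a rule is submitted. The following is only a sketch: RuleSketch, ConditionSketch, and checkRuleShape are hypothetical, the condition fields are flattened for brevity (the rule examples in this document nest them under *_params objects), and "content" as a rule_type value is an assumption, since only "user" appears in the examples.

```typescript
// Hypothetical flattened shapes; real rules nest these fields under *_params objects.
type RuleType = "user" | "content"; // "content" is an assumed value

interface ConditionSketch {
  type: string;
  threshold?: number;
  time_window?: string;
}

interface RuleSketch {
  rule_type: RuleType;
  conditions: ConditionSketch[];
}

// Returns human-readable problems; an empty array means the rule looks consistent.
function checkRuleShape(rule: RuleSketch): string[] {
  const problems: string[] = [];
  rule.conditions.forEach((c, i) => {
    const hasThreshold = c.threshold !== undefined;
    const hasWindow = c.time_window !== undefined;
    if (rule.rule_type === "user") {
      // User-type rules track behavior over time, so both fields are needed.
      if (!hasThreshold) problems.push(`condition ${i} (${c.type}): threshold is required`);
      if (!hasWindow) problems.push(`condition ${i} (${c.type}): time_window is required`);
    } else if (hasThreshold || hasWindow) {
      // Content-type rules evaluate each content piece immediately.
      problems.push(`condition ${i} (${c.type}): threshold/time_window do not apply`);
    }
  });
  return problems;
}
```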

Action Selection Guidelines

  • User-Type Rules: Use user actions (ban_user, flag user) when you want to take action against the user account based on their behavior pattern
  • Content-Type Rules: Use content actions (flag content, block_content) when you want to take action against specific content pieces
  • Call-Type Rules: See Call Moderation for call-specific actions and escalation
  • Mixed Rules: You can use any action type, but consider whether you want to affect the user or just the content

Time Windows

Specify how long to track user behavior (only applicable to user-type rules):

  • "30m": 30 minutes
  • "1h": 1 hour
  • "24h": 24 hours
  • "7d": 7 days
  • "30d": 30 days
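
If you need to work with these strings client-side, for example to compare a time window against a cooldown period, a small parser is enough. This sketch assumes only the m, h, and d suffixes shown in the list are used; the API may accept other units, and parseTimeWindow is a hypothetical helper, not an SDK function.

```typescript
const UNIT_MS: Record<string, number> = {
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

// Converts strings such as "30m", "24h", or "7d" to milliseconds.
// Throws on anything that is not a positive integer followed by m, h, or d.
function parseTimeWindow(value: string): number {
  const match = /^(\d+)([mhd])$/.exec(value);
  if (!match) throw new Error(`Unsupported time window: ${value}`);
  const amount = Number(match[1]);
  if (amount <= 0) throw new Error(`Time window must be positive: ${value}`);
  return amount * UNIT_MS[match[2]];
}

console.log(parseTimeWindow("30m")); // 1800000
console.log(parseTimeWindow("7d")); // 604800000
```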

Cooldown Periods

The Rule Builder supports cooldown periods to prevent immediate re-triggering of rules after an action has been taken. This is particularly useful when users are banned and then unbanned by administrators.

When a rule with a cooldown period is triggered and an action is taken (like banning a user), the system records this action with an expiration time. During the cooldown period, the same rule will not trigger again for that user, even if they continue to violate the conditions.

Configuration

Add a cooldown_period field to your rule configuration:

{
  "name": "Spam Detection with Cooldown",
  "description": "Spam detection rule with 24h cooldown",
  "team": "moderation",
  "config_keys": [],
  "id": "spam-detection",
  "rule_type": "user",
  "enabled": true,
  "cooldown_period": "24h",
  "conditions": [
    // ... conditions
  ],
  "action": {
    "type": "ban_user",
    "ban_options": {
      "duration": 3600,
      "reason": "Spam behavior detected",
      "shadow_ban": false,
      "ip_ban": false
    }
  }
}

Example Scenario

  1. User violates rule: User posts 5 spam messages in 1 hour
  2. Rule triggers: User gets banned for 1 hour
  3. Admin unbans user: Administrator manually unbans the user
  4. User posts again: User immediately posts more spam messages
  5. Cooldown active: Rule does not trigger again due to 24-hour cooldown
  6. After cooldown: User can trigger the rule again after 24 hours
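
The scenario can be modeled with a per-user, per-rule expiry record. The sketch below is an illustration of the described behavior, not the service's implementation: CooldownTracker and its methods are hypothetical, state lives in memory only, and treating the exact expiry instant as "cooldown over" is an assumption.

```typescript
// In-memory record of when each (rule, user) pair becomes eligible again.
// A real system would persist this; the Map is only for illustration.
class CooldownTracker {
  private readonly expiresAt = new Map<string, number>();

  // True when the rule may act on this user at time `now` (ms since epoch).
  canTrigger(ruleId: string, userId: string, now: number): boolean {
    const until = this.expiresAt.get(`${ruleId}:${userId}`);
    return until === undefined || now >= until;
  }

  // Records that the rule acted, starting a cooldown of `cooldownMs`.
  recordAction(ruleId: string, userId: string, now: number, cooldownMs: number): void {
    this.expiresAt.set(`${ruleId}:${userId}`, now + cooldownMs);
  }
}

const HOUR_MS = 60 * 60 * 1000;
const tracker = new CooldownTracker();
const t0 = 0;

// Steps 1-2: the rule triggers and the action is recorded with a 24h cooldown.
tracker.recordAction("spam-detection", "user-1", t0, 24 * HOUR_MS);
// Steps 3-5: two hours later the user has been unbanned and spams again.
console.log(tracker.canTrigger("spam-detection", "user-1", t0 + 2 * HOUR_MS)); // false
// Step 6: once 24 hours have passed, the rule can trigger again.
console.log(tracker.canTrigger("spam-detection", "user-1", t0 + 24 * HOUR_MS)); // true
```

Keying the record on both rule and user keeps the cooldown from suppressing other rules, or the same rule for other users.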

Use Cases

  • Post-Ban Protection: Prevent immediate re-banning after manual unbans
  • Graduated Response: Give users time to reflect before facing consequences again
  • Administrative Flexibility: Allow admins to override rules without immediate re-triggering

Best Practices

Start Simple

Begin with basic rules and gradually add complexity as you understand your community's needs.

Set Reasonable Thresholds

  • Too low: May catch legitimate users
  • Too high: May miss problematic behavior
  • Start conservative and adjust based on results

Use Appropriate Time Windows

  • Short windows (1-6 hours): Catch immediate abuse
  • Medium windows (24-48 hours): Catch persistent violators
  • Long windows (7-30 days): Catch chronic offenders

Configure Cooldown Periods

  • Short cooldowns (1-6 hours): For minor violations where users should get another chance quickly
  • Medium cooldowns (24-48 hours): For moderate violations where users need time to reflect
  • Long cooldowns (7-30 days): For serious violations where users need significant time before facing consequences again

Test Your Rules

Use the test mode to verify your rules work as expected before enabling them in production.

Monitor Performance

Watch for rules that trigger too frequently or too rarely, and adjust their thresholds, time windows, or cooldown periods accordingly.