
Image Moderation

An unprecedented amount of visual content is viewed and shared online daily, from benign images of cats and puppies to graphic images unfit for most viewers.

Image moderation is essential for protecting individuals and communities online, whether through human effort or AI moderation tools.

What Is Image Moderation?

Image moderation is a form of content moderation that reviews images to ensure they are free from offensive, harmful, inappropriate, or illegal content.

Most modern forms of image moderation rely heavily on AI, which automatically analyzes the images. This is followed by a review step in which images may be flagged for potential violations or removed entirely.

Image moderation is important for the following reasons:

  • User experience: Image moderation helps create a safe and open environment where users feel comfortable interacting with your brand and other users.
  • Brand reputation: By weeding out potentially offensive visual content, businesses can preserve brand reputation and maintain user trust.
  • Legal protection: Image moderation can protect you from potential legal troubles by keeping user-generated content on your platform compliant with community and regulatory standards.

How Does Image Moderation Work?

Popular apps can have millions of users, making it infeasible for human moderators alone to keep these platforms safe. This is why modern image moderation for large-scale social media and other online communities relies primarily on AI.

Some of the technical approaches developers employ for AI-based image moderation include:

Automated Content Assessment

Automated content assessment uses machine learning (ML) models, trained on large sets of labeled image data, to make reliable predictions about individual pictures.

By human standards this assessment happens in real time, though the actual processing is asynchronous: moderation results are delivered once the system completes its analysis.

Image moderation engines support many categories of content to review for. For each category, they return probabilities called "confidence scores," which determine whether an image is marked as inappropriate content.
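As a rough illustration, here is a minimal sketch of how a platform might act on those confidence scores. The assess_image function and the threshold value are hypothetical placeholders, not any particular vendor's API.

```python
# Minimal sketch of acting on confidence scores from an image moderation engine.
# assess_image is a hypothetical stand-in for whatever engine or API you use.

BLOCK_THRESHOLD = 0.85  # assumed value; tune per category and platform policy

def assess_image(image_bytes: bytes) -> dict[str, float]:
    """Pretend call to a moderation engine; returns category -> confidence score."""
    raise NotImplementedError("Replace with your moderation provider's client call")

def should_block(image_bytes: bytes) -> bool:
    scores = assess_image(image_bytes)
    # Block the image if any category's confidence exceeds the threshold.
    return any(score >= BLOCK_THRESHOLD for score in scores.values())
```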

Image Labeling and Categorization

Image labeling is the AI-driven process of analyzing and classifying images into the categories specified in your system, such as nudity, profanity, crime, or drugs.

Before AI can label an image, it first applies ML computer vision algorithms to analyze the image and detect the individual objects it contains.

For example, if the AI recognizes that an image contains drug paraphernalia, it may be marked as crime, drugs, or both.
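To make this concrete, below is a minimal sketch of rolling detected objects up into moderation labels. The object names and the object-to-category mapping are illustrative assumptions, not a standard taxonomy.

```python
# Illustrative mapping from detected objects to moderation category labels.
# Object names and categories are assumptions for the sake of the example.
OBJECT_TO_CATEGORIES = {
    "syringe": {"drugs", "crime"},
    "pipe": {"drugs"},
    "handgun": {"weapons", "violence"},
    "knife": {"weapons"},
}

def label_image(detected_objects: list[str]) -> set[str]:
    """Collect every category implied by the objects a vision model detected."""
    labels: set[str] = set()
    for obj in detected_objects:
        labels |= OBJECT_TO_CATEGORIES.get(obj, set())
    return labels

# e.g. label_image(["syringe", "pipe"]) -> {"drugs", "crime"}
```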

Optical Character Recognition (OCR)

OCR is a special use case within the broader field of image moderation. It refers to the ability of AI models to recognize and interpret text in images.

OCR is challenging because text is not always clear in photos. It can be small, blurry, handwritten, misspelled, or in a foreign language. All of this complicates automatic text detection.

Nevertheless, AI models greatly aid our ability to screen for and identify problematic text content in pictures. OCR sometimes even outperforms human eyes for detecting small or blurry text.

OCR can also protect users' or brands' personal information, which might sometimes accidentally be present in images.
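As a simple sketch of the idea, the snippet below uses the open-source Tesseract engine (via the pytesseract library) to pull text out of an image and check it against a small blocklist. The blocklist terms are assumptions for illustration; production systems typically pair OCR with a dedicated text moderation model.

```python
# Sketch: extract text from an image with Tesseract OCR, then screen it.
# Requires the tesseract binary plus: pip install pytesseract pillow
from PIL import Image
import pytesseract

# Assumed blocklist purely for illustration.
BLOCKED_TERMS = {"examplebadword", "anotherslur"}

def extract_text(path: str) -> str:
    return pytesseract.image_to_string(Image.open(path))

def contains_blocked_text(path: str) -> bool:
    words = extract_text(path).lower().split()
    return any(word in BLOCKED_TERMS for word in words)
```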

When To Use Image Moderation

Image moderation is relevant in all digital spaces where users generate and share images.

Like other content moderation techniques, a successful image moderation approach requires you to first determine what content will be deemed unacceptable within the context of your platform. Then, you have to come up with a logical system for detecting and assessing that content and ultimately decide how to flag or penalize content that's outside of your set boundaries.

Detection of Unsafe/Explicit Content

Identifying explicit and unsafe content on your platform is one of the key ways to create a feeling of trust and safety for users.

As discussed above, image moderation platforms use labeling to categorize inappropriate content found in photos. Going deeper into this categorization, most systems use tiered labeling, with top-level labels (the general category) that contain multiple second-level (more specific) labels.

Below are some of the most common content types filtered for with image moderation, organized as top-level categories with their second-level labels:

  • Nudity: Graphic male nudity, graphic female nudity, sexual activity, illustrated nudity or sexual activity, and adult toys.
  • Profanity: Explicit sexual language, derogatory language, hate speech, mild profanity, obscene gestures, and religious profanity.
  • Crime: Drug use, drug paraphernalia, theft, vandalism, hate symbols, underage criminal activity, depictions of torture or abuse, and promotion of illegal activities.
  • Violence: Graphic violence or gore, physical violence, weapon violence, weapons, and self-injury.
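One simple way to represent such a tiered taxonomy in code is a nested mapping from top-level categories to second-level labels. The sketch below is an assumed structure for illustration, not any particular vendor's label set.

```python
# Assumed two-tier label taxonomy: top-level category -> second-level labels.
TAXONOMY = {
    "nudity": {"graphic_male_nudity", "graphic_female_nudity", "sexual_activity"},
    "violence": {"graphic_violence", "weapon_violence", "self_injury"},
    "crime": {"drug_use", "drug_paraphernalia", "hate_symbols"},
}

def top_level_categories(second_level_labels: set[str]) -> set[str]:
    """Roll up second-level findings to the top-level categories they belong to."""
    return {
        category
        for category, children in TAXONOMY.items()
        if second_level_labels & children
    }

# e.g. top_level_categories({"drug_paraphernalia"}) -> {"crime"}
```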

Screening and Matching

Screening and matching is a special technique that compares user-submitted images against a predefined database of flagged images. It's particularly useful in scenarios where speed and efficiency are needed.

This technique often uses a unique digital fingerprint (hash) to make quick matches. It's well suited for typical moderation tasks, like screening for offensive and harmful content.
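One common way to implement this is perceptual hashing, where visually similar images produce similar hash values. The sketch below uses the open-source imagehash library with an assumed Hamming-distance cutoff; both the cutoff and the flagged-hash list are placeholders.

```python
# Sketch: match a submitted image against a database of flagged image hashes.
# Requires: pip install imagehash pillow
from PIL import Image
import imagehash

MAX_DISTANCE = 5  # assumed Hamming-distance cutoff for a "match"

def is_flagged(image_path: str, flagged_hashes: list[imagehash.ImageHash]) -> bool:
    candidate = imagehash.phash(Image.open(image_path))
    # Subtracting two ImageHash objects returns their Hamming distance.
    return any(candidate - known <= MAX_DISTANCE for known in flagged_hashes)
```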

However, it also supports several custom use cases that fall outside this scope, including:

  • Detecting characters/text
  • Identifying copyrighted material
  • Facial recognition for purposes like age verification
  • Matching against custom lists
  • Screening for AI-generated images
  • Detecting low-quality images
  • Spotting duplicates

Industry-Specific Use Cases 

Some industry-specific use cases include gaming platforms, e-commerce websites, and media- and content-sharing services.

  • Online gaming platforms with in-game chat, along with companion communication tools like Discord, are very popular.

In the best case, this real-time communication augments the playing experience, allowing users to create a sense of community, make new friends, and possibly even learn about interesting non-game topics. However, gaming also happens to be one of the most toxic online environments, with adrenaline, competitiveness, and anonymity all adding to the mix.

With most chat platforms allowing users to share personal images, the need for chat moderation of image content (in addition to text and speech) is clear.

  • For marketplaces, moderation is extremely important for safeguarding brand reputation. Companies go out of their way to protect the safety and comfort of their users to create an enjoyable, consistent shopping experience.

Unacceptable content that enters product listings and user-generated reviews may be automatically detected and removed using AI. This can include anything from offensive images to counterfeit products from other vendors or outright illegal content.

  • Across many social media, podcast-hosting, and other media-sharing platforms, users can upload files to share with their friends and followers. Images make up a large percentage of this online media, both as the primary content and in the form of comments. As in all the examples above, there is ample potential for inappropriate content to leak in. AI-based image moderation can filter images in all these contexts.

Automated Moderation vs Hybrid Moderation

As with other types of content moderation, the earliest forms centered entirely around human content moderators. However, as online content multiplies exponentially each year, humans cannot be expected to moderate at scale. This is where fully automated and hybrid moderation step in.

Fully Automated Moderation

Fully automated moderation involves systems that rely exclusively on AI for image moderation. The benefits of AI for content moderation are huge due to its ability to process large amounts of data in real time.

Fully automated image moderation typically involves a mix of techniques, like those mentioned earlier. For example, the confidence score from an automated assessment can determine whether the system blocks an image and whether the user who posted it should face consequences.

In many cases, automated systems can be more accurate and less biased than human moderators. However, AI systems can sometimes misinterpret nuance and may carry biases of their own, inherited from their training data. This is why certain platforms opt for hybrid moderation approaches instead.

Hybrid Moderation

Hybrid moderation employs both AI- and human-based moderation to blend the benefits of both models.

Some hybrid systems route moderation tasks to human reviewers when their confidence scores fall above or below a specified threshold. In other cases, user-submitted flags might go directly to human moderators, or users might have the option to escalate a complaint to a person rather than an automated system.
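Here is a minimal sketch of that routing logic, assuming two arbitrary thresholds: high-confidence violations are blocked automatically, clearly safe images are approved, and everything in between is queued for human review.

```python
# Sketch of hybrid routing based on a violation confidence score (0.0 - 1.0).
# The thresholds are assumptions; each platform tunes its own.
AUTO_BLOCK = 0.90
AUTO_APPROVE = 0.20

def route(confidence: float) -> str:
    if confidence >= AUTO_BLOCK:
        return "block"         # high-confidence violation: remove automatically
    if confidence <= AUTO_APPROVE:
        return "approve"       # clearly safe: publish without human review
    return "human_review"      # uncertain middle band: send to a moderator queue
```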

In hybrid systems, AI's benefits go beyond simply taking work off humans' plates. It also protects moderators from exposure to excessive amounts of harmful content, such as gore or child sexual abuse material (CSAM).

Frequently Asked Questions

What Is an Image Moderation API?

Image moderation APIs allow developers to integrate moderation features into their applications without building them in-house. Some developers may use a more general content moderation API covering images. Common features include LLM agents for real-time moderation, custom rules, and moderation stat logging.
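As a rough sketch of what such an integration can look like, the snippet below posts an image to a hypothetical moderation endpoint. The URL, headers, and response shape are illustrative assumptions, not a real provider's contract.

```python
# Hypothetical image moderation API call; endpoint and fields are illustrative only.
import requests

API_URL = "https://api.example.com/v1/moderate/image"  # placeholder endpoint

def moderate_image(path: str, api_key: str) -> dict:
    with open(path, "rb") as f:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"image": f},
        )
    response.raise_for_status()
    return response.json()  # e.g. {"flagged": true, "categories": {...}}
```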

Are Google Images Moderated?

Yes, Google Images are moderated. Google uses a hybrid image moderation approach, combining automated tools with human moderators.

Google designed its image analysis algorithms to be as flexible as possible in an attempt to understand more nuanced contexts and meanings.

The SafeSearch feature allows users to control the level of content filtering applied to their search results.

What Is Graphic Moderation?

Graphic moderation is closely related to image moderation and likewise includes automated and human moderation approaches.

It’s a slightly broader term, encompassing additional visual media formats such as video. Beyond this slight difference, the techniques employed, benefits, and potential issues are practically identical.