Twitter Content Moderation: Implementing Safety on X

April 9, 2024 | 9 min read

X, formerly Twitter, works to uphold its users’ freedom of speech and expression. At the same time, it prioritizes the safety of everyone on the platform. The company uses content moderation to balance these two goals.

However, despite Twitter’s policies and safety rules, many users still report misinformation and hate speech on the platform, so you may wonder how effective its content moderation currently is. This article explains how content moderation works on X and how you can help.

How Does Twitter Moderate Content?

Twitter moderates content on the X platform through visibility filtering. That means restricting the reach of inappropriate content instead of removing it. The company presents this approach as a way to balance freedom of expression and safety on the platform.

With visibility filtering, Twitter allows anyone to post anything on the platform. However, it restricts the audience a post can reach once the post is flagged as inappropriate. When X restricts the reach of your post, it may not appear on home timelines or in Twitter searches.
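The idea can be illustrated with a minimal Python sketch. It is not X’s actual implementation: the `Post` class, the `flagged` field, and both function names are hypothetical, and the assumption that a restricted post stays visible on its author’s profile is ours. The point is only that a flagged post is kept but left out of distribution.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    flagged: bool = False  # hypothetical flag set by a moderation step


def visible_in_timeline(posts: list[Post]) -> list[Post]:
    """Return posts eligible for home timelines and search results."""
    # Flagged posts are not deleted; they are only left out of distribution.
    return [post for post in posts if not post.flagged]


def visible_on_profile(posts: list[Post], author: str) -> list[Post]:
    """Return every post by an author, flagged or not (an assumption)."""
    return [post for post in posts if post.author == author]


posts = [
    Post("alice", "Good morning, everyone!"),
    Post("bob", "Some abusive text", flagged=True),
]

timeline = visible_in_timeline(posts)        # only alice's post
bob_profile = visible_on_profile(posts, "bob")  # bob's post still exists
```

Under this model, moderation changes who can find a post, not whether the post exists.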

This approach is relatively new, and X has applied it more intensively since Elon Musk acquired Twitter. As an advocate of free speech, Musk wants to give every user unrestricted freedom of speech.

Previously, X content moderation followed a defined process that usually led to removing inappropriate tweets from the platform. It could also lead to account suspensions, depending on the severity of the safety policy violation.

That was the case when Twitter primarily used human employees as content moderators. At that time, any Twitter user could contact X administrators via numerous channels to report an inappropriate tweet or broadcast. Then, the moderators would review the content and pull it down when deemed inappropriate.

Now, however, Twitter employs artificial intelligence (AI) for content moderation. X designed this tool to detect inappropriate content and restrict its reach on the platform. This design aims to reduce the potential harm caused by abusive content circulating on the platform.

Further, Twitter uses this automation tool to immediately take down tweets reported by trusted figures, especially those concerning child safety. It also restricts hashtags and search results for harmful topics like abuse.

Thus, this section has shown how Twitter moderated content in the past and how it does so today.

How Effective Is AI for X Content Moderation?

Many veteran tweeps rate AI as poor and inefficient for content moderation on Twitter. Theoretically, Twitter’s AI ought to autonomously detect and moderate content users post on the platform. However, it depends heavily on human control, making it inefficient in application, especially with X’s reduced workforce.

A major reason for this inefficiency is that Twitter relies on a machine learning model for content moderation. Despite human oversight, this model reportedly has a high error rate, up to 40%, in detecting and moderating inappropriate content.

Another reason for its inefficiency is the constant need for training and updates. To keep the model effective, administrators must frequently retrain it to recognize new misinformation trends and toxic content.

Without these continuous updates, the model quickly loses accuracy and becomes obsolete. Currently, Twitter seems unable to keep up with this need, given the declining workforce in its health engineering unit.

Thus, it is only reasonable to conclude that using AI for content moderation on X is ineffective. Little wonder there is an influx of political misinformation, hate, and toxic content on the platform today.

The algorithm can only restrict the reach of content it considers inappropriate. It cannot do even that unless it reliably detects such content in the first place.

How You Can Help Twitter Moderate Content on X

Misinformation and hate messages on Twitter are increasing at an alarming rate. Consequently, many conclude that Twitter only has written safety policies but does not work to implement them.

It is so bad that some tweeps delete their Twitter accounts and leave the platform for good. Although its moderation seems poor and declining, Twitter is still making efforts to maintain safety. As a responsible user, you may be glad to know that you can also help the platform moderate content.

During the era of whistleblowing on X, there was a more functional content moderation ecosystem on the platform. Users only needed to contact Twitter’s support on the phone to report inappropriate tweets. A user could also report broadcasts or serve as a moderator for broadcasts.

Without these features available, how can you help? You can help in even more practical ways today. For example, you can report inappropriate tweets from the overflow icon on each post.

That will help administrators easily identify such tweets and moderate them as necessary. It also helps them train their models to detect inappropriate tweets autonomously.

Moreover, you may become a trusted figure for accurate reports after some time. Then, Twitter’s algorithm will immediately act on your reports without reviews from human moderators.
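The reporting flow described above can be sketched in Python as follows. This is a toy model under stated assumptions, not X’s real system: the `ReportRouter` class, its fields, and the `"actioned"` and `"queued"` return values are all hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ReportRouter:
    """Hypothetical router: trusted reports skip the human review queue."""

    trusted_reporters: set[str] = field(default_factory=set)
    restricted_posts: set[str] = field(default_factory=set)
    review_queue: deque = field(default_factory=deque)

    def submit(self, reporter: str, post_id: str) -> str:
        if reporter in self.trusted_reporters:
            # Reports from trusted figures are acted on immediately.
            self.restricted_posts.add(post_id)
            return "actioned"
        # Everyone else waits for a moderator to review the report.
        self.review_queue.append((reporter, post_id))
        return "queued"


router = ReportRouter(trusted_reporters={"trusted_user"})
first = router.submit("trusted_user", "post-1")
second = router.submit("new_user", "post-2")
```

In this sketch, a consistent record of accurate reports would be what moves a user into the trusted set, although how X decides that is not public.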

You can further help moderate inappropriate posts by reducing their circulation on the platform. That means you neither share nor engage with such posts, not even by commenting on them. Remember, the more you engage with a post, the more Twitter’s algorithm shows it to others.

How To Moderate Content That Appears on Your Timeline

You can only do so much to help Twitter moderate content on the X platform. The bulk of the work depends on the social media giant. However, you can do so much more on your timeline.

Twitter allows you to control what you see and consume on the platform. Using Twitter settings, you can control what appears on your X timeline and searches to refine your X experience.

Therefore, this section shows how to change settings on Twitter to help you moderate your X content.

1. Mute Words on Twitter To Moderate Your Timeline Content

The accounts you follow determine the bulk of what you see on Twitter. Following an account subscribes you to that person’s tweets, so their tweets will appear on your timeline. However, to expand your X experience, Twitter also shows tweets from other users in your feed, even those you don’t follow.

This may expose you to content that you do not want to see on the platform. To prevent that, you can moderate your feed by muting words. When you mute a word, phrase, or hashtag, Twitter removes tweets containing it from your timeline. Even if such tweets trend, you won’t see them on your timeline.

This section will show you how to mute words on Twitter to moderate your content. It will also highlight some rules guiding the mute feature.

Below are the six steps for muting a word on X:

Open the X navigation menu from your profile icon at the top left of your homepage.
Select Settings and Support, then Settings and Privacy.
Choose Privacy and Safety and open Mute and Block.
Select Muted Words and tap the Add Muted Words icon at the bottom right.
Type the word or phrase you want to mute in the field provided.
Select the black Save button at the top right corner of the page.

This simple process mutes a word on your timeline. From then on, your feed will not show tweets containing that word or phrase.

When muting words, always remember:
Muting a word also mutes the hashtag of that word.
You can mute up to 200 words or phrases per account.
Muted words are case-insensitive.
You cannot edit a muted word after adding it to your list.
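To make these rules concrete, here is a small Python sketch of a muted-words filter. It is only an illustration of the rules listed above, not X’s actual matching logic: the `MutedWords` class, its methods, and the whole-word matching strategy are assumptions.

```python
import re

MAX_MUTED = 200  # per-account limit quoted in the rules above


class MutedWords:
    """Hypothetical muted-words list; not X's real implementation."""

    def __init__(self) -> None:
        self._entries: set[str] = set()

    def add(self, phrase: str) -> None:
        # Drop a leading '#' and fold case so "Spoiler", "#spoiler" and
        # "SPOILER" all become the same entry.
        entry = phrase.strip().lstrip("#").casefold()
        if not entry:
            raise ValueError("cannot mute an empty phrase")
        if entry not in self._entries and len(self._entries) >= MAX_MUTED:
            raise ValueError(f"muted list already holds {MAX_MUTED} entries")
        self._entries.add(entry)

    def hides(self, tweet: str) -> bool:
        """Return True if the tweet contains any muted word or phrase."""
        text = tweet.casefold()
        # Lookarounds require a whole-word match; '#' is not a word
        # character, so "#spoiler" still matches the entry "spoiler".
        return any(
            re.search(rf"(?<!\w){re.escape(entry)}(?!\w)", text)
            for entry in self._entries
        )


muted = MutedWords()
muted.add("Spoiler")

feed = ["Huge SPOILER ahead", "No #spoiler tags please", "Lovely weather today"]
visible = [tweet for tweet in feed if not muted.hides(tweet)]
```

Because there is no edit operation in this sketch, changing a muted word means removing it and adding a new one, which mirrors the last rule in the list.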

2. Turn on the Sensitive Content Warning for Twitter Content Moderation

Not all inappropriate posts are written tweets. In fact, many of the most harmful posts circulated on social media are media content. Even after you mute words on your timeline, such content may still appear if its caption does not contain a muted word.

Luckily, Twitter has another feature that can protect you from such harmful content: the sensitive content warning. This warning appears over pictures and videos containing sensitive content to cover them and warn you. You can dismiss the warning if you choose to view the content.

This warning appears over media posts containing violent or otherwise sensitive content. Twitter enables this setting for all accounts by default. If yours is off, you’ll learn how to turn the warning back on here.

Follow these four steps below to change the sensitive content setting for your account:

Slide your X homepage from left to right to reveal the navigation menu.
Open Settings and Support, then Settings and Privacy.
Click Privacy and Safety and select Content You See.
Move the slider beside the Display media that may contain sensitive content option to turn it off. With this option off, X covers sensitive media with the warning.

Exploring the Twitter New Privacy Policy

On September 29, 2023, Twitter put a new privacy policy into effect. Like all policy documents, the new policy is a long and tiring read. However, this section outlines its most important features for you.

First, the policy hints at collecting more metadata related to your encrypted messages. Although collecting such data is inevitable when running such a platform, companies try to collect less, not more.

More worrying is that the company intends to share this information with its partners to improve targeted advertising. That is a concern because you don’t know how much data it collects from you or what it shares with others.

Further, the policy states that Twitter will begin collecting biometric information, as well as personal information about your job and employment history. The company says it needs biometric information to improve your account security and employment information to personalize services, including job ads.

However, this new policy only applies to Twitter premium subscribers. There has yet to be any mention of whether it will affect other users in due course.

Thus far, there is no new Twitter content moderation policy. So, you’d benefit from managing your account to moderate your content. That includes both the content you see and the content you post on the platform.

Use TweetEraser to bulk-delete tweets you find inappropriate on your timeline. This Twitter management tool allows you to mass-delete tweets or even erase your timeline seamlessly. The algorithm is efficient and has no negative effect on your account. It leaves your timeline fresh and attractive after application. So, start filtering and cleaning your Twitter timeline today!
