Don’t Feed the Trolls: Battling Online Harassment with AI

Everybody has a voice on the internet. From the comment threads of online news sites to social media feeds, anybody with an internet connection can participate. The internet facilitates discussion and conversation, providing an anonymity that is absent in face-to-face interaction. But as well as hosting productive and engaging material, it houses the nameless and faceless who respond to content with forked tongues. Here, promoting free speech jars with curbing hate speech.

Developers are increasingly turning to Artificial Intelligence (AI) and machine learning in an effort to make the internet a more positive and caring environment. Facebook announced on March 1 that it is testing a set of suicide-prevention tools, including using AI to identify posts that are very likely to contain thoughts of suicide. On the same day, Twitter announced its use of AI to identify abusive accounts that violate its terms of service.


“You’re Toxic!” (Britney Spears, 2003, Perspective, 2017)

In a bid to tackle online harassment and bullying, Google offshoot Jigsaw opened up its Perspective API two weeks ago to developers who want to incorporate it into their own websites. Perspective uses machine learning to detect offensive messages based on their similarity to comments judged by human reviewers as “toxic”. The demonstration website for Perspective explains:

This model was trained by asking people to rate internet comments on a scale from “Very toxic” to “Very healthy” contribution. Toxic is defined as… “a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion.”

In total, Perspective has absorbed the toxicity ratings of several thousand human-reviewed comments, which form the basis of its current model. Anybody can test out the API by typing an English phrase into the interface and instantly receiving a toxicity score. Jigsaw hopes to branch out to other languages in the next year. As more sites use Perspective, it will be exposed to more comments, which will in turn allow it to better distinguish the toxicity of some comments from others.
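For the curious, querying Perspective programmatically looks roughly like the following sketch. The endpoint and request shape follow Jigsaw's public documentation for the AnalyzeComment method as I understand it; `API_KEY` is a placeholder for a key obtained from Google, and the actual HTTP call is left to your client of choice.

```python
import json

# Perspective's AnalyzeComment endpoint (API_KEY is a placeholder).
API_URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           "comments:analyze?key=API_KEY")

def build_request(comment_text):
    """Construct the JSON body asking Perspective for a TOXICITY score."""
    return {
        "comment": {"text": comment_text},
        "languages": ["en"],  # Perspective only supports English for now
        "requestedAttributes": {"TOXICITY": {}},
    }

def extract_toxicity(response):
    """Pull the 0-1 summary toxicity score out of an AnalyzeComment response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Sending the request is then e.g.:
#   requests.post(API_URL, data=json.dumps(build_request("your comment here")))
```

The summary score is a probability-like value between 0 and 1, which the demo site renders as the percentage you see next to each phrase.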

Screenshot of Perspective (March 5, 2017)


How does Perspective Benefit the Online Community?

Human moderation of comment threads and social media posts is expensive, time-consuming and psychologically taxing. Internet content reviewers sift through depraved media and messages every day to keep them from the eyes of the public. On news sites and forums, a blacklist of offensive words can filter out abusive messages. At The New York Times, a full-time 24/7 team reads and reviews up to 11,000 comments every day. President of Jigsaw, Jared Cohen, explains: “We’ve worked together to train models that allow Times moderators to sort through comments more quickly”. Wikipedia, The Economist and The Guardian have also partnered with Jigsaw, in a bid to streamline and improve their own comment moderation.


“It sounds great, but….”

Like all machine-learning technology, Perspective is iterative, improving through repetition and new data, and it still has blind spots and issues to iron out. Researchers at the University of Washington have found that a subtle modification to a highly toxic phrase can dramatically lower its toxicity score.

Hosseini, Kannan, Zhang and Poovendran (2017)

Adding punctuation between letters or misspelling words defeats Perspective’s toxicity analyser. As a result, it assigns a less toxic rating to the phrase than a human reviewer would. Conversely, Perspective at present also wrongly assigns high toxicity scores to phrases that merely negate an insult, creating false positives. For example:

“It’s stupid and wrong” (89%)

“It’s not stupid and wrong” (83%)

(Hosseini, Kannan, Zhang and Poovendran 2017)

Perspective’s algorithms fail to understand the semantic meaning of a sentence, merely flagging the occurrence of toxic words. Jigsaw acknowledges that Perspective is still far from perfect, and feedback from users on the toxicity of certain phrases will be used to improve the model.


“There was truth and there was untruth”: Keeping a Lid on AI Monitoring

Putting the technical teething problems to one side, incorporating AI into online conversations and comment threads could be an effective way of limiting cruel and hurtful comments. But whilst we want to monitor such abuse, at what point does Perspective become full-blown censorship? Allum Bokhari, writing for the right-wing outlet Breitbart, sees Perspective as a “censorship bot”. That judgment seems premature. At present, millions of human reviewers worldwide already monitor internet content, and a website owner likely moderates comments according to their own perception of acceptable speech, with all its unintentional bias. Indeed, we should embrace this sort of AI technology as a silencer of poisonous voices, personal intimidation and bullying behaviour; reaffirming these simple community norms should pose no issue for a normal web user. However, the line should be drawn there.

Allowing different beliefs and opinions to be visible online also allows the opposing party to respond. We are not all homogenous (thank goodness) in our opinions, and suppressing one party’s voice rather than drawing attention to it kills progress. It is through debate and discussion that we, our governments and our countries grow and learn. Still, there is a complex relationship between monitoring hate speech and protecting freedom of expression. When AI censors posts simply because they don’t conform to a “normative” set of beliefs and opinions, it is time to disconnect and switch off.

– Sarah Maclean-Morris
