Facebook admits that AI can’t stop hate speech

18 October 2021


One in every 2,000 content views still contains it


Social notworking site Facebook’s Guy Rosen, who has the unfortunate title of head of integrity, admitted that one in every 2,000 content views on Facebook still contained hate speech from April to June of this year.

Rosen said that the figure was better than in mid-2020, when one in every 1,000 content views on Facebook contained hate speech. But it is still pretty grim given that executives had insisted that artificial intelligence software would address the company’s chronic problems keeping what it deems hate speech and excessive violence, as well as underage users, off its platforms.

According to the Wall Street Journal, Facebook’s AI can’t consistently identify first-person shooting videos, racist rants and even, in one notable episode that puzzled internal researchers for weeks, the difference between cockfighting and car crashes.

On hate speech, the documents show, Facebook employees have estimated the company removes only a sliver of the posts that violate its rules — a low-single-digit percent, they say. Moreover, when Facebook’s algorithms aren’t confident enough that content violates the rules to delete it, the platform shows that material to users less often — but the accounts that posted the material go unpunished.

The employees analysed Facebook’s success at enforcing its own content rules, which it spells out internally and in public documents such as its community standards.

Two years ago, Facebook cut the time human reviewers spent on hate-speech complaints from users and made other tweaks that reduced the overall number of complaints. Unfortunately, that made the company more dependent on AI enforcement of its rules and inflated the technology’s apparent success in its public statistics.

According to the documents, those responsible for keeping the platform free from content Facebook deems offensive or dangerous acknowledge that the company is nowhere close to being able to screen it reliably.

“The problem is that we do not and possibly never will have a model that captures even a majority of integrity harms, particularly in sensitive areas”, wrote a senior engineer and research scientist in a mid-2019 note.

He estimated that the company’s automated systems removed posts that generated just two per cent of the views of hate speech that violated its rules. “Recent estimates suggest that unless there is a major change in strategy, it will be tough to improve this beyond 10-20 percent in the short-medium term”, he wrote.

This March, another team of Facebook employees drew a similar conclusion, estimating that those systems were removing posts that generated three to five per cent of the views of hate speech on the platform and 0.6 per cent of all content that violated Facebook’s policies against violence and incitement.

Facebook told the Journal that it takes additional steps beyond AI screening to reduce hate speech views, and argued that the internal documents the Journal had reviewed were outdated.

One of those documents showed that in 2019 Facebook was spending $104 million a year to review suspected hate speech, with a Facebook manager noting that this “adds up to real money” and proposing “hate speech cost controls”.

Facebook told the Journal that the money saved went towards improving its algorithms. But the Journal reports that Facebook “also introduced ‘friction’ to the content reporting process, adding hoops for aggrieved users to jump through that sharply reduced how many complaints about content were made, according to the documents”.

Last modified on 18 October 2021