This AI detects Malaysian hate speech. We try insulting it.

Have you ever wondered how hateful your words are? Do people react like this whenever you open your mouth, but you’re not sure why?

Well, wonder no longer! Recently, a local think tank called The Centre Malaysia launched an online tracker designed to sniff out hate speech among Malaysian netizens. Hate speech generally includes speech that insults, threatens, or targets certain groups (based on race, gender, religion, sexual orientation, etc.), and the software they developed, called the Tracker Benci, uses machine learning to detect such vicious words online, particularly on Twitter.

As for why they’re doing it, it seems that they’re trying to identify trends in Malaysian online hate speech without relying too heavily on labor-intensive monitoring methods.

“At The Centre, we believe that responding to hate speech in Malaysia requires detecting it consistently, which can then be used for further work to determine its seriousness, impact, and the appropriate response to it.” – excerpt from the Tracker Benci’s FAQ section.

What we’re interested in, though, is the Benci Calculator they’ve made available on their website. It works on the same principles as their tracker: you enter a phrase in the calculator, and it’ll tell you whether the phrase is hateful or not, what kind of hateful it is, and whom it’s hateful to.

So since it’s a slow day at work…

We hurled racial insults at the calculator to see how well it detects them

[Disclaimer: The use of racial slurs here is purely for research purposes. We do not believe in stereotyping, nor do we condone the use of such slurs.]

The green arrows show the Malay Muslim, Chinese and Indian hate detected through the Tracker so far. Screengrabbed from #TrackerBenci.

Based on their data so far, it seems that the tracker detects hateful tweets directed at 7 groups: Malay Muslims, the Chinese, Indians, migrant workers, women, LGBT people, and minorities. So despite our initial excitement to test ‘mak kau hijau’, it wouldn’t be relevant to this calculator as mothers aren’t a valid group. We decided to go with racial insults instead as 80% of the hate detected by the tracker so far had been racial, so logically the machine learning had the most experience with these.

As a control, we first put in two non-offensive phrases: ‘kasut tumit tinggi‘ (high-heeled shoes) and ‘aku benci taugeh‘ (I hate bean sprouts). The calculator told us that high-heeled shoes are ‘potentially based in a foreign and/or irrelevant context‘ – which is a common result for many of the phrases we’ve entered – but apparently hating bean sprouts is ‘potentially hateful‘. We’re pretty sure that taugeh isn’t a slang term for a race or group, so that’s not a very good start already.

Nobody cares about taugeh. Go away taugeh. Tauge img from KlikDokter.

Next we decided to go with a common hate comment: telling people to balik China/India. The calculator seems to only detect hate in Malay and English, so the phrase ‘balik tongsan‘ is seen as ‘foreign and/or irrelevant‘. ‘Balik Cina‘ gives us the same foreign/irrelevant comment, but now it detects the phrase as ‘potentially hateful‘, ‘potentially targeted at the Chinese community‘, and ‘can potentially increase tension between groups‘ as well.

Interestingly, if you’re being correct with your spelling and type ‘balik China‘ instead, it only registers the phrase as ‘foreign/irrelevant‘ again. The same goes for ‘balik India‘. With weird spellings in mind, we tested ‘cina DAPig‘, and the calculator detected it, with the same results as ‘balik Cina‘: potentially hateful, targeted at the Chinese, and can increase tensions.

Next are common stereotypes. First is drunkenness: we tried inputting the phrases ‘melayu mabuk‘, ‘cina mabuk‘, and ‘india mabuk‘ separately. Only one is deemed potentially hateful, and it’s not the third one: ‘cina mabuk’ and ‘india mabuk’ are foreign/irrelevant, but ‘melayu mabuk’ is potentially hateful for some reason. You need to say ‘orang cina mabuk‘ instead of just ‘cina mabuk’ to get a potentially hateful result, but the same trick doesn’t work for Indians. We got the same pattern when we extended the second half to ‘mabuk ketum‘: only for the Malays is it potentially hateful.

Well, it had been quite a touchy topic recently. Lulzy img from Winepak Corporations Sdn Bhd and Parlimen Malaysia’s YouTube, taken from Coconuts KL.

For laziness, we tacked the word ‘malas‘ after the three races, and although we got ‘potentially hateful‘ for lazy ‘melayu’ and ‘cina’, ‘india malas‘ was said to be probably not hateful. Running out of insults, we tacked ‘babi‘ (pig) after each of the races, and again only ‘india babi‘ is not hateful. We have to note that it’s a bit uncommon to get the ‘probably not hateful’ result: even seemingly non-offensive phrases like the high-heeled shoes from earlier are said to be probably foreign/irrelevant instead of not hateful. Weirdly enough, ‘melayu babi‘ ‘perpetuates a negative stereotype‘ on top of being hateful, something not seen for the other races.

Anyway, at this point, we’ve seen enough. After the tests…

It seems that #TrackerBenci still has quite some learning to do

He’s trying his best. Gif from VentureBeat.

When it comes to unfeeling machines detecting hate, there are a lot of challenges involved, with a common one being context. Based on the Benci Calculator’s methodology page, it seems that the tracker relies on keywords – a ‘reference list of hateful words‘ to be exact – and if the calculator doesn’t detect a phrase or term as hateful, the reason given is that it’s probably missing elements of context or the environment in which it is used.

“For this Version 1.0 of #TrackerBenci, the reference list of words comprised [of] English, Bahasa Malaysia and Malaysian colloquial words, selected by The Centre’s diverse (though small) panel of locally-informed researchers who sifted through thousands of tweets to develop the training set.” – excerpt from #TrackerBenci’s FAQ.

This might explain why a keyword-containing phrase, ‘masjid kapitan keling‘ – a real landmark in Penang – was detected as potentially ‘hateful’, ‘insulting’, ‘perpetuates a negative stereotype’, ‘can increase tensions between groups’, and ‘targeted at the Indian community’, while the more contextual ‘bising pulak puak puak pendatang ni‘ – a common passive-aggressive comment used to invalidate comments made by members of the non-Malay communities online – was categorized as ‘probably not a hateful speech‘.
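To see why leaning on a word list can misfire like this, here is a minimal sketch of a purely keyword-based check. It is not #TrackerBenci's actual code (the real tracker also uses machine learning, and its reference list isn't reproduced here): the `REFERENCE_WORDS` set and the `flag_by_keyword` function are hypothetical names we made up for illustration, with the two entries guessed from the results we saw above.

```python
import re

# Hypothetical stand-in for the 'reference list of hateful words'.
# The real #TrackerBenci list is not public in this article; these
# two entries are assumptions based on the phrases tested above.
REFERENCE_WORDS = {"keling", "babi"}


def flag_by_keyword(phrase: str) -> bool:
    """Return True if any word in the phrase is on the reference list."""
    # Lowercase the phrase and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", phrase.lower())
    return any(token in REFERENCE_WORDS for token in tokens)


# A landmark's name contains a listed word, so it gets flagged.
print(flag_by_keyword("masjid kapitan keling"))

# A hostile comment with no listed word slips through.
print(flag_by_keyword("bising pulak puak puak pendatang ni"))
```

A checker like this has no idea that the first phrase is a place name or that the second one is a dig at a community, which is the kind of context problem the methodology page describes.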

The calculator is still in the early stages of development, after all, so if you have some free time, you can help train the calculator by testing it out and giving feedback on the site itself.

It’s like a menu you see after summoning the manager. Screengrabbed from #TrackerBenci.

Just don’t go overboard with the suggestions ar.
