The rise of online hate speech is a disturbing and growing trend in countries around the world, with serious psychological consequences and the potential to contribute to real-world violence.
Citizen-generated counter speech may help discourage hateful online rhetoric, but it has been difficult to quantify and study.
"I've seen this big shift in civil discourse in the last two or three years towards being much more hateful and much more polarized," said study co-author Joshua Garland, a mathematician and Applied Complexity Fellow at the Santa Fe Institute in the US.
"So, for me, an interesting question was: what's an appropriate response when you're being cyber-bullied or when you're receiving hate speech online? Do you respond? Do you try to get your friends to help protect you? Do you just block the person?"
The new paper, published in the proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), held from Nov 16-20, offers a framework for studying the dynamics of online hate and counter speech.
To study such questions scientifically, researchers first needed access to a wealth of real-world data on both hate speech and counter speech, and the ability to distinguish between the two.
That data existed, and Garland and collaborator Keyan Ghazi-Zahedi at the Max Planck Institute in Germany found it in a five-year interaction that played out on German Twitter: as an alt-right group took to the platform with hate speech, an organised movement rose up to counter it.
"The beauty of these two groups is they were self-labeling," explained Mirta Galesic, Professor at Santa Fe Institute.
Garland and Ghazi-Zahedi were able to input the self-labeled posts into a machine-learning algorithm to automate large swaths of the classification.
The team also relied on 20-30 human coders to check that the machine classifications matched intuition about what registers as hate and counter speech.
The result was a dataset of unprecedented size that allows the researchers not just to analyse isolated instances of hate and counter speech, but also to compare long-running interactions between the two.
The team collected a dataset of millions of tweets posted by members of the two groups and used these self-identified tweets to train their classification algorithm to recognise hate and counter speech.
Then, they applied their algorithm to study the dynamics of some 200,000 conversations that occurred between 2013 and 2018.
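The general idea of training on self-labeled posts and then labeling unseen conversations can be illustrated with a minimal sketch. This is not the researchers' algorithm, which the article does not describe in detail; it swaps in a simple naive Bayes word-count classifier, and the function names (`train`, `classify`, `label_conversation`) and the toy posts are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(labeled_posts):
    """Build naive Bayes counts from (text, label) pairs.

    The labels stand in for the groups' self-labeling described above.
    """
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training posts
    for text, label in labeled_posts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def label_conversation(model, tweets):
    """Count how many tweets in one conversation fall under each label."""
    return Counter(classify(model, tweet) for tweet in tweets)


# Hypothetical, deliberately mild toy data standing in for self-labeled tweets.
toy_posts = [
    ("they do not belong here", "hate"),
    ("go away nobody wants you", "hate"),
    ("everyone is welcome here", "counter"),
    ("please be kind to each other", "counter"),
]
model = train(toy_posts)
conversation = [
    "nobody wants you here go away",
    "you are welcome please be kind",
]
summary = label_conversation(model, conversation)
```

In this toy run, `summary` holds one tweet per label. A real study would need far more data, a stronger model, and human validation of the kind the coders provided.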