Abusive language, and hate speech that incites violence, are a growing problem for governments trying to manage social media posts. Such language incites civil disorder by angering targeted individuals and groups, and tends to reinforce the identity and cohesiveness of those who use it. Abusive language is an intelligence target because it can be a leading indicator of violence. Black-box predictors can achieve high accuracy, but they reveal nothing about the structures, both social and technical, that underlie abusive language. We build a set of abusive language predictors that leverage both social constructs, such as otherness, and linguistic properties. A stacking predictor then determines the most significant component predictors. We achieve prediction accuracy of 96.5%, with 88% accuracy for abusive documents. This is an improvement of more than 16 percentage points in detecting abusive documents compared with a popular empirically derived predictor, and it provides insight into the mechanics of abusive language.
Publication Information
Leuprecht, Christian, David B. Skillicorn, and David Kernot. 2024. “Linguistic Models of Abusive Language.” Dynamics of Asymmetric Conflict, November, 1–17. doi:10.1080/17467586.2024.2407922.