Evaluation of Deep Learning Models for Hostility Detection in Hindi Text. (arXiv:2101.04144v1 [cs.CL])

Social media platforms are a convenient medium to express personal thoughts
and share useful information. They are fast, concise, and able to reach
millions. They are an effective place to archive thoughts, share artistic content,
receive feedback, promote products, etc. Despite their numerous advantages,
these platforms have given a boost to hostile posts. Hate speech and derogatory
remarks are being posted for personal satisfaction or political gain. The
hostile posts can have a bullying effect, rendering the entire platform
experience hostile. Therefore, detection of hostile posts is important to
maintain social media hygiene. The problem is more pronounced in languages like
Hindi, which are low in resources. In this work, we present approaches for
hostile text detection in the Hindi language. The proposed approaches are
evaluated on the Constraint@AAAI 2021 Hindi hostility detection dataset. The
dataset consists of hostile and non-hostile texts collected from social media
platforms. The hostile posts are further segregated into overlapping classes of
fake, offensive, hate, and defamation. We evaluate a host of deep learning
approaches based on CNN and LSTM for this multi-label classification problem.
The pre-trained Hindi FastText word embeddings by IndicNLP and Facebook are
used in conjunction with these models to evaluate their effectiveness. We show
that the multi-CNN model, when combined with IndicNLP FastText word embeddings,
gives the best results.
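
Because the hostile classes overlap, one post can carry several tags at once, which is what makes this a multi-label problem rather than a multi-class one. The sketch below shows one common way to represent that: a multi-hot target vector with one slot per hostile class, and a per-class threshold when reading predictions back. The abstract does not say how the authors encode labels, so the tag strings, the helper names `encode_labels` and `decode_scores`, and the 0.5 threshold are all assumptions made for illustration.

```python
# Class names follow the abstract; the exact tag strings are an assumption.
HOSTILE_CLASSES = ["fake", "offensive", "hate", "defamation"]
NON_HOSTILE = "non-hostile"


def encode_labels(tags):
    """Map a post's tags to a multi-hot vector over the hostile classes.

    A non-hostile post carries none of the four tags, so it maps to all zeros.
    """
    unknown = set(tags) - set(HOSTILE_CLASSES) - {NON_HOSTILE}
    if unknown:
        raise ValueError(f"unexpected tags: {sorted(unknown)}")
    return [1 if name in tags else 0 for name in HOSTILE_CLASSES]


def decode_scores(scores, threshold=0.5):
    """Turn per-class scores (for example, sigmoid outputs) into tags.

    Each class is thresholded independently, so several can fire at once.
    The 0.5 threshold is an illustrative default, not a value from the paper.
    """
    tags = [name for name, s in zip(HOSTILE_CLASSES, scores) if s >= threshold]
    return tags or [NON_HOSTILE]


if __name__ == "__main__":
    print(encode_labels(["hate", "offensive"]))
    print(decode_scores([0.9, 0.2, 0.7, 0.1]))
```

With this representation, a classifier would typically end in one independent output per class rather than a single softmax, so that a post can be flagged as, say, both fake and hate.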

Source: https://arxiv.org/abs/2101.04144

