Comparison of Machine Learning for Sentiment Analysis in Detecting Anxiety Based on Social Media Data. (arXiv:2101.06353v1 [cs.CL])

The COVID-19 pandemic has affected all groups of people. The situation
triggers anxiety, which is harmful to everyone. The government’s role is very
influential in solving these problems through its work programs, but these
programs also have pros and cons that cause public anxiety. It is therefore
necessary to detect anxiety in order to improve government programs that can
raise public expectations.
This study applies machine learning to detect anxiety based on social media
comments about government programs for dealing with the pandemic. The approach
adopts sentiment analysis, detecting anxiety from positive and negative
comments by netizens. The machine learning methods implemented are K-NN,
Bernoulli, Decision Tree Classifier, Support Vector Classifier, Random Forest,
and XGBoost. The data sample was obtained by crawling YouTube comments and
comprises 4862 comments: 3211 negative and 1651 positive. Negative comments
indicate anxiety, while positive comments indicate hope (not anxious). The
classifiers are trained on features extracted with count vectorization and
TF-IDF. The sentiment data were split into 3889 and 973 comments for training
and testing. Random Forest achieved the highest accuracy, 84.99% with count
vectorization and 82.63% with TF-IDF. K-NN gave the best precision, while
XGBoost gave the best recall. Thus, Random Forest is the most accurate method
for detecting anxiety based on data from social media.
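
The two feature extractions and the three reported metrics can be illustrated with a minimal sketch. This is not the authors' code: the function names, the whitespace tokenizer, the label strings "anxious" and "hope", and the toy comments are all assumptions made for illustration, and the TF-IDF weighting uses a common smoothed form, ln((1 + n) / (1 + df)) + 1, without length normalization. The abstract does not say which variant was used.

```python
import math
from collections import Counter


def count_vectorize(docs):
    """Return (vocabulary, term-count vectors) using a naive whitespace tokenizer."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0] * len(vocab)
        for tok, n in Counter(toks).items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors


def tfidf(count_vectors):
    """Weight raw counts by a smoothed inverse document frequency (assumed variant)."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    return [[c * w for c, w in zip(vec, idf)] for vec in count_vectors]


def scores(y_true, y_pred, target="anxious"):
    """Return (accuracy, precision, recall), treating `target` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Counts reported in the abstract: both breakdowns sum to the same total.
assert 3211 + 1651 == 4862  # negative + positive comments
assert 3889 + 973 == 4862   # the two parts of the train/test split

# Hypothetical toy comments, invented for illustration only.
toy_docs = ["good good help", "bad help"]
vocab, counts = count_vectorize(toy_docs)
weights = tfidf(counts)
print(vocab)    # ['bad', 'good', 'help']
print(counts)   # [[0, 2, 1], [1, 0, 1]]

# Hypothetical labels and predictions, not results from the study.
y_true = ["anxious", "anxious", "hope", "hope"]
y_pred = ["anxious", "hope", "hope", "hope"]
print(scores(y_true, y_pred))  # (0.75, 1.0, 0.5)
```

In this toy case the term "help" occurs in every comment, so its TF-IDF weight stays at its raw count, while rarer terms are weighted up. The metrics example also shows why accuracy, precision, and recall can favor different classifiers, as the abstract reports for Random Forest, K-NN, and XGBoost.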


