Social media mining for toxicovigilance of prescription medications: End-to-end pipeline, challenges and future work. (arXiv:2211.10443v1 [cs.CL])

Substance use, substance use disorder, and overdoses related to substance use
are major public health problems globally and in the United States. A key
aspect of addressing these problems from a public health standpoint is improved
surveillance. Traditional surveillance systems suffer from reporting lags, and
social media are potentially useful sources of timely data. However, mining
knowledge from social media is challenging and requires the development of advanced
artificial intelligence, specifically natural language processing (NLP) and
machine learning methods. We developed an end-to-end pipeline for
mining information about nonmedical prescription medication use from social
media, namely Twitter and Reddit. Our pipeline employs supervised machine
learning and NLP for filtering out noise and characterizing the chatter. In
this paper, we describe the pipeline, which was developed over four years. In
addition to describing our data mining infrastructure, we discuss existing
challenges in social media mining for toxicovigilance, and possible future
research directions.



