Language-model-based methods are powerful techniques for text classification.
However, these models have several shortcomings: (1) it is difficult to integrate
human knowledge such as keywords; (2) they require substantial resources to
train; and (3) they rely on large text corpora for pretraining. In this paper, we propose
Semi-Supervised vMF Neural Topic Modeling (S2vNTM) to overcome these
difficulties. S2vNTM takes a few seed keywords as input for topics. It
leverages patterns in these keywords to identify potential topics and to
optimize the quality of each topic's keyword set. Across a variety of datasets,
S2vNTM outperforms existing semi-supervised topic modeling methods in
classification accuracy when only a limited number of keywords is provided. It
is also at least twice as fast as the baselines.