The Dark Side of the Language: Pre-trained Transformers in the DarkNet. (arXiv:2201.05613v1 [cs.CL])

Pre-trained Transformers are challenging human performance in many natural
language processing tasks. The gigantic datasets used for pre-training seem to
be the key to their success on existing tasks. In this paper, we explore how a
range of pre-trained natural language understanding models perform on truly
novel and unexplored data, provided by classification tasks over a DarkNet
corpus. Surprisingly, results show that syntactic and lexical neural networks
largely outperform pre-trained Transformers. This seems to suggest that
pre-trained Transformers have serious difficulty adapting to radically
novel texts.
