Diversity-Aware Batch Active Learning for Dependency Parsing. (arXiv:2104.13936v1 [cs.CL])

While the predictive performance of modern statistical dependency parsers
relies heavily on the availability of expensive expert-annotated treebank data,
not all annotations contribute equally to the training of the parsers. In this
paper, we attempt to reduce the number of labeled examples needed to train a
strong dependency parser using batch active learning (AL). In particular, we
investigate whether enforcing diversity in the sampled batches, using
determinantal point processes (DPPs), can improve over their diversity-agnostic
counterparts. Simulation experiments on an English newswire corpus show that
selecting diverse batches with DPPs is superior to strong selection strategies
that do not enforce batch diversity, especially during the initial stages of
the learning process. Additionally, our diversity-aware strategy is robust under
a corpus duplication setting, where diversity-agnostic sampling strategies
exhibit significant degradation.
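
To make the idea of diversity-aware batch selection concrete, the following is a minimal sketch of greedy approximate MAP inference for a DPP over a pool of unlabeled sentences. It is not the paper's implementation: the function name `greedy_dpp_batch`, the per-sentence `quality` scores (for example, model uncertainty), the `features` vectors, and the cosine-similarity kernel are all assumptions chosen for illustration.

```python
import numpy as np


def greedy_dpp_batch(quality, features, batch_size):
    """Greedily pick indices that approximately maximize det(L_S).

    quality:  (n,) array of positive per-item scores (assumed, e.g. uncertainty).
    features: (n, d) array of item representations (assumed, nonzero rows).
    """
    # Normalize rows so the similarity matrix holds cosine similarities.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = feats @ feats.T
    # Quality-diversity decomposition of the DPP kernel: L_ij = q_i * S_ij * q_j.
    kernel = quality[:, None] * similarity * quality[None, :]

    selected = []
    for _ in range(batch_size):
        best_idx, best_logdet = None, -np.inf
        for i in range(len(quality)):
            if i in selected:
                continue
            idx = selected + [i]
            # Log-determinant of the kernel restricted to the candidate batch.
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_idx, best_logdet = i, logdet
        if best_idx is None:
            # Every remaining item is redundant with the current batch.
            break
        selected.append(best_idx)
    return selected


# Toy pool: items 0 and 1 are exact duplicates, item 2 is different but
# scores slightly lower on quality.
quality = np.array([1.0, 1.0, 0.9])
features = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(greedy_dpp_batch(quality, features, batch_size=2))
```

In this toy pool, picking purely by top quality would select the two duplicates, whereas the determinant of a batch containing both duplicates is zero, so the greedy procedure selects items 0 and 2 instead. This mirrors, in miniature, the corpus duplication setting mentioned above. The exhaustive re-evaluation of determinants is chosen for readability, not efficiency.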

Source: https://arxiv.org/abs/2104.13936


