Human-in-the-loop Abstractive Dialogue Summarization. (arXiv:2212.09750v1 [cs.CL])

Abstractive dialogue summarization has received increasing attention
recently. Although most current dialogue summarization systems are trained
to maximize the likelihood of human-written summaries and have achieved
strong results, a large gap remains in generating summaries that humans
judge to be high quality, e.g., coherent and faithful, partly because
maximizing the likelihood of a single human-written summary is misaligned
with these goals. To this end, we propose to incorporate different levels
of human feedback into the training process, which guides the models to
capture the behaviors humans care about in summaries. Specifically,
we ask humans to highlight the salient information to be included in
summaries as local feedback, and to compare summaries in terms of
coherence, accuracy, coverage, conciseness, and overall quality as global
feedback. We then combine both local and global feedback to fine-tune the
dialogue summarization policy with reinforcement learning. Experiments
conducted on multiple datasets demonstrate the effectiveness and generalization
of our methods over the state-of-the-art supervised baselines, especially in
terms of human judgments.
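The combination of local (highlight-based) and global (comparison-based) feedback into a single training signal can be sketched as a weighted reward. The sketch below is an illustrative assumption, not the authors' implementation: `local_reward` measures coverage of human-highlighted tokens, `global_reward` stands in for a preference model trained on human comparisons, and `alpha` is a hypothetical interpolation weight.

```python
# Hypothetical sketch: combining local and global human feedback into one
# scalar reward for RL fine-tuning. All names and the weighting scheme are
# illustrative assumptions.

def local_reward(summary_tokens, highlighted_tokens):
    """Fraction of human-highlighted salient tokens covered by the summary."""
    if not highlighted_tokens:
        return 0.0
    summary_set = set(summary_tokens)
    covered = sum(1 for t in highlighted_tokens if t in summary_set)
    return covered / len(highlighted_tokens)

def global_reward(preference_score):
    """Scalar from a (stubbed) reward model trained on human comparisons of
    coherence, accuracy, coverage, conciseness, and overall quality."""
    return preference_score

def combined_reward(summary_tokens, highlighted_tokens, preference_score,
                    alpha=0.5):
    """Weighted mix of local and global feedback; alpha is an assumed
    interpolation weight, not a value from the paper."""
    return (alpha * local_reward(summary_tokens, highlighted_tokens)
            + (1 - alpha) * global_reward(preference_score))

summary = ["the", "team", "agreed", "to", "ship", "friday"]
highlights = ["ship", "friday"]
print(combined_reward(summary, highlights, preference_score=0.8))  # 0.9
```

In an RL setup such as policy-gradient fine-tuning, this scalar would serve as the per-summary reward that the summarization policy is optimized against.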
