Deep Quality Estimation: Creating Surrogate Models for Human Quality Ratings. (arXiv:2205.10355v1 [cs.CV])

Human ratings are abstract representations of segmentation quality. To
approximate human quality ratings on scarce expert data, we train surrogate
quality estimation models. We evaluate on a complex multi-class segmentation
problem, specifically glioma segmentation following the BraTS annotation
protocol. The training data features quality ratings from 15 expert
neuroradiologists on a scale ranging from 1 to 6 stars for various
computer-generated and manual 3D annotations. Even though the networks operate
on 2D images and with scarce training data, we can approximate segmentation
quality within a margin of error comparable to human intra-rater reliability.
Segmentation quality prediction has broad applications. While an understanding
of segmentation quality is imperative for successful clinical translation of
automatic segmentation quality algorithms, it can play an essential role in
training new segmentation models. Due to the split-second inference times, it
can be directly applied within a loss function or as a fully-automatic dataset
curation mechanism in a federated learning setting.



Related post