Privacy-Preserving Speech Representation Learning using Vector Quantization. (arXiv:2203.09518v1 [eess.AS])

With the popularity of virtual assistants (e.g., Siri, Alexa), the use of
speech recognition is now becoming more and more widespread. However, speech
signals contain a lot of sensitive information, such as the speaker’s identity,
which raises privacy concerns. The presented experiments show that the
representations extracted by the deep layers of speech recognition networks
contain speaker information. This paper aims to produce an anonymous
representation while preserving speech recognition performance. To this end, we
propose to use vector quantization to constrain the representation space and
induce the network to suppress the speaker identity. The choice of the
quantization dictionary size makes it possible to configure the trade-off
between utility (speech recognition) and privacy (speaker identity concealment).
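
As a rough illustration of how a quantization dictionary constrains the representation space, the following minimal sketch maps each frame-level feature vector to its nearest dictionary entry. It is not the authors' implementation: the function name `quantize`, the random features, the feature dimension, and the dictionary size of 4 are all assumptions made for the example, and the dictionary here is random rather than learned with the network.

```python
import numpy as np

def quantize(frames, codebook):
    """Replace each frame vector with its nearest codebook entry (Euclidean distance)."""
    # Squared distance between every frame and every code: shape (T, K)
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)          # index of the closest code per frame
    return codebook[indices], indices

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 8))    # hypothetical: 50 frame-level features of dimension 8
codebook = rng.normal(size=(4, 8))   # hypothetical dictionary with K = 4 entries

quantized, indices = quantize(frames, codebook)
# However many frames are passed in, at most K distinct vectors come out.
num_distinct = len(np.unique(quantized, axis=0))
```

With a dictionary of K entries, the quantized output can take at most K distinct values, so a smaller K leaves less room to encode speaker-specific detail, while a larger K retains more information for recognition.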



