Self-Supervised Speech Representations Preserve Speech Characteristics while Anonymizing Voices

Hernandez, Abner; Pérez-Toro, Paula Andrea; Vásquez-Correa, Juan Camilo; Orozco-Arroyave, Juan Rafael; Maier, Andreas; Yang, Seung Hee

arXiv:2204.01677 (cs)

[Submitted on 4 Apr 2022]

Abstract: Collecting speech data is an important step in training speech recognition systems and other speech-based machine learning models. However, privacy protection is a growing concern that must be addressed. The current study investigates voice conversion as a method for anonymizing voices. In particular, we train several voice conversion models using self-supervised speech representations, including Wav2Vec2.0, HuBERT and UniSpeech. Converted voices retain a low word error rate, within 1% of that of the original voice. The equal error rate increases from 1.52% to 46.24% on the LibriSpeech test set and from 3.75% to 45.84% on speakers from the VCTK corpus, indicating degraded speaker verification performance. Lastly, we conduct experiments on dysarthric speech data to show that speech features relevant to articulation, prosody, phonation and phonology can be extracted from anonymized voices for discriminating between healthy and pathological speech.
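The abstract reports two metrics: word error rate (WER), which measures how well the linguistic content survives conversion, and equal error rate (EER), which measures how well a speaker verification system can still identify the speaker. The following is a minimal sketch of how these two quantities are commonly computed. It is not the authors' evaluation code; the function names `word_error_rate` and `equal_error_rate` and the toy inputs are hypothetical, and the EER is approximated by a simple threshold sweep rather than an interpolated ROC curve.

```python
import numpy as np


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def equal_error_rate(target_scores, impostor_scores) -> float:
    """Approximate EER: the error rate at the threshold where the false
    acceptance rate and the false rejection rate are closest."""
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target, impostor]))
    # A trial is accepted when its score is >= the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    far = np.array([(impostor >= t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0)
```

Under this reading, an EER near 50% means the verification system cannot separate same-speaker trials from different-speaker trials any better than chance, which is the desired outcome for anonymization, while a WER close to that of the original speech means the spoken content remains intelligible to a recognizer.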

Comments:	Submitted for review at Interspeech 2022
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2204.01677 [cs.CL]
	(or arXiv:2204.01677v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2204.01677

Submission history

From: Abner Hernandez
[v1] Mon, 4 Apr 2022 17:48:01 UTC (168 KB)
