A short speech speaker recognition method based on transfer learning and multi-scale loss

ZHENG Litong; HONG Feng; ZHENG Wan; XU Weijie

doi:10.16300/j.cnki.1000-3630.24011601

Citation: ZHENG Litong, HONG Feng, ZHENG Wan, et al. A short speech speaker recognition method based on transfer learning and multi-scale loss[J]. Technical Acoustics, 2025, 44(4): 565-574. DOI: 10.16300/j.cnki.1000-3630.24011601

Abstract

In speaker recognition scenarios such as access control and time-and-attendance systems, using short Chinese digit strings as the speech corpus improves the user experience, but at the cost of an obvious degradation in recognition performance. This paper therefore proposes a speaker recognition framework for short speech that consists of a model pre-training phase and a transfer learning phase. First, an improved pre-training model is proposed, which effectively improves the generalization ability of the text-independent speaker recognition model through feature enhancement and a preheating network. Second, a multi-subspace cross-entropy speaker classification loss is proposed, which effectively improves adaptation from the source domain to the target domain in the transfer learning phase. Finally, a relative entropy loss between long- and short-speech embeddings is proposed, which improves performance by mapping the distribution of short-speech embeddings to that of long-speech embeddings, which carry richer timbre information. Experimental results on the Chinese short-speech dataset SHAL show that the proposed pre-trained model generalizes well, and that the joint loss combining the multi-subspace cross-entropy loss and the long- and short-speech embedding relative entropy loss also effectively improves model performance.
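
The relative entropy term can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the embeddings are turned into distributions with a softmax, the long-speech distribution is treated as the target, and the function names and example vectors are hypothetical. Relative entropy is the Kullback-Leibler divergence, which is zero when the two distributions coincide.

```python
import math


def softmax(values):
    """Turn a list of real numbers into a probability distribution."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def relative_entropy(p, q, eps=1e-12):
    """Relative entropy KL(p || q) of two equal-length discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def long_short_kl_loss(long_embedding, short_embedding):
    """Hypothetical loss: divergence of the short-speech distribution
    from the long-speech distribution, which acts as the target.
    Softmax normalization is an assumption, not taken from the paper."""
    p_long = softmax(long_embedding)
    q_short = softmax(short_embedding)
    return relative_entropy(p_long, q_short)


# Made-up 4-dimensional embeddings, for illustration only.
long_emb = [0.9, 0.1, -0.4, 0.3]
short_emb = [0.2, 0.0, -0.1, 0.5]
loss = long_short_kl_loss(long_emb, short_emb)
```

Minimizing such a term during transfer learning would pull the short-speech distribution toward the long-speech one; how the paper weights it against the multi-subspace cross-entropy loss is not stated in the abstract.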
