ZHENG Litong, HONG Feng, ZHENG Wan, et al. A short speech speaker recognition method based on transfer learning and multi-scale loss[J]. Technical Acoustics, 2025, 44(4): 1-10. DOI: 10.16300/j.cnki.1000-3630.24011601

A short speech speaker recognition method based on transfer learning and multi-scale loss

  • In speaker recognition applications such as access control and time-and-attendance systems, using short Chinese digit strings as the spoken corpus improves the user experience, but at the cost of a marked drop in recognition performance. This paper therefore proposes a speaker recognition framework for short speech that consists of a model pre-training phase and a transfer learning phase. First, an improved pre-training model is proposed, which improves the generalization ability of the text-independent speaker recognition model through feature enhancement and a warm-up (preheating) network. Second, a multi-subspace cross-entropy speaker classification loss is proposed, which improves adaptation from the source domain to the target domain during the transfer learning stage. Finally, a relative entropy loss between long-speech and short-speech embeddings is proposed; it improves performance by mapping the distribution of short-speech embeddings onto that of long-speech embeddings, which carry richer timbre information. Experimental results on the Chinese short speech dataset SHAL show that the proposed pre-trained model generalizes well, and that the joint loss, which combines the multi-subspace cross-entropy loss with the long/short-speech embedding relative entropy loss, further improves model performance.
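
The abstract names the two loss terms but does not give their formulas, so the following is only a minimal sketch of how such a joint loss could be assembled. Everything specific here is an assumption made for illustration and is not taken from the paper: the embedding is split into equal-sized chunks as "subspaces", each subspace has its own hypothetical linear classification head, embeddings are turned into distributions with a softmax, the relative entropy is computed as KL(long || short), and the two terms are combined with a hypothetical `weight` factor.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given raw logits and the true class index."""
    return -math.log(softmax(logits)[label])


def multi_subspace_ce(embedding, heads, label):
    """Average cross-entropy over several embedding subspaces.

    Assumption: the embedding is cut into len(heads) equal chunks, and each
    head is a weight matrix of shape (num_speakers, chunk_size).
    """
    size = len(embedding) // len(heads)
    losses = []
    for i, weights in enumerate(heads):
        chunk = embedding[i * size:(i + 1) * size]
        logits = [sum(w * x for w, x in zip(row, chunk)) for row in weights]
        losses.append(cross_entropy(logits, label))
    return sum(losses) / len(losses)


def kl_long_short(short_emb, long_emb):
    """Relative entropy KL(p_long || p_short) of softmax-normalised embeddings.

    Assumption: the long-speech embedding acts as the target distribution.
    """
    p = softmax(long_emb)
    q = softmax(short_emb)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def joint_loss(short_emb, long_emb, heads, label, weight=1.0):
    """Classification term plus weighted long/short relative entropy term."""
    ce = multi_subspace_ce(short_emb, heads, label)
    return ce + weight * kl_long_short(short_emb, long_emb)
```

In this sketch the relative entropy term vanishes when the short-speech and long-speech embeddings coincide, so only the classification term remains; a real training setup would compute both terms over mini-batches in an automatic differentiation framework.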

    Disclaimer: The English version of this article is automatically generated by iFLYTEK Translation and only for reference. We therefore are not responsible for its reasonableness, correctness and completeness, and will not bear any commercial and legal responsibilities for the relevant consequences arising from the English translation.