ZHENG Litong, HONG Feng, ZHENG Wan, et al. A short speech speaker recognition based on transfer learning and multi-scale loss[J]. Technical Acoustics, 2025, 44(0): 1-10. DOI: 10.16300/j.cnki.1000-3630.24011601

A short speech speaker recognition based on transfer learning and multi-scale loss

  • In speaker recognition scenarios such as access control and time-and-attendance systems, using short Chinese digit strings as the spoken corpus improves the user experience, but at the cost of a marked drop in recognition performance. To address this, this paper proposes a short-speech speaker recognition method consisting of a model pre-training phase and a transfer learning phase. First, an improved pre-trained model is proposed, which improves the generalization ability of the text-independent speaker recognition model through feature enhancement and a preheating network. Second, a multi-subcenter cross-entropy speaker classification loss is proposed, which improves adaptation from the source domain to the target domain during the transfer learning phase. In addition, a relative entropy (KL divergence) loss between long-speech and short-speech embeddings is proposed; it improves performance by mapping the distribution of short-speech embeddings toward that of long-speech embeddings, which carry richer timbre information. Experimental results on the Chinese short-speech dataset SHAL show that the proposed pre-trained model generalizes well, and that the joint loss combining the multi-subcenter cross-entropy loss with the long/short-speech embedding relative entropy loss further improves model performance.
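The abstract names the two loss terms but does not give their formulas. The following is a minimal sketch of how such a joint loss could be assembled, not the authors' implementation. It assumes that each speaker has several sub-center vectors and that the class logit is the largest scaled cosine similarity over them, and that the relative entropy term is KL(long || short) over softmax-normalized embeddings, with the long-speech embedding as the target. All function names, the `scale`, `temperature`, and `kl_weight` parameters, and the direction of the KL term are assumptions made for illustration.

```python
import math

def softmax(xs, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp((x - m) / temperature) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def subcenter_cross_entropy(embedding, subcenters, label, scale=10.0):
    """Cross-entropy over speakers with several sub-centers per speaker.

    subcenters[c] is a list of sub-center vectors for speaker c.
    Assumption: the logit of speaker c is the scaled maximum cosine
    similarity between the embedding and that speaker's sub-centers.
    """
    logits = [
        scale * max(cosine(embedding, w) for w in centers)
        for centers in subcenters
    ]
    probs = softmax(logits)
    return -math.log(probs[label])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def long_short_kl(short_emb, long_emb, temperature=1.0):
    """Assumed form: KL(long || short) over softmax-normalized embeddings."""
    p_long = softmax(long_emb, temperature)
    q_short = softmax(short_emb, temperature)
    return kl_divergence(p_long, q_short)

def joint_loss(short_emb, long_emb, subcenters, label, kl_weight=1.0):
    """Classification loss on the short utterance plus the weighted KL term."""
    ce = subcenter_cross_entropy(short_emb, subcenters, label)
    kl = long_short_kl(short_emb, long_emb)
    return ce + kl_weight * kl
```

In this sketch the KL term vanishes when the short-speech and long-speech embeddings coincide, so it only penalizes the short-speech embedding for drifting away from its long-speech counterpart. The actual weighting and normalization used in the paper may differ.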

    Disclaimer: The English version of this article is automatically generated by iFLYTEK Translation and only for reference. We therefore are not responsible for its reasonableness, correctness and completeness, and will not bear any commercial and legal responsibilities for the relevant consequences arising from the English translation.