ZHU Wenbo, WU Jing, JIN Hao, et al. Speaker recognition model based on multi-granularity spatio-temporal attention mechanism[J]. Technical Acoustics, 2025, 44(1): 93-101. DOI: 10.16300/j.cnki.1000-3630.23060601

Speaker recognition model based on multi-granularity spatio-temporal attention mechanism

  • Deep learning is widely applied in speaker recognition. However, current models suffer from low recognition rates and large, complex parameter sets, which makes lightweight speaker recognition difficult to achieve. To address this issue, a speaker recognition model named Multi-granularity Hybrid Compression Network (MGHC-NET) is proposed, based on a multi-granularity spatio-temporal attention mechanism. It consists of a multi-granularity mixing module (MGMM), a spatio-temporal attention mechanism module, and a channel compression module. The MGMM and the spatio-temporal attention mechanism module capture local temporal context features and spatial correlation features from a multi-scale modeling perspective, and couple the correlation features of different spatio-temporal information in a multi-granularity manner to enhance global spatio-temporal modeling capability. Meanwhile, the channel compression module aggregates different speaker channels and context-dependent representations to reduce the overall number of model parameters. Five-fold cross-validation experiments are conducted on multiple public datasets. The results show that the proposed method improves speaker recognition accuracy while reducing the number of parameters, and achieves the best performance among the mainstream models it is compared with. It therefore has practical value for lightweight speaker recognition.
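The abstract mentions five-fold cross-validation without describing how the folds are built. The following is a minimal sketch of a generic k-fold split in plain Python, not the authors' protocol: the function names `k_fold_indices` and `cross_validate`, the shuffling seed, and the `evaluate` callback are all assumptions introduced for illustration. A real speaker recognition experiment would likely also need to control how speakers and utterances are distributed across folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: shuffles sample indices once, then cuts them
    into k contiguous folds whose sizes differ by at most one.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(n_samples, evaluate, k=5, seed=0):
    """Average a user-supplied score over the k folds.

    `evaluate(train_idx, test_idx)` is an assumed callback that trains
    on the training indices and returns a score on the test indices.
    """
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

Each sample appears in exactly one test fold, so the averaged score uses every sample for evaluation once and for training in the remaining four rounds.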
    Disclaimer: The English version of this article is automatically generated by iFLYTEK Translation and only for reference. We therefore are not responsible for its reasonableness, correctness and completeness, and will not bear any commercial and legal responsibilities for the relevant consequences arising from the English translation.