结合多尺度卷积网络和双端注意力机制的水声目标识别

刘承伟; 洪峰; 冯海泓; 胡梦璐

doi:10.16300/j.cnki.1000-3630.2023.02.006

Underwater acoustic target recognition based on dual attention networks and multiresolution convolutional neural networks

Abstract

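Abstract (translated from the Chinese): Underwater acoustic target recognition is one of the main applications of passive sonar systems. To further improve the recognition rate of underwater targets under small-sample conditions, this paper proposes a method that fuses multi-scale convolution with a dual attention mechanism. First, features such as Mel-frequency cepstral coefficients, chroma spectra, and spectral contrast are extracted, and a three-dimensional aggregated feature is built from the multi-class feature subsets. Second, multi-scale convolutional filters are used to construct a multi-resolution convolutional neural network that better fits the time-frequency structure of the three-dimensional aggregated feature. In addition, a dual attention model captures the global dependencies and local characteristics of the samples. An exponentially weighted logarithmic cross-entropy function is used as the loss function to improve the recognition rate of classes with few samples. Experimental results show that the method achieves an average recognition rate of 95.5% on the ShipsEar dataset, a good classification result.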

Abstract: Underwater acoustic target recognition (UATR) based on radiated noise is one of the main applications of passive sonar. To further improve the classification accuracy for underwater targets under small-sample conditions, a method that combines dual attention networks (DAN) with a multi-resolution convolutional neural network (MCNN), referred to as DAN-MCNN, is proposed. First, three-dimensional (3D) aggregated features are built from multi-class feature subsets comprising Mel-frequency cepstral coefficients (MFCC), log-Mel spectrogram, chroma, spectral contrast, and tonnetz features. Then, motivated by the frequency perception mechanism of the human ear and the auditory attention mechanism, a multi-resolution pooling and convolution scheme is adopted to construct the MCNN architecture, which better adapts to the time-frequency structure of the 3D aggregated features. In addition, the DAN module is used to capture the global dependencies and local characteristics of the samples. An exponentially weighted categorical cross-entropy (EWCE) is used as the loss function to improve the recognition rate of categories with fewer samples. Experimental results show that the proposed approach achieves an average recognition accuracy of 95.5% on the ShipsEar dataset, a good classification result.
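
The abstract does not state the exact form of the EWCE loss, so the following is only a minimal sketch of the general idea of weighting cross-entropy more heavily for under-represented classes. The weighting rule `w_c = exp(-alpha * n_c / N)`, the parameter `alpha`, and both function names are assumptions introduced for illustration, not the authors' formulation.

```python
import math


def exponential_class_weights(class_counts, alpha=1.0):
    """Hypothetical weighting rule (an assumption, not taken from the paper).

    w_c = exp(-alpha * n_c / N), rescaled so that the mean weight is 1.
    Classes with fewer samples receive larger weights.
    """
    total = sum(class_counts)
    raw = [math.exp(-alpha * n / total) for n in class_counts]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def weighted_cross_entropy(logits, label, weights):
    """Class-weighted cross-entropy for a single sample.

    Uses the log-sum-exp trick so that large logits do not overflow.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob = logits[label] - log_sum_exp
    return -weights[label] * log_prob


# Example with made-up class counts: the rare class is weighted more heavily.
weights = exponential_class_weights([100, 10])
loss_rare = weighted_cross_entropy([0.0, 0.0], 1, weights)
loss_common = weighted_cross_entropy([0.0, 0.0], 0, weights)
```

With identical logits, the sample from the rarer class contributes a larger loss than the sample from the common class, which is the behaviour a loss of this kind is meant to encourage during training.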
