Ship-radiated noise recognition with learnable Gabor convolutional neural networks

谭天懿; 陈宗天; 严诺霄; 何宇欣; 林巨

doi:10.16300/j.cnki.1000-3630.24040901

Abstract

In practical use, ship-radiated noise recognition systems are degraded by underwater noise from biological activity, other vessels, and natural phenomena such as wind and waves, so classification performance is often poor. To strengthen the feature extraction capability of underwater target recognition systems, this paper combines time-frequency analysis with image processing techniques and proposes an end-to-end Gabor convolutional network with learnable parameters for ship noise recognition. The method uses a one-dimensional convolutional neural network (CNN) to implement the short-time Fourier transform (STFT) kernel and a two-dimensional (2D) CNN to implement a 2D Gabor filter bank; the resulting Gabor convolutional layer processes Log-Mel spectrograms and outputs an optimized time-frequency representation. During training, the model learns the parameters of the conversion from waveform to time-frequency spectrogram, which helps it capture more of the information carried by the target ship-radiated noise. Test results show that, compared with the baseline framework CNN10, the proposed method raises recognition accuracy on the DeepShip dataset from 65.49% to 72.24%. It also supports transfer learning from large-scale pretrained audio neural networks (PANNs) for audio pattern recognition, while increasing the total number of model parameters by only 1.06 × 10⁶. The method improves training speed and recognition accuracy, and it shows good robustness under several interference factors, including different marine ambient noise conditions, low cutoff frequencies, and time-frequency variations.
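The two building blocks named in the abstract, an STFT expressed as one-dimensional convolution kernels and a 2D Gabor filter bank, can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: the Hann window, the kernel size, and every function and parameter name below (`stft_conv_kernels`, `conv1d_strided`, `gabor_kernel_2d`, `sigma`, `theta`, `wavelength`, `gamma`, `psi`) are assumptions chosen for illustration. In a trainable model, values like these would presumably initialize convolution weights that are then updated by backpropagation.

```python
import math


def hann_window(n_fft):
    # Periodic Hann window (assumed; the abstract does not name a window).
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_fft) for n in range(n_fft)]


def stft_conv_kernels(n_fft):
    """Return (real, imag) 1-D kernels, one pair per frequency bin 0..n_fft/2."""
    window = hann_window(n_fft)
    real, imag = [], []
    for k in range(n_fft // 2 + 1):
        real.append([w * math.cos(2.0 * math.pi * k * n / n_fft)
                     for n, w in enumerate(window)])
        imag.append([-w * math.sin(2.0 * math.pi * k * n / n_fft)
                     for n, w in enumerate(window)])
    return real, imag


def conv1d_strided(signal, kernel, hop):
    """Valid-mode strided correlation, the operation a 1-D conv layer computes."""
    n = len(kernel)
    return [sum(signal[s + i] * kernel[i] for i in range(n))
            for s in range(0, len(signal) - n + 1, hop)]


def gabor_kernel_2d(size, sigma, theta, wavelength, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor filter on a size x size grid (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by theta, then apply envelope times carrier.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Demo 1: a pure tone centred on bin 4 of a 32-point STFT.
n_fft, hop = 32, 16
tone = [math.sin(2.0 * math.pi * 4 * n / n_fft) for n in range(64)]
real_k, imag_k = stft_conv_kernels(n_fft)
power = []  # power[k][t]: squared magnitude of bin k in frame t
for re_k, im_k in zip(real_k, imag_k):
    re = conv1d_strided(tone, re_k, hop)
    im = conv1d_strided(tone, im_k, hop)
    power.append([r * r + i * i for r, i in zip(re, im)])
peak_bin = max(range(len(power)), key=lambda k: power[k][0])

# Demo 2: a bank of four 7x7 Gabor kernels at orientations 0, 45, 90, 135 degrees.
gabor_bank = [gabor_kernel_2d(7, sigma=2.0, theta=k * math.pi / 4, wavelength=4.0)
              for k in range(4)]
```

Each kernel pair in `stft_conv_kernels` is a windowed cosine and sine at one frequency, so sliding it over the waveform with stride `hop` yields the real and imaginary parts of one STFT bin per frame. Each kernel in `gabor_bank` would be applied as a 2D convolution over a Log-Mel spectrogram, where the orientation controls whether the filter responds mainly to structure along the time axis, the frequency axis, or a diagonal sweep.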