Speech enhancement using beamforming and neural networks

Abstract: Speech enhancement is an important front-end stage of speech signal processing and directly affects the performance of back-end tasks such as speech recognition. Single-channel speech separation with neural networks has made great progress on the cocktail party problem, but its separation performance on complex speech mixtures remains unsatisfactory. To address the shortcomings of the single-channel case, a multi-channel structure is used to form superdirective beams in four directions, and these beams are combined with a neural network algorithm to enhance the target speech arriving from a specified direction. Simulation and experimental results show that the proposed algorithm clearly improves on both superdirective beamforming and spectral subtraction across multiple evaluation metrics.
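
To make the beamforming front end concrete, the following is a minimal sketch of how superdirective weights for four look directions could be computed at a single frequency. It uses the standard textbook formulation (a distortionless beamformer designed against a diffuse noise field, with diagonal loading) and is not taken from the paper itself. The array geometry (a four-microphone circular array of radius 4 cm), the look directions (0°, 90°, 180°, 270°), the 1 kHz design frequency, the loading value, and all function names are assumptions chosen for illustration. The neural network stage is not sketched here.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed room-temperature value)

def steering_vector(mic_xy, angle_rad, freq):
    """Far-field steering vector of a planar array toward angle_rad."""
    direction = np.array([np.cos(angle_rad), np.sin(angle_rad)])
    # Microphones closer to the source receive the wavefront earlier.
    delays = -(mic_xy @ direction) / C
    return np.exp(-2j * np.pi * freq * delays)

def diffuse_coherence(mic_xy, freq):
    """Coherence matrix of a spherically isotropic (diffuse) noise field."""
    dist = np.linalg.norm(mic_xy[:, None, :] - mic_xy[None, :, :], axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this is sin(kd) / (kd) with k = 2 pi f / c.
    return np.sinc(2.0 * freq * dist / C)

def superdirective_weights(mic_xy, angle_rad, freq, loading=1e-2):
    """w = G^-1 d / (d^H G^-1 d), with diagonal loading for robustness."""
    d = steering_vector(mic_xy, angle_rad, freq)
    gamma = diffuse_coherence(mic_xy, freq) + loading * np.eye(len(mic_xy))
    num = np.linalg.solve(gamma, d)
    return num / (d.conj() @ num)

# Hypothetical geometry: 4 microphones on a circle of radius 4 cm.
mic_angles = np.deg2rad([0.0, 90.0, 180.0, 270.0])
mics = 0.04 * np.stack([np.cos(mic_angles), np.sin(mic_angles)], axis=1)

# Hypothetical look directions for the four beams, designed at 1 kHz.
angles = np.deg2rad([0.0, 90.0, 180.0, 270.0])
beams = np.stack([superdirective_weights(mics, a, 1000.0) for a in angles])
```

Each row of `beams` holds the complex weights of one beam; applying `w.conj() @ x` to a frequency-domain snapshot `x` of the microphone signals gives that beam's output. In a full system the design would be repeated for every frequency bin, and the four beam outputs would then be passed to the network.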