Abstract:
Endpoint detection is a key technique in speech signal processing. To improve the accuracy and robustness of endpoint detection in low signal-to-noise ratio (SNR) environments, this paper proposes an endpoint detection algorithm that combines non-stationary noise suppression and modulation domain spectral subtraction with the power-normalized cepstral distance. First, the algorithm suppresses non-stationary noise and applies modulation domain spectral subtraction to remove residual noise, which raises the SNR and reduces speech distortion. Next, the power-normalized cepstral coefficients of each frame are extracted, and the power-normalized cepstral distance between each frame and the background noise is computed to obtain a robust endpoint detection parameter. Finally, the double threshold method is applied to this parameter to locate the endpoints. Experimental results show that the proposed algorithm effectively distinguishes speech frames from noise frames. Furthermore, it achieves better noise robustness across different noise types, even in low-SNR environments.
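
The last two stages described above (a per-frame distance to the background noise, followed by double-threshold detection) can be illustrated with a minimal sketch. It is not the paper's implementation: the noise suppression and cepstral feature extraction stages are omitted, the input is assumed to be an array of per-frame cepstral vectors, the background noise is assumed to be represented by the mean of the leading frames, the distance is assumed to be Euclidean, and the function names, the parameters `n_noise_frames`, `low`, `high`, `min_len`, and the specific double-threshold rule are all hypothetical choices made for illustration.

```python
import numpy as np


def cepstral_distance(ceps, n_noise_frames=10):
    """Distance of each frame's cepstral vector from a background-noise template.

    Assumption: the first n_noise_frames frames contain noise only, and their
    mean is used as the noise template. The Euclidean norm is also an assumption.
    """
    ceps = np.asarray(ceps, dtype=float)
    noise_mean = ceps[:n_noise_frames].mean(axis=0)
    return np.linalg.norm(ceps - noise_mean, axis=1)


def double_threshold(d, low, high, min_len=3):
    """Return speech segments as (start, end) frame indices, end exclusive.

    A segment is a maximal run of frames with d > low that contains at least
    one frame with d > high and lasts at least min_len frames. This is one
    common variant of the double threshold rule, chosen for illustration.
    """
    segments = []
    start = None          # index where the current candidate run began
    has_peak = False      # whether the run has exceeded the high threshold
    for i, value in enumerate(d):
        if value > low:
            if start is None:
                start, has_peak = i, False
            if value > high:
                has_peak = True
        elif start is not None:
            if has_peak and i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that extends to the final frame.
    if start is not None and has_peak and len(d) - start >= min_len:
        segments.append((start, len(d)))
    return segments


# Synthetic example: 30 frames of 4-dimensional "cepstra", noise-only at first.
ceps = np.zeros((30, 4))
ceps[12:20] = 1.0   # strong speech-like frames
ceps[11] = 0.3      # weak onset
ceps[20] = 0.3      # weak offset
ceps[25] = 0.3      # isolated weak bump that never crosses the high threshold

d = cepstral_distance(ceps, n_noise_frames=10)
segments = double_threshold(d, low=0.5, high=1.5, min_len=3)
print(segments)
```

The low threshold extends the detected segment to the weak onset and offset frames, while the high threshold rejects the isolated bump near the end. In practice both thresholds would have to be tuned to the distance statistics of the data.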