Abstract:
Deaf people currently rely mainly on sign language to communicate with hearing people, but most hearing people have no training in sign language. It is therefore important to translate sign language into spoken language that carries the deaf speaker's own accent and can still be understood by hearing people. To investigate the feasibility of text-to-speech (TTS) for deaf people, this paper first analyzes the characteristics of deaf speech and then develops TTS algorithms that generate natural, clear speech with the deaf speaker's own voice characteristics, together with evaluation methods for these algorithms. Based on these speech characteristics, the paper proposes a voice conversion and TTS method for mildly deaf people and a voice cloning method for severely deaf people. The analysis shows that the speech of mildly deaf people shares some similarities with the speech of hearing people, so the AdaIN-VC voice conversion model is used to produce speech that preserves the deaf speaker's timbre while being highly intelligible, and the converted speech is combined with the Tacotron2 speech synthesis model to map text to speech. Because the speech of severely deaf people is unstable, ECAPA-TDNN is used as the speaker encoder to obtain an accurate representation of their timbre. In addition, a style transfer module based on fundamental-frequency emotion classification is introduced to transfer style to the synthesized speech. The experimental results show that, while a certain degree of speaker similarity is maintained, the subjective opinion scores of the two mildly deaf participants increased from 2.53 and 3.06 to 2.88 and 3.21, respectively, and the word error rate of speech recognition fell from 100% to 80.77% and 76.91%, respectively. The subjective word error rate proposed in this paper also decreased significantly.
In the voice cloning experiment, the subjective similarity opinion score between the synthesized speech of the severely deaf speakers and their own timbre reached 3, and both the subjective naturalness opinion score and the emotional expressiveness of the deaf speech improved.