Speaker identification in noisy environment with use of the precise model of the human auditory system
Proc. of the International MultiConference of Engineers and Computer Scientists (IMECS2012), pp.92-95
T. Azetsu, M. Abuku, N. Suetake, E. Uchino
Oral presentation (general)
This paper discusses an approach to speaker identification in noisy environments that uses multi-dimensional pulse signals generated by a model of the human peripheral auditory system. The peripheral auditory model employed here consists of a basilar membrane, hair cells, and auditory nerves. The input to the model is a speech signal divided into frames, and its output is a set of multi-dimensional pulse signals for each frame. Feature vectors based on the poststimulus time histogram (PSTH) of the pulse signals are used for speaker identification. We propose adaptively setting the action-potential threshold for pulse generation in the auditory nerve model. To verify the noise immunity of the speaker identification, experiments were conducted on each Japanese vowel spoken by 12 speakers (9 males and 3 females). The effectiveness of the peripheral auditory model was verified by comparison with methods using the conventional LPC spectrum and excitation patterns.
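
The step from pulse signals to PSTH-based feature vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the adaptive threshold rule, so the rule used here (frame mean plus k times the frame standard deviation) is an assumption. The function names, the bin size, and the parameter k are likewise hypothetical. Each inner list stands in for one channel of the auditory nerve model's input over a single frame.

```python
from statistics import mean, pstdev
from typing import List


def adaptive_threshold(channel: List[float], k: float = 1.0) -> float:
    """Assumed rule (not from the paper): mean + k * std of the frame."""
    return mean(channel) + k * pstdev(channel)


def generate_pulses(channel: List[float], threshold: float) -> List[int]:
    """Emit a pulse (1) wherever the signal crosses the threshold upward."""
    pulses = [0] * len(channel)
    for i in range(1, len(channel)):
        if channel[i - 1] < threshold <= channel[i]:
            pulses[i] = 1
    return pulses


def psth(pulses: List[int], bin_size: int) -> List[int]:
    """Count pulses in consecutive, non-overlapping time bins."""
    return [sum(pulses[i:i + bin_size]) for i in range(0, len(pulses), bin_size)]


def frame_features(frame_channels: List[List[float]],
                   bin_size: int = 4, k: float = 1.0) -> List[int]:
    """Concatenate the per-channel PSTHs of one frame into a feature vector."""
    features: List[int] = []
    for channel in frame_channels:
        thr = adaptive_threshold(channel, k)
        features.extend(psth(generate_pulses(channel, thr), bin_size))
    return features


if __name__ == "__main__":
    # Two toy channels of eight samples each (illustrative values only).
    frame = [
        [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0],
    ]
    print(frame_features(frame, bin_size=4, k=0.5))
```

Because the threshold is derived from each frame's own statistics, a rise in background level shifts the threshold along with the signal. That is one plausible reading of "adaptive" here; the paper itself may use a different rule.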