Application of peripheral auditory model to speaker identification
Proc. of World Congress on Nature and Biologically Inspired Computing (NABIC2010), pp.673-678
M. Abuku, T. Azetsu, E. Uchino, N. Suetake
Oral presentation (general)
This paper discusses an approach to speaker identification using the multi-dimensional pulse signals generated by a model of the peripheral auditory system. The model consists of a basilar membrane, hair cells, and auditory nerves. Its input is a speech signal divided into frames, and its output is a set of multi-dimensional pulse signals for each frame. Feature vectors based on the post-stimulus time histogram (PSTH) of the pulse signals are used for speaker identification. To improve identification accuracy, the feature vectors are converted using the mean and the diagonal matrix of standard deviations. Experiments were conducted for each Japanese vowel spoken by 12 speakers (9 males and 3 females), and identification accuracy was evaluated by 5-fold leave-2-out cross-validation for each vowel. The effectiveness of the proposed method was verified by comparison with conventional LPC analysis.
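The abstract does not give the exact form of the PSTH features or of the feature vector conversion. As a minimal sketch only, the Python below assumes that a PSTH is a count of pulses per time bin within one frame, and that the conversion is the usual standardization x' = D^{-1}(x - mu), where mu is the mean of the training vectors and D is the diagonal matrix of their per-dimension standard deviations. The function names, the bin count, and the `eps` guard are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def psth(pulse_times, frame_length, n_bins):
    """Count pulses per time bin over one frame (assumed PSTH definition)."""
    counts = [0] * n_bins
    bin_width = frame_length / n_bins
    for t in pulse_times:
        if 0 <= t < frame_length:
            # Clamp the index to guard against floating-point edge cases.
            counts[min(int(t / bin_width), n_bins - 1)] += 1
    return counts


def fit_standardizer(vectors):
    """Return the per-dimension mean and standard deviation of the vectors."""
    columns = list(zip(*vectors))
    mu = [mean(c) for c in columns]
    sigma = [pstdev(c) for c in columns]
    return mu, sigma


def standardize(x, mu, sigma, eps=1e-12):
    """Apply x' = D^{-1} (x - mu) with D = diag(sigma) (assumed conversion)."""
    # eps is a hypothetical guard against zero-variance dimensions.
    return [(xi - m) / max(s, eps) for xi, m, s in zip(x, mu, sigma)]


# Toy example: two frames of 10 ms each, two bins per frame.
train = [
    psth([0.001, 0.004, 0.009], 0.01, 2),
    psth([0.002, 0.006, 0.007, 0.008], 0.01, 2),
]
mu, sigma = fit_standardizer(train)
converted = standardize(train[0], mu, sigma)
```

In a multi-channel setting, one plausible choice (again an assumption) is to compute a PSTH per auditory nerve channel and concatenate the results into a single feature vector before standardizing.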