Isolated Word Speech Recognition Using Mixed Transform
Abstract
Methods of speech recognition have been the subject of several studies over the past decade, and speech recognition has been one of the most exciting areas of signal processing. The mixed transform is a useful tool for speech signal processing, developed for its ability to improve feature extraction. Speech recognition includes three important stages: preprocessing, feature extraction, and classification. Recognition accuracy depends strongly on the feature extraction stage; therefore, different mixed-transform models for feature extraction are proposed. Each recorded isolated word is a 1-D signal, so the first step converts each 1-D word into a 2-D form. The second step of the word recognizer applies the 2-D FFT, the Radon transform, the 1-D IFFT, and a 1-D discrete wavelet transform in the first proposed model, while the second proposed model uses the discrete multicircularlet transform. The final stage of the proposed models uses the dynamic time warping algorithm for the recognition task. The performance of the proposed systems was evaluated using forty different isolated Arabic words, each recorded fifteen times in a studio, in a speaker-dependent setting. The results show recognition accuracies of 91% and 89% using the Daubechies Db1 and Db4 discrete wavelet transforms, respectively, and accuracies between 87% and 93% using the discrete multicircularlet transform for 9 sub-bands.
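To make the last two stages more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a single-level Haar (Db1) wavelet decomposition as a stand-in feature step and a textbook dynamic time warping distance with nearest-template matching. The function names (`haar_dwt`, `dtw_distance`, `classify`) and the toy data are hypothetical, and the 1-D to 2-D conversion, 2-D FFT, Radon transform, and IFFT stages are not reproduced here.

```python
import math


def haar_dwt(signal):
    """One level of the Haar (Db1) DWT: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences (absolute-difference local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(features, templates):
    """Return the label of the stored template closest to `features` under DTW."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


# Toy usage: two hypothetical word templates and one time-stretched test utterance.
templates = {
    "word_a": haar_dwt([0, 0, 1, 1, 2, 2, 1, 1])[0],
    "word_b": haar_dwt([5, 5, 6, 6, 7, 7, 6, 6])[0],
}
test_features = haar_dwt([0, 0, 0, 0, 1, 1, 2, 2, 1, 1])[0]
predicted = classify(test_features, templates)
```

Because DTW lets one sample align with several samples of the other sequence, the stretched test utterance still matches `word_a` even though its length differs from the template. This tolerance to speaking-rate variation is the usual motivation for DTW in isolated-word recognition.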