Dual Stages of Speech Enhancement Algorithm Based on Super Gaussian Speech Models

Humam Awad Hussein
Shams Moaied Hameed
Basheera M. Mahmmod
Sadiq H. Abdulhussain
Abir Jaafar Hussain

Abstract

Various speech enhancement algorithms (SEAs) have been developed over the last few decades. Each algorithm has its advantages and disadvantages because the speech signal is affected by environmental conditions. Distortion of speech causes the loss of important features, which makes the signal challenging to understand. SEAs aim to improve the intelligibility and quality of speech that has been degraded by different types of noise. In most applications, quality improvement is highly desirable because it can reduce listener fatigue, especially when the listener is exposed to high noise levels for extended periods (e.g., in manufacturing). Because SEAs reduce or suppress background noise to some degree, they are sometimes called noise suppression algorithms. In this research, an SEA based on different speech models (the Laplacian model or the Gaussian model) is designed and implemented using two discrete transforms: the Discrete Tchebichef Transform and the Discrete Tchebichef-Krawtchouk Transform. The proposed estimator consists of two stages of a Wiener filter that can effectively estimate the clean speech signal. The evaluation measures show that the proposed SEA can enhance the noisy speech signal, based on a comparison with other types of speech models and a self-comparison across different types and levels of noise. The improvement ratios of the presented algorithm in terms of the average SNRseg are 1.96, 2.12, and 2.03 for Buccaneer, White, and Pink noise, respectively.
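
The results in the abstract are reported against an average SNR measure, which we read as the segmental SNR: the per-frame SNR in dB between the clean reference and the enhanced signal, averaged over frames. The following is a minimal sketch of that measure only, not the authors' evaluation code. The function name, the frame length of 256 samples, and the clamping range of -10 dB to 35 dB are assumptions chosen for illustration (the clamping follows a common convention), and the abstract's "improvement ratio" is not defined here.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average per-frame SNR in dB between a clean reference and an estimate.

    frame_len, lo_db, and hi_db are illustrative assumptions, not values
    taken from the article.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    eps = 1e-12  # guards against log(0) and division by zero
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp each frame so silent or perfectly matched frames do not dominate.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


# Toy usage: a sine tone as the "clean" signal and a scaled copy as the estimate.
clean = [math.sin(2 * math.pi * 5 * n / 512) for n in range(512)]
scaled = [0.9 * x for x in clean]
print(segmental_snr(clean, scaled))
```

In this toy case the estimation error is one tenth of the clean amplitude in every frame, so each frame contributes roughly 20 dB. Real evaluations would compare the enhanced output against the clean recording for each noise type and input SNR level.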

Article Details

How to Cite
“Dual Stages of Speech Enhancement Algorithm Based on Super Gaussian Speech Models” (2023) Journal of Engineering, 29(09), pp. 1–13. doi:10.31026/j.eng.2023.09.01.
Section
Articles
