On classification improvement by using an approximate discriminative hidden Markov model

Johanna Carvajal González; Milton Sarria Paja; Germán Castellanos Domínguez

doi:10.17533/udea.redin.14726

Authors

Johanna Carvajal González National University of Colombia
Milton Sarria Paja National University of Colombia
Germán Castellanos Domínguez National University of Colombia

Keywords:

hidden Markov models, discriminative training, MMI, biosignals

Abstract

Hidden Markov models (HMMs) are statistical models that have been applied very successfully in speech recognition. However, the HMM is a general model for describing the dynamics of stochastic processes, so it can be applied to a wide variety of biomedical signals. HMM parameters are usually estimated with the maximum likelihood estimation (MLE) criterion. MLE has the disadvantage that it fits the distribution of each class separately: the models and data of the other classes do not participate in the parameter re-estimation. As a result, the ML criterion is not directly related to reducing the error rate, which has led many researchers to choose other training techniques, known as discriminative training, including maximum mutual information (MMI) estimation. In this work, we carry out an EEG classification in order to compare HMMs trained with ML estimation and with MMI estimation. The results obtained show better performance on all the databases used.
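
The difference between the two criteria can be illustrated with a minimal sketch. It is not the training procedure used in the paper: it only evaluates the two objective functions, assuming the per-class sequence log-likelihoods log p(O_i | lambda_c) have already been computed by some HMM implementation. The function names, the numbers, and the equal class priors are hypothetical and chosen for illustration only.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def ml_objective(loglik, labels):
    """ML criterion: sum of log p(O_i | model of the true class).
    The models of the competing classes do not appear at all."""
    return sum(row[y] for row, y in zip(loglik, labels))

def mmi_objective(loglik, labels, log_priors):
    """MMI criterion: sum of log posteriors of the true class.
    Every class model contributes through the normalising denominator."""
    total = 0.0
    for row, y in zip(loglik, labels):
        joint = [ll + lp for ll, lp in zip(row, log_priors)]
        total += joint[y] - log_sum_exp(joint)
    return total

# Hypothetical log-likelihoods for 3 sequences and 2 class models (assumed values).
loglik = [[-10.0, -12.0], [-11.0, -10.5], [-9.0, -15.0]]
labels = [0, 1, 0]
log_priors = [math.log(0.5), math.log(0.5)]  # assumed equal priors

ml = ml_objective(loglik, labels)
mmi = mmi_objective(loglik, labels, log_priors)

# Make the wrong-class model fit the first sequence worse: ML does not change,
# while MMI increases because the true class is now better separated.
loglik_sep = [[-10.0, -20.0], [-11.0, -10.5], [-9.0, -15.0]]
ml_sep = ml_objective(loglik_sep, labels)
mmi_sep = mmi_objective(loglik_sep, labels, log_priors)
```

In this sketch the ML objective depends only on the score of the correct model, whereas the MMI objective also rewards lowering the scores of the competing models, which is why it is more directly tied to class separation.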

Author Biographies

Johanna Carvajal González, National University of Colombia

Digital Signal Processing and Control Group.

Milton Sarria Paja, National University of Colombia

Digital Signal Processing and Control Group.

Germán Castellanos Domínguez, National University of Colombia

Digital Signal Processing and Control Group.
