On classification improvement by using an approximate discriminative hidden Markov model
Abstract
Hidden Markov models (HMMs) are statistical models that have been applied very successfully in speech recognition. However, the HMM is a general model for describing the dynamics of stochastic processes, so it can be applied to a wide variety of biomedical signals. HMM parameters are usually estimated with the maximum likelihood estimation (MLE) criterion. MLE has the disadvantage that it fits the distribution of each class separately: the models and data of the other classes do not take part in the parameter re-estimation. As a result, the ML criterion is not directly related to reducing the error rate, which has led many researchers to choose other training techniques known as discriminative training, including maximum mutual information (MMI) estimation. In this work, we carry out EEG classification in order to compare HMMs trained with ML estimation and with MMI estimation. The results show better performance with MMI training on all databases used.
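The difference between the two criteria can be illustrated with a minimal sketch. It assumes that the per-class HMM log-likelihoods log p(O_r | λ_c) have already been computed (for instance with the forward algorithm) and stored in a matrix with one row per sequence and one column per class. The function names, the uniform default prior, and the toy numbers are illustrative assumptions, not the implementation used in this work. The ML objective looks only at the column of the correct class, whereas the MMI objective is the log-posterior of the correct class and therefore also depends on the competing models.

```python
import numpy as np

def ml_objective(loglik, labels):
    """ML criterion: sum over sequences of log p(O_r | lambda_{c_r}).

    Only the likelihood under the correct class model is used.
    """
    loglik = np.asarray(loglik, dtype=float)
    labels = np.asarray(labels)
    return float(loglik[np.arange(len(labels)), labels].sum())

def mmi_objective(loglik, labels, log_priors=None):
    """MMI criterion: sum over sequences of log P(c_r | O_r).

    The denominator sums over all class models, so competing
    classes influence the objective.
    """
    loglik = np.asarray(loglik, dtype=float)
    labels = np.asarray(labels)
    n_seq, n_classes = loglik.shape
    if log_priors is None:
        # Assumption for this sketch: uniform class priors.
        log_priors = np.full(n_classes, -np.log(n_classes))
    joint = loglik + log_priors  # log p(O_r | lambda_c) + log P(c)
    # Numerically stable log-sum-exp over classes.
    row_max = joint.max(axis=1, keepdims=True)
    log_evidence = row_max[:, 0] + np.log(np.exp(joint - row_max).sum(axis=1))
    return float((joint[np.arange(n_seq), labels] - log_evidence).sum())

# Toy log-likelihoods (hypothetical values): 2 sequences, 2 classes.
toy_loglik = [[-10.0, -12.0],
              [-15.0, -11.0]]
toy_labels = [0, 1]
ml_value = ml_objective(toy_loglik, toy_labels)
mmi_value = mmi_objective(toy_loglik, toy_labels)
```

Adding the same constant to every entry of a row changes the ML value but leaves the MMI value untouched, which reflects that MMI rewards separation between the class models rather than the absolute fit of each one.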
Copyright (c) 2018 Revista Facultad de Ingeniería
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.