Expert knowledge-guided feature selection for data-based industrial process monitoring.
Abstract
Industrial processes operate in open environments and are characterized by uncertainty, unpredictability and nonlinear behavior. Rigorous measurement and monitoring are required to safeguard product quality, safety and economic performance. Data-based monitoring systems (e.g. those based on clustering) have therefore gained interest in academia and industry. However, industrial processes produce large volumes of complex, high-dimensional data with poorly defined domains and measurements that are sometimes redundant, noisy or inaccurate, with unknown parameters. When a mechanistic or structural model is not available or suitable, selecting relevant and informative variables (reducing the dimensionality) eases the pattern recognition needed to identify the functional states of the process. In this paper, we address the feature selection problem in data-based industrial process monitoring for exactly that setting. Expert knowledge guidance is used inside a wrapper feature selection method based on clustering. The reduced set of features is capable of representing the intrinsic structure of the historical data while integrating the expert knowledge about the process. A monitoring system is proposed and tested on an intensification reactor, the open plate reactor (OPR), with the thiosulfate and the esterification reactions. Results show that fewer variables are needed to correctly identify the process functional states.
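To make the idea of a clustering-based wrapper concrete, the following is a minimal sketch and not the authors' method. It assumes a plain k-means as the clustering step, a greedy forward search, and a simple purity score against expert-assigned functional states as the guidance criterion; the abstract does not specify any of these, and the function names (`kmeans`, `purity`, `forward_select`) and the toy data are hypothetical.

```python
import random
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means (a stand-in clustering step); returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def purity(cluster_labels, expert_states):
    """Fraction of samples that share their cluster's majority expert state."""
    total = 0
    for c in set(cluster_labels):
        states = [s for lab, s in zip(cluster_labels, expert_states) if lab == c]
        total += Counter(states).most_common(1)[0][1]
    return total / len(expert_states)


def forward_select(data, expert_states, k, min_gain=1e-9):
    """Greedy forward wrapper: add the feature that most improves the score, stop when none does."""
    n_features = len(data[0])
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = []
        for f in range(n_features):
            if f in selected:
                continue
            cols = selected + [f]
            subset = [tuple(row[c] for c in cols) for row in data]
            candidates.append((purity(kmeans(subset, k), expert_states), f))
        score, f = max(candidates, key=lambda t: (t[0], -t[1]))
        if score <= best + min_gain:
            break
        selected.append(f)
        best = score
    return selected, best


# Toy data (assumed): feature 0 separates the two states, feature 1 does not.
data = [(0.0, 1.0), (0.1, 3.0), (0.2, 1.1), (0.3, 3.1),
        (5.0, 1.2), (5.1, 3.2), (5.2, 1.3), (5.3, 3.3)]
states = ["normal"] * 4 + ["fault"] * 4
selected, score = forward_select(data, states, k=2)
print(selected, score)
```

Purity is used here only because it is easy to read; it grows trivially with the number of clusters, so a real monitoring system would need a partition quality criterion suited to its own clustering method.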
Copyright (c) 2018 Revista Facultad de Ingeniería
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.