Expert knowledge-guided feature selection for data-based industrial process monitoring
Keywords: process monitoring, fault detection, fuzzy clustering, feature selection
Industrial processes operate in open environments and are characterized by uncertainty, unpredictability and nonlinear behavior. Rigorous measurement and monitoring are required to ensure product quality, safety and economic performance. Data-based monitoring systems (e.g. clustering-based ones) have therefore gained interest in academia and industry. However, industrial processes generate large volumes of complex, high-dimensional data with poorly defined domains and with measurements that are sometimes redundant, noisy or inaccurate, with unknown parameters. When a mechanistic or structural model is not available or suitable, selecting relevant and informative variables (thereby reducing dimensionality) eases the pattern recognition needed to identify the functional states of the process. In this paper, we address the feature selection problem in data-based industrial process monitoring under those conditions. Expert knowledge guidance is used inside a wrapper feature selection method based on clustering. The reduced set of features is capable of representing the intrinsic structure of the historical data while integrating the expert knowledge about the process. A monitoring system is proposed and tested on an intensification reactor, the open plate reactor (OPR), for the thiosulfate and esterification reactions. Results show that fewer variables are needed to correctly identify the functional states of the process.
Copyright (c) 2018 Revista Facultad de Ingeniería
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Revista Facultad de Ingeniería, Universidad de Antioquia is licensed under the Creative Commons Attribution BY-NC-SA 4.0 license. https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en
The material published in the journal can be distributed, copied and exhibited by third parties if the respective credits are given to the journal. No commercial benefit can be obtained and derivative works must be under the same license terms as the original work.