A harmony search algorithm for clustering with feature selection
DOI:
https://doi.org/10.17533/udea.redin.14724
Keywords:
harmony search, clustering, feature selection
Abstract
This paper presents IHSK, a new clustering algorithm that performs feature selection and has linear complexity. The algorithm combines the harmony search and K-means algorithms. Feature selection relies on two elements: the concept of variability, and a heuristic that penalizes dimensions with a low probability of contributing to the current solution. The algorithm was tested on synthetic and real data sets, with promising results.
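The abstract does not give the details of IHSK, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each harmony is a set of k centroids, that every improvised harmony is refined with a single K-means step, and that "variability" can be approximated by keeping the highest-variance features. The function names, the parameters `hms`, `hmcr` and `par`, and the pitch-adjustment noise scale are all illustrative assumptions; the penalization heuristic mentioned in the abstract is not modeled here.

```python
import numpy as np

def kmeans_step(X, centroids):
    """One K-means iteration: assign each point, then recompute centroids."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    new = centroids.copy()
    for k in range(len(centroids)):
        members = X[labels == k]
        if len(members):  # an empty cluster keeps its previous centroid
            new[k] = members.mean(axis=0)
    return new

def sse(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).sum()

def select_by_variance(X, keep):
    """Assumed stand-in for 'variability': indices of the highest-variance features."""
    return np.sort(np.argsort(X.var(axis=0))[-keep:])

def harmony_kmeans(X, k, hms=5, hmcr=0.9, par=0.3, iters=50, seed=0):
    """Harmony search over centroid sets, each refined by one K-means step."""
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Harmony memory: hms candidate solutions seeded from random data points.
    memory = [kmeans_step(X, X[rng.choice(n, k, replace=False)])
              for _ in range(hms)]
    scores = [sse(X, h) for h in memory]
    for _ in range(iters):
        new = np.empty((k, dim))
        for i in range(k):
            if rng.random() < hmcr:
                # Memory consideration: reuse centroid i of a stored harmony.
                new[i] = memory[rng.integers(hms)][i]
                if rng.random() < par:
                    # Pitch adjustment: small random perturbation.
                    new[i] += rng.normal(0.0, 0.05, dim) * (hi - lo)
            else:
                # Random selection within the data bounds.
                new[i] = rng.uniform(lo, hi)
        new = kmeans_step(X, new)
        s = sse(X, new)
        worst = int(np.argmax(scores))
        if s < scores[worst]:  # replace the worst harmony if improved
            memory[worst], scores[worst] = new, s
    best = int(np.argmin(scores))
    return memory[best], scores[best]
```

In this sketch, a caller would first reduce the data with `select_by_variance` and then pass the selected columns to `harmony_kmeans`. Each iteration costs one K-means step over the data, so the cost grows linearly with the number of points for fixed k, dimension, and iteration count.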
Copyright (c) 2018 Revista Facultad de Ingeniería
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Revista Facultad de Ingeniería, Universidad de Antioquia is licensed under the Creative Commons Attribution BY-NC-SA 4.0 license. https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en
The material published in the journal can be distributed, copied and exhibited by third parties if the respective credits are given to the journal. No commercial benefit can be obtained and derivative works must be under the same license terms as the original work.