Incremental k most similar neighbor classifier for mixed data
DOI:
https://doi.org/10.17533/udea.redin.16307
Keywords:
supervised classification, incremental algorithms, artificial intelligence, pattern recognition
Abstract
This paper presents an incremental k-most similar neighbor classifier for mixed data and for similarity functions that are not necessarily distances. The algorithm is suitable for processing large data sets because it traverses the training data set only once and, at each step t, stores in main memory only the k most similar neighbors found so far. Several experiments with synthetic and real data are presented.
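The single-pass idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details are not given on this page: the names `mixed_similarity` and `classify_stream`, the tolerance `tol`, the majority vote, and the toy data are all assumptions made for illustration. The sketch keeps a bounded min-heap of the k most similar training objects seen so far, so memory stays proportional to k while the training set is read once.

```python
import heapq
from collections import Counter
from itertools import count


def mixed_similarity(a, b, tol=0.5):
    """Hypothetical similarity for mixed data: fraction of attributes that agree.

    Numeric attributes agree when they differ by at most `tol` (an assumed
    threshold); all other attributes agree when they are equal.
    This function is a similarity, not a distance.
    """
    agree = 0
    for x, y in zip(a, b):
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            agree += abs(x - y) <= tol
        else:
            agree += x == y
    return agree / len(a)


def classify_stream(query, training_stream, k=3, similarity=mixed_similarity):
    """Classify `query` in one pass over `training_stream`, keeping only k items."""
    heap = []  # min-heap of (similarity, arrival order, label); root = least similar kept
    order = count()  # unique tiebreaker so labels are never compared
    for features, label in training_stream:
        item = (similarity(query, features), next(order), label)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item[0] > heap[0][0]:
            # The new object is more similar than the worst one kept: swap it in.
            heapq.heapreplace(heap, item)
    # Assumed decision rule: majority vote among the k objects kept.
    votes = Counter(label for _, _, label in heap)
    return votes.most_common(1)[0][0]


def toy_training_stream():
    """Made-up mixed data (numeric, categorical, categorical), read one object at a time."""
    yield (1.0, "red", "round"), "apple"
    yield (1.2, "red", "round"), "apple"
    yield (5.0, "yellow", "long"), "banana"
    yield (5.3, "yellow", "long"), "banana"
    yield (1.1, "green", "round"), "apple"


print(classify_stream((1.1, "red", "round"), toy_training_stream(), k=3))
```

Because the training objects come from a generator, each one is discarded after it has been compared with the query; only the heap of at most k entries persists between steps.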
License
Copyright (c) 2018 Revista Facultad de Ingeniería

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Revista Facultad de Ingeniería, Universidad de Antioquia is licensed under the Creative Commons Attribution BY-NC-SA 4.0 license. https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
The material published in the journal can be distributed, copied and exhibited by third parties if the respective credits are given to the journal. No commercial benefit can be obtained and derivative works must be under the same license terms as the original work.