Machine learning applied to the prediction of diabetes mellitus, using socioeconomic and environmental information from health system users

Authors

DOI:

https://doi.org/10.17533/udea.rfnsp.e351168

Keywords:

machine learning, diabetes mellitus, environmental factors, socioeconomic factors, predictive model

Abstract

Objective: To apply machine learning models to support the early diagnosis of diabetes mellitus using environmental, social, economic, and health variables, without relying on the collection of clinical samples.

Methodology: Data were used from 10,889 users affiliated with the subsidized health system in southwestern Colombia, all diagnosed with hypertension and grouped into users without (74.3%) and with (25.7%) diabetes mellitus. Supervised models were trained using k-nearest neighbors, decision trees, and random forests, as well as ensemble-based models, on the database before and after balancing the number of cases in each diagnostic group. Algorithm performance was evaluated by splitting the database into training and test sets (70% and 30%, respectively), using accuracy, sensitivity, specificity, and area under the curve as metrics.
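The split-and-balance procedure described in the methodology can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the abstract does not say which balancing technique was used or whether balancing preceded the split, so the sketch assumes a stratified 70/30 split followed by random undersampling of the training portion only. The function names `stratified_split` and `undersample` are hypothetical, and the labels are synthetic, generated only to match the reported cohort size and prevalence.

```python
import random

def stratified_split(labels, test_frac=0.3, seed=0):
    """Return (train_idx, test_idx), keeping class proportions in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx

def undersample(indices, labels, seed=0):
    """Randomly drop majority-class rows until all classes have equal counts.

    Assumption: random undersampling stands in for the unspecified
    balancing method used in the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    n_min = min(len(rows) for rows in by_class.values())
    balanced = []
    for cls in sorted(by_class):
        balanced.extend(rng.sample(by_class[cls], n_min))
    return balanced

# Synthetic labels: 1 = with diabetes mellitus, 0 = without.
n_total = 10889
n_pos = round(n_total * 0.257)
labels = [1] * n_pos + [0] * (n_total - n_pos)

train_idx, test_idx = stratified_split(labels)
balanced_train = undersample(train_idx, labels)
```

Balancing only the training portion leaves the test set at its original prevalence, so the reported metrics would reflect the population the model is meant to serve.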

Results: Sensitivity increased significantly when using balanced data, from a maximum of 17.1% (unbalanced data) to as much as 57.4% (balanced data). The highest area under the curve (0.61) was obtained with the ensemble models, after balancing the amount of data in each group and encoding the categorical variables. The variables with the greatest weight were associated with hereditary aspects (24.65%) and ethnic group (5.59%), in addition to visual difficulty, low water consumption, a diet low in fruits and vegetables, and the consumption of salt and sugar.
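Why sensitivity can stay low on unbalanced data while accuracy still looks acceptable is easier to see from the confusion-matrix definitions. The sketch below computes the three threshold-based metrics named in the methodology. The confusion counts are hypothetical, chosen only so that prevalence (25.7%) and sensitivity (about 17.1%) echo the figures in the abstract; they are not the study's results, and `metrics` is an illustrative helper name.

```python
def metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly cleared
    }

# Hypothetical test set of 1,000 users: 257 with diabetes, 743 without.
# The classifier detects only 44 of the 257 cases but clears most non-cases.
hypothetical = metrics(tp=44, fn=213, tn=713, fp=30)

for name, value in hypothetical.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is driven mostly by the majority class, which is why sensitivity and area under the curve are the more informative figures for an imbalanced cohort.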

Conclusions: Although predictive models based on people's socioeconomic and environmental information are emerging as a tool for the early diagnosis of diabetes mellitus, their predictive capacity still needs to be improved.

Published

2023-03-27

How to Cite

Mejía JA, Oviedo-Benálcazar MA, Ordoñez JA, Valencia JF. Machine learning applied to the prediction of diabetes mellitus, using socioeconomic and environmental information from health system users. Rev. Fac. Nac. Salud Pública [Internet]. 2023 Mar. 27 [cited 2025 Feb. 1];41(2):e351168. Available from: https://revistas.udea.edu.co/index.php/fnsp/article/view/351168

Section

Mathematical modeling and simulation