An Efficient Approach for Diabetes Classification Using Feature Selection and Hyperparameter Tuning

Abstract

Background

Diabetes mellitus, which stems from insulin deficiency or insulin resistance, causes acute and chronic health problems and is driven by factors such as age, obesity, genetics, and lifestyle. It can lead to heart disease, vision problems, and kidney dysfunction, and the WHO reported notable diabetes-related mortality in 2019. Modern dietary habits have further increased the risk of diabetes. Machine learning techniques can support disease prediction and thereby enable timely intervention.

Objective

This work compares the effectiveness of several existing machine learning models for diabetes classification, with the goal of identifying the model that yields the highest accuracy.

Methods

We first pre-processed the data and then applied a range of machine learning methods to classify diabetes mellitus. We then compared the algorithms on both the complete feature set and the subset of features selected through Particle Swarm Optimization (PSO). The assessment covered accuracy, precision, F1 score, and log loss for the Support Vector Classifier (SVC), K-Nearest Neighbours (KNN), Random Forest (RF), AdaBoost, XGBoost, Extra Tree, and Decision Tree. Finally, hyperparameter tuning was applied to improve performance and reach the highest accuracy.
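The four evaluation metrics named above can be sketched in a few lines of plain Python. This is only an illustration of the standard binary-classification definitions, not the authors' evaluation code. The function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for the example.

```python
import math


def classification_metrics(y_true, y_pred, y_prob, eps=1e-15):
    """Return accuracy, precision, F1, and log loss for binary labels (0/1).

    y_pred holds hard 0/1 predictions; y_prob holds the predicted
    probability of class 1 for each sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Log loss is the mean negative log-likelihood of the true labels.
    # Probabilities are clipped so that log(0) never occurs.
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    return {"accuracy": accuracy, "precision": precision,
            "f1": f1, "log_loss": log_loss}


# Toy labels and probabilities, invented purely for illustration.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold
metrics = classification_metrics(y_true, y_pred, y_prob)
```

On this toy input, accuracy, precision, and F1 all equal 2/3, and the log loss is about 0.415. Unlike the other three metrics, log loss depends on the predicted probabilities rather than the hard labels, so it also penalizes a classifier that is confidently wrong.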

Results

The proposed model, HSVC, combines PSO-based feature selection with optimized hyperparameters and achieves an accuracy of 98.66%.
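To make the PSO feature selection step more concrete, the following is a minimal sketch of binary PSO used as a wrapper around a classifier. It is not the authors' HSVC implementation. To stay dependency-free, a leave-one-out 1-nearest-neighbour classifier stands in for the SVC as the fitness function, the data are synthetic, and the swarm parameters (`w`, `c1`, `c2`, swarm size, iteration count) are assumed values rather than settings taken from the paper. Hyperparameter tuning, which the study applies in addition, is not shown.

```python
import math
import random


def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-NN classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        if best_label == y[i]:
            correct += 1
    return correct / len(X)


def binary_pso(fitness, n_features, n_particles=8, n_iters=15,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 feature mask."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    pos[0] = [1] * n_features  # one particle starts with all features (baseline)
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # The sigmoid maps velocity to the probability that bit j is set.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Synthetic data (an assumption): 2 informative features followed by 4 noise features.
data_rng = random.Random(1)
X, y = [], []
for i in range(40):
    label = i % 2
    informative = [label * 2.0 + data_rng.gauss(0, 0.5) for _ in range(2)]
    noise = [data_rng.gauss(0, 3.0) for _ in range(4)]
    X.append(informative + noise)
    y.append(label)

baseline = loo_1nn_accuracy(X, y, [1] * 6)
mask, score = binary_pso(lambda m: loo_1nn_accuracy(X, y, m), n_features=6)
```

Because one particle is initialized with the full feature set, the returned `score` can never fall below `baseline`, which mirrors the comparison between complete and PSO-selected features described in the Methods. In a setting closer to the study, the fitness function would be replaced by the cross-validated accuracy of an SVC, and the tuned hyperparameters would be searched on the selected subset.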

Conclusion

The models developed in this study could potentially be applied to, or recommended for, the classification of other health conditions, such as Parkinson’s disease and heart disease.

/content/journals/raeeng/10.2174/0123520965291885240315051751
2024-04-01
2025-04-23

Article Type: Research Article
Keywords: diabetes; machine learning; feature selection; support vector machine; hyperparameter; PSO