Computational Analysis of Diabetes Kidney Diseases Using Machine Learning

Abstract

The increasing complexity of healthcare, coupled with an ageing population, poses significant challenges for decision-making in healthcare delivery. Smart decision support systems can alleviate some of these challenges by providing clinicians with timely, personalized insights. Such systems leverage vast amounts of patient data, advanced analytics, and predictive modeling to give clinicians a comprehensive view of individual patient needs and likely outcomes.

Researchers and clinicians increasingly need faster solutions for diagnosing various diseases, and have therefore turned to Machine Learning (ML) algorithms. ML is a subfield of Artificial Intelligence (AI) that provides useful tools for data analysis, process automation, and other tasks in healthcare systems. Its use in healthcare continues to grow because of its capacity to learn from data.

In this paper, the following algorithms are applied to the diagnosis of diabetes and kidney disease: Gradient Boosting Classifier (GBC), Random Forest Classifier (RFC), Extra Trees Classifier (ETC), Support Vector Classifier (SVC), Multilayer Perceptron (MLP) neural network, and Decision Tree Classifier (DTC). In our model, the Gradient Boosting Classifier is combined with repeated cross-validation to obtain better results. The experimental analysis was performed on both the unbalanced and the balanced dataset. The accuracies achieved (unbalanced / balanced, in %) are: GBC 75.7 / 92.2, ETC 75.7 / 90.1, RFC 74.4 / 80.0, SVC 62.5 / 66.4, MLP 58.3 / 63.0, and DTC 59.4 / 74.5. Comparing these results, GBC outperforms the other algorithms.
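The evaluation protocol described above, a Gradient Boosting Classifier scored with repeated (stratified) cross-validation, can be sketched in scikit-learn as follows. This is a minimal illustration, not the authors' code: the synthetic dataset, its class weights, and the fold counts are assumptions standing in for the real diabetes/kidney-disease data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic stand-in for an imbalanced clinical dataset
# (the 70/30 class split is an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.7, 0.3], random_state=42)

clf = GradientBoostingClassifier(random_state=42)

# Repeated stratified k-fold: each of the 3 repeats reshuffles
# and re-splits the data into 5 folds, giving 15 accuracy scores.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=42)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")

print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Averaging over repeated shuffled splits reduces the variance of the estimate relative to a single k-fold run, which is why repeated cross-validation is a common choice when comparing classifiers on small medical datasets.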

