Skip to content
2000
Volume 19, Issue 6
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background: Krüppel-like factors (KLFs) are a family of transcription factors containing zinc fingers that regulate various cellular processes. KLF proteins are associated with human diseases, such as cancer, cardiovascular diseases, and metabolic disorders. The KLF family consists of 18 members with diverse expression profiles across numerous tissues. Accurate identification and annotation of KLF proteins is crucial, given their involvement in important biological functions. Although experimental approaches can identify KLF proteins precisely, large-scale identification is complicated, slow, and expensive. Methods: In this study, we developed RDR100, a novel random forest (RF)-based framework for predicting KLF proteins based on their primary sequences. First, we identified the optimal encodings for ten different features using a recursive feature elimination approach, and then trained their respective model using five distinct machine learning (ML) classifiers. Results: The performance of all models was assessed using independent datasets, and RDR100 was selected as the final model based on its consistent performance in cross-validation and independent evaluation. Conclusion: Our results demonstrate that RDR100 is a robust predictor of KLF proteins. RDR100 web server is available at https://procarb.org/RDR100/.

Loading

Article metrics loading...

/content/journals/cbio/10.2174/1574893618666230905102407
2024-07-01
2025-06-18
Loading full text...

Full text loading...

/content/journals/cbio/10.2174/1574893618666230905102407
Loading
This is a required field
Please enter a valid email address
Approval was a Success
Invalid data
An Error Occurred
Approval was partially successful, following selected items could not be processed due to error
Please enter a valid_number test