2000
Volume 20, Issue 4
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Enhancers are short functional regions (50–1500 bp) of the genome that play an important role in activating gene transcription in the presence of transcription factors. Many human diseases, such as cancer and inflammatory bowel disease, are correlated with genetic variations in enhancers. Precise recognition of enhancers provides useful insights into the pathogenesis of human diseases and their treatment. High-throughput experiments are considered essential tools for characterizing enhancers; however, these methods are laborious, costly and time-consuming. Computational methods are considered alternative solutions for the accurate and rapid identification of enhancers. Over the past years, numerous computational predictors have been devised for predicting enhancers and their strength. A comprehensive review and thorough assessment are indispensable for systematically comparing the performance of sequence-based bioinformatics tools for enhancer prediction. Given the increasing interest in this domain, we conducted a large-scale analysis and assessment of the state-of-the-art enhancer predictors to evaluate their scalability and generalization power. Additionally, we classified the existing approaches into three main groups: conventional machine learning, ensemble, and deep learning-based approaches. Furthermore, the study focuses on the factors that are crucial for developing precise and reliable predictors, such as the design of trusted benchmark and independent datasets, feature representation schemes, feature selection methods, classification strategies, evaluation metrics and web servers. Finally, the insights from this review are expected to provide important guidelines to the research community and pharmaceutical companies in general, and to developers of high-throughput tools for the detection and characterization of enhancers in particular.

/content/journals/cbio/10.2174/0115748936285942240513064919
2024-06-04
2025-04-22