Volume 32, Issue 10
  • ISSN: 0929-8673
  • E-ISSN: 1875-533X

Abstract

Background

The novel coronavirus pneumonia (COVID-19) outbreak that began in late 2019 has killed millions of people worldwide. Coronaviruses such as SARS-CoV and SARS-CoV-2 cause severe respiratory diseases, including severe acute respiratory syndrome. Many peptides in the host defense system have antiviral activity. Establishing efficient models to identify anti-coronavirus peptides is therefore a meaningful task.

Methods

To this end, we propose a new prediction model, EACVP. The model uses the evolutionary scale language model ESM-2 (ESM-2 LM) to represent peptide sequence information. ESM-2 is a protein language model built with natural language processing techniques and pre-trained on a highly diverse and dense dataset (UR50/D 2021_04); we use the pre-trained model to obtain 320-dimensional peptide sequence features. Compared with traditional feature extraction methods, the information represented by ESM-2 LM is more comprehensive and stable. The features are then fed into a convolutional neural network (CNN), and the lightweight convolutional block attention module (CBAM) applies attention to the CNN feature maps along the spatial and channel dimensions. To verify the rationality of the model structure, we performed ablation experiments on the benchmark and independent test datasets, and we compared EACVP with existing methods on the independent test dataset.
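
As a rough illustration of the attention step described above, the following NumPy sketch applies CBAM-style channel attention followed by spatial attention to a one-dimensional feature map. It is not the authors' implementation (their code is in the repository cited in the Conclusion). The feature-map shape, the reduction ratio, the kernel width of 7, the random weights, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feats, w1, w2):
    """Reweight channels. feats has shape (channels, length)."""
    avg_desc = feats.mean(axis=1)  # average-pooled descriptor per channel
    max_desc = feats.max(axis=1)   # max-pooled descriptor per channel

    def shared_mlp(v):
        # Two-layer bottleneck MLP with ReLU, shared by both descriptors.
        return w2 @ np.maximum(w1 @ v, 0.0)

    weights = sigmoid(shared_mlp(avg_desc) + shared_mlp(max_desc))
    return feats * weights[:, None]

def spatial_attention(feats, kernel):
    """Reweight sequence positions. kernel has shape (2, k) with odd k."""
    avg_map = feats.mean(axis=0)   # average over channels per position
    max_map = feats.max(axis=0)    # max over channels per position
    k = kernel.shape[1]
    pad = k // 2
    # Zero-pad along the sequence axis so the output keeps the same length.
    stacked = np.pad(np.stack([avg_map, max_map]), ((0, 0), (pad, pad)))
    length = feats.shape[1]
    scores = np.array(
        [np.sum(stacked[:, i:i + k] * kernel) for i in range(length)]
    )
    weights = sigmoid(scores)
    return feats * weights[None, :]

def cbam(feats, w1, w2, kernel):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feats, w1, w2), kernel)

# Hypothetical sizes and random weights, for demonstration only.
rng = np.random.default_rng(0)
channels, length, reduction = 8, 20, 4
feats = rng.standard_normal((channels, length))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1
kernel = rng.standard_normal((2, 7)) * 0.1

out = cbam(feats, w1, w2, kernel)
print(out.shape)
```

Because both attention maps pass through a sigmoid, each output value is the input value scaled by a factor between 0 and 1, and the feature map keeps its shape. In a trained model the weights would be learned rather than drawn at random.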

Results

Experimental results show that the ACC, F1-score, and MCC of EACVP are 3.95%, 35.65%, and 0.0725 higher, respectively, than those of the most advanced existing methods. We also tested EACVP on the ENNAVIA-C and ENNAVIA-D datasets; the results show that EACVP transfers well to other datasets and is a powerful tool for predicting anti-coronavirus peptides.
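
For readers less familiar with the three reported metrics, the sketch below computes accuracy (ACC), F1-score, and the Matthews correlation coefficient (MCC) from a binary confusion matrix using their standard definitions. The function name and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Return ACC, F1, and MCC for a binary confusion matrix."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # MCC ranges from -1 to 1; return 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"ACC": acc, "F1": f1, "MCC": mcc}

# Hypothetical counts, for demonstration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

ACC and F1 are proportions and are often quoted as percentages, while MCC is a correlation-type score between -1 and 1, which is consistent with the improvements above being reported in percent for the first two metrics and as an absolute difference for MCC.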

Conclusion

The results show that EACVP can fully characterize peptide information, achieves high prediction accuracy, and generalizes to different datasets. The data and code for this article are available at https://github.com/JYY625/EACVP.git.

/content/journals/cmc/10.2174/0109298673287899240303164403
2024-03-15
2025-04-01