Volume 20, Issue 3
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background

Peptides of many types have broad implications for human health and disease, and some drug peptides play significant roles in sensory science, drug research, and cancer biology. Predicting and classifying peptide sequences is therefore important to a range of industries. However, determining peptide categories through biological experiments is time-consuming and expensive. Moreover, protein sequence classification and prediction are challenging because sequence data are high-dimensional, nonlinear, and irregular, and because many protein sequences are unknown or unlabeled. An accurate and efficient method for predicting peptide categories is therefore needed.

Methods

In this work, we used two pre-trained models, TextCNN (Convolutional Neural Networks for Text Classification) and Transformer, to extract sequence features. The Transformer encoder extracted the overall (global) semantic information of the sequences, and TextCNN extracted the local semantic information. The two sets of features were concatenated into a new feature, which was then used for classification. To validate this approach, we conducted experiments on the BP, THP, and DPP-IV datasets and compared the results with those of several pre-trained models.
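The fusion step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: a single-head self-attention layer stands in for the full Transformer encoder, the weights are random and untrained, and every name and hyperparameter (`FusionSketch`, `d_model`, `kernel_sizes`, `n_filters`, the example peptide string) is a hypothetical choice made only to show how a global feature and a local feature can be concatenated before classification.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; id 0 is reserved for padding
TOKEN_ID = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq, max_len):
    """Map a peptide string to a fixed-length array of token ids (0 = padding)."""
    ids = [TOKEN_ID[aa] for aa in seq[:max_len]]
    return np.array(ids + [0] * (max_len - len(ids)))


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class FusionSketch:
    """Untrained forward pass: global (attention) + local (TextCNN-style) features."""

    def __init__(self, d_model=16, kernel_sizes=(2, 3, 4), n_filters=8, n_classes=2, seed=0):
        rng = np.random.default_rng(seed)
        self.embed = rng.normal(0, 0.1, (len(AMINO_ACIDS) + 1, d_model))
        self.embed[0] = 0.0  # padding token embeds to the zero vector
        self.wq, self.wk, self.wv = (rng.normal(0, 0.1, (d_model, d_model)) for _ in range(3))
        self.filters = [rng.normal(0, 0.1, (k, d_model, n_filters)) for k in kernel_sizes]
        fused_dim = d_model + n_filters * len(kernel_sizes)
        self.w_out = rng.normal(0, 0.1, (fused_dim, n_classes))

    def global_features(self, x, mask):
        """Self-attention over all positions, then mean pooling over real residues."""
        q, k, v = x @ self.wq, x @ self.wk, x @ self.wv
        scores = q @ k.T / np.sqrt(x.shape[1])
        scores = np.where(mask[None, :], scores, -1e9)  # padded keys get ~zero weight
        attended = softmax(scores) @ v
        return (attended * mask[:, None]).sum(axis=0) / mask.sum()

    def local_features(self, x):
        """Convolve windows of k residues, apply ReLU, then max-pool over positions."""
        pooled = []
        for f in self.filters:
            k = f.shape[0]
            windows = np.stack([x[i:i + k] for i in range(x.shape[0] - k + 1)])
            conv = np.einsum("wkd,kdf->wf", windows, f)  # (n_windows, n_filters)
            pooled.append(np.maximum(conv, 0).max(axis=0))
        return np.concatenate(pooled)

    def fused(self, seq, max_len=20):
        """Concatenate the global and local views into one feature vector."""
        ids = encode(seq, max_len)
        mask = ids > 0
        x = self.embed[ids]
        return np.concatenate([self.global_features(x, mask), self.local_features(x)])

    def predict_proba(self, seq):
        """Linear layer plus softmax over the fused feature."""
        return softmax(self.fused(seq) @ self.w_out)


model = FusionSketch()
features = model.fused("GLFDIVKKVV")  # arbitrary example string, not from any dataset
probabilities = model.predict_proba("GLFDIVKKVV")
```

With these assumed settings the fused vector has 16 global dimensions plus 3 × 8 local dimensions. In a trained model, the embedding, attention, convolution, and output weights would all be learned jointly rather than sampled at random.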

Results

Because TextCNN and the Transformer encoder extract features from different perspectives, the concatenated feature contains multi-view information, which improves the accuracy of the peptide predictor.

Conclusion

Our model achieved better evaluation metrics than the models it was compared with, demonstrating its efficacy in peptide sequence prediction and classification.

/content/journals/cbio/10.2174/0115748936294345240510112941
2024-05-24
2025-05-25
  • Article Type:
    Research Article
Keyword(s): Drug; fusion learning; multi-view feature; peptide sequence; TextCNN; transformer encoder