Volume 21, Issue 1
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background

G-protein coupled receptors (GPCRs) are a large family of membrane proteins characterized by seven transmembrane helices. These receptors play a pivotal role in numerous physiological processes. Many computational methods have been proposed to identify GPCRs, including our earlier method, EMCBOW-GPCR. However, the feature extraction technique used in EMCBOW-GPCR is susceptible to out-of-vocabulary errors, which leaves room to improve the accuracy of GPCR identification.

Methods

To address this limitation, we propose a novel approach termed GPCR-AFPN. The method uses the FastText algorithm to extract features from protein sequences and employs a deep neural network as the predictive model to improve prediction accuracy.
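
As a rough illustration of why subword-based embeddings avoid out-of-vocabulary errors, the sketch below splits a protein sequence into overlapping k-mer "words" and builds each word vector from hashed character n-grams, in the spirit of FastText. It is not the GPCR-AFPN implementation: the k-mer length, n-gram range, embedding size, bucket count, MD5 hashing, and the randomly initialised table are all assumptions chosen for illustration, and a trained model would learn the bucket vectors rather than sample them.

```python
import hashlib
import random

DIM = 8          # embedding size (hypothetical value)
BUCKETS = 1000   # number of hash buckets (hypothetical value)


def kmers(sequence, k=3):
    """Split a protein sequence into overlapping k-mer 'words'."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def subword_ngrams(word, n_min=2, n_max=3):
    """Character n-grams of a word wrapped in boundary markers < and >."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(wrapped) - n + 1)]


def bucket(ngram):
    """Map an n-gram to a bucket with a stable hash (MD5 is an assumption)."""
    digest = hashlib.md5(ngram.encode("utf-8")).hexdigest()
    return int(digest, 16) % BUCKETS


# Random vectors stand in for trained n-gram embeddings.
_rng = random.Random(0)
TABLE = [[_rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(BUCKETS)]


def word_vector(word):
    """Average the bucket vectors of a word's n-grams.

    Any word, including one never seen in training, has n-grams that
    hash to some bucket, so a vector can always be produced.
    """
    rows = [TABLE[bucket(g)] for g in subword_ngrams(word)]
    return [sum(col) / len(rows) for col in zip(*rows)]


def sequence_vector(sequence, k=3):
    """Average the k-mer word vectors into one fixed-length feature vector."""
    vectors = [word_vector(w) for w in kmers(sequence, k)]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


if __name__ == "__main__":
    features = sequence_vector("MKTAYIAKQR")  # toy sequence, not a real GPCR
    print(len(features))
```

A lookup-table embedding would fail on a k-mer missing from its vocabulary, whereas here every k-mer maps to a vector through its n-grams. The fixed-length output of `sequence_vector` is the kind of input a downstream classifier could consume.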

Results

To validate the proposed GPCR-AFPN method, we conducted five-fold cross-validation and independent tests. The experimental results indicate that GPCR-AFPN outperforms existing methods.
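
For readers unfamiliar with the evaluation protocol, the following minimal sketch shows how sample indices can be partitioned for five-fold cross-validation. The function name `k_fold_indices`, the shuffling seed, and the unstratified split are assumptions for illustration; the abstract does not state how the folds were constructed.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each sample lands in exactly one test fold; the other k - 1 folds
    form the training set for that round.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for round_no, (train, test) in enumerate(k_fold_indices(23), start=1):
        print(round_no, len(train), len(test))
```

An independent test differs in that the held-out sequences are set aside before any training or model selection and are scored only once at the end.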

Conclusion

Overall, the proposed method, GPCR-AFPN, can improve the accuracy of GPCR identification. For researchers who wish to apply it, a user-friendly web server for GPCR-AFPN is available at www.lzzzlab.top/gpcrafpn/, and the corresponding code can be downloaded from https://github.com/454170054/GPCR-AFPN.

/content/journals/cbio/10.2174/0115748936349783250101124112
2025-01-08
2026-02-16

  • Article Type:
    Research Article
Keyword(s): deep learning; feature extraction; GPCRs; support vector machine; webserver; word embedding