Volume 21, Issue 3
  • ISSN: 1570-1646
  • E-ISSN: 1875-6247

Abstract

Background

Hypothetical proteins (HPs) are those proteins whose functions are unknown; therefore, the present study was carried out to predict similarity-based functionality of HPs in selected bacteria and

Methods

Annotation-based approaches using Pfam, orthology, STRING, bidirectional best BLAST hit, PSLpred, SubLoc, CELLO, homology modeling, and other computational tools were used to evaluate the functionality of HPs.
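
The bidirectional best BLAST hit step named above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each search result has already been reduced to (query, subject, bit score) tuples, and the function names and protein identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# One search result: (query_id, subject_id, bit_score). Assumed input shape.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
a_vs_b = [("hpA1", "refB1", 210.0), ("hpA1", "refB2", 95.0),
          ("hpA2", "refB2", 120.0)]
b_vs_a = [("refB1", "hpA1", 205.0), ("refB2", "hpA1", 99.0),
          ("refB2", "hpA2", 80.0)]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this toy input only hpA1 and refB1 select each other; hpA2 picks refB2, but refB2 prefers hpA1, so that pair is not treated as a putative ortholog.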

Results

Thirty-one domains in both bacterial species were retrieved based on the E-value score and compared with those of bacterial species already present in databases. Statistical analysis was performed to identify which features performed well.
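
Selecting domains by E-value, as described above, might look like the sketch below. The abstract does not state a cutoff, so the threshold of 1e-5 is an assumption, and the record type, protein identifiers, and domain names are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class DomainHit(NamedTuple):
    protein_id: str
    domain: str
    e_value: float


def filter_by_evalue(hits: Iterable[DomainHit],
                     threshold: float = 1e-5) -> List[DomainHit]:
    """Keep hits at or below the threshold, most significant first.

    The default threshold is an assumed value, not one taken from the study.
    """
    kept = [hit for hit in hits if hit.e_value <= threshold]
    return sorted(kept, key=lambda hit: hit.e_value)


# Hypothetical domain hits, for illustration only.
hits = [
    DomainHit("hp_001", "DUF_example", 3e-12),
    DomainHit("hp_002", "Hydrolase_example", 0.4),
    DomainHit("hp_003", "Peptidase_example", 2e-30),
]
significant = filter_by_evalue(hits)
```

A lower E-value means fewer matches of that quality are expected by chance, so sorting in ascending order puts the strongest domain assignments first.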

Conclusion

Out of 31 HPs found in strain , 14 domains were found to be uncharacterized in their functionality, while 2 uncharacterized domains in the case of were assigned a function using similarity-based approaches. Annotating HPs remains a challenge in bacteria because the assignments rely on similarity to proteins in other species.

/content/journals/cp/10.2174/0115701646303687240805072304
2024-09-01
2025-07-10