Volume 20, Issue 3
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Introduction

Identifying and predicting protein-coding regions in DNA sequences plays a pivotal role in genomic research. This paper introduces a hybrid approach to this task that combines digital bandpass filtering with wavelet transforms and several spectral estimation techniques to enhance exon prediction. Specifically, the Haar and Daubechies wavelet transforms are applied to improve the accuracy of protein-coding region (exon) prediction by extracting fine details that may be obscured in the original DNA sequences.
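
As an illustration of the wavelet step, the following is a minimal sketch of a single-level Haar discrete wavelet transform and its inverse. It is not the authors' implementation: the function names `haar_dwt` and `haar_idwt` are hypothetical, and the decomposition level, boundary handling, and Daubechies filter order used in the paper are not stated in the abstract. The approximation coefficients carry the smoothed signal, while the detail coefficients carry the fine-scale differences.

```python
import math

def haar_dwt(signal):
    """Single-level Haar DWT; returns (approximation, detail) coefficients.

    Assumes an even-length input (an illustrative simplification).
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1 / math.sqrt(2)  # orthonormal scaling factor
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt, reconstructing the original samples."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out

# Usage: decompose a short numeric sequence and reconstruct it.
approx, detail = haar_dwt([4.0, 6.0, 10.0, 12.0])
restored = haar_idwt(approx, detail)
```

Because the Haar transform is orthonormal, the reconstruction recovers the input up to floating-point rounding. A Daubechies transform follows the same analysis and synthesis pattern with longer filters.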

Methods

This work applies Haar and Daubechies wavelet transforms, both non-parametric and parametric spectral estimation techniques, and a digital bandpass filter to detect peaks in exon regions. In addition, the Electron-Ion Interaction Potential (EIIP) method converts symbolic DNA sequences into numerical values, and a Sum-of-Sinusoids (SoS) mathematical model with optimized parameters is used to model the DNA sequences, which supports accurate gene identification.
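
The EIIP mapping and the search for exon-related peaks can be sketched as follows. This is a hedged illustration, not the paper's pipeline: it uses the commonly cited EIIP values for the four nucleotides and substitutes a sliding-window DFT evaluated at the period-3 frequency (2π/3), which acts as a crude bandpass filter around that frequency. The abstract does not specify the actual filter design, window length, or spectral estimators, so the function names and the `window` and `step` parameters here are assumptions.

```python
import cmath

# Commonly cited EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

def eiip_encode(seq):
    """Map a symbolic DNA string to its EIIP numerical sequence."""
    return [EIIP[base] for base in seq.upper()]

def period3_power(signal, window=351, step=1):
    """Power at frequency 2*pi/3 in each sliding window.

    The window length is an assumed value; a multiple of 3 makes the
    response to a constant (non-periodic) signal vanish.
    """
    # Precompute the complex exponential for one window.
    kernel = [cmath.exp(-2j * cmath.pi * n / 3) for n in range(window)]
    powers = []
    for start in range(0, len(signal) - window + 1, step):
        acc = sum(signal[start + n] * kernel[n] for n in range(window))
        powers.append(abs(acc) ** 2)
    return powers

# Usage: a toy sequence with strict period-3 structure versus a flat one.
periodic = period3_power(eiip_encode("ATG" * 10), window=9)
flat = period3_power(eiip_encode("A" * 30), window=9)
```

In this toy case the periodic sequence yields non-zero power in every window, while the homogeneous sequence yields power that is zero up to rounding. On real genomic data, peaks in this trace are the candidates for protein-coding regions.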

Results

The results show a substantial improvement in identification accuracy for protein-coding regions. For peak location detection, applying the Haar and Daubechies wavelet transforms improves peak localization by approximately (0.01, 3-5 dB). Applying the non-parametric and parametric spectral estimation techniques improves peak localization by approximately (0.01, 4 dB) compared with the original signal. The proposed approach also achieves higher accuracy than existing approaches.

Conclusion

These findings not only bridge gaps in DNA sequence analysis but also offer a promising pathway for advancing exonic region prediction and gene identification in genomics research. The hybrid methodology presented stands as a robust contribution to the evolving landscape of genomic analysis techniques.

/content/journals/cbio/10.2174/0115748936287244240117065325
2024-01-30
2025-05-09