Skip to content
2000
Volume 19, Issue 4
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Promoters are DNA fragments located near the transcription initiation site, they can be divided into strong promoter type and weak promoter type according to transcriptional activation and expression level. Identifying promoters and their strengths in DNA sequences is essential for understanding gene expression regulation. Therefore, it is crucial to further improve predictive quality of predictors for real-world application requirements. Here, we constructed the latest training dataset based on the RegalonDB website, where all the promoters in this dataset have been experimentally validated, and their sequence similarity is less than 85%. We used one-hot and nucleotide chemical property and density (NCPD) to represent DNA sequence samples. Additionally, we proposed an ensemble deep learning framework containing a multi-head attention module, long short-term memory present, and a convolutional neural network module. The results showed that iPSI(2L)-EDL outperformed other existing methods for both promoter prediction and identification of strong promoter type and weak promoter type, the AUC and MCC for the iPSI(2L)-EDL in identifying promoter were improved by 2.23% and 2.96% compared to that of PseDNC-DL on independent testing data, respectively, while the AUC and MCC for the iPSI(2L)- EDL were increased by 3.74% and 5.86% in predicting promoter strength type, respectively. The results of ablation experiments indicate that CNN plays a crucial role in recognizing promoters, the importance of different input positions and long-range dependency relationships among features are helpful for recognizing promoters. Furthermore, to make it easier for most experimental scientists to get the results they need, a userfriendly web server has been established and can be accessed at http://47.94.248.117/IPSW(2L)-EDL.

Loading

Article metrics loading...

/content/journals/cbio/10.2174/0115748936264316230926073231
2024-05-01
2025-01-10
Loading full text...

Full text loading...

/content/journals/cbio/10.2174/0115748936264316230926073231
Loading
This is a required field
Please enter a valid email address
Approval was a Success
Invalid data
An Error Occurred
Approval was partially successful, following selected items could not be processed due to error
Please enter a valid_number test