Volume 22, Issue 5
  • ISSN: 1389-2029
  • E-ISSN: 1875-5488

Abstract

Background: Splice junctions are key to the transition from pre-messenger RNA to mature messenger RNA in many multi-exon genes that undergo alternative splicing. Because a very high percentage of multi-exon genes undergo alternative splicing, identifying splice junctions is an attractive research topic with important implications.

Objective: The aim of this paper is to develop a deep learning model capable of identifying splice junctions in RNA sequences, using 13,666 unique sequences of primate RNA.

Methods: A Long Short-Term Memory (LSTM) neural network model is developed that classifies a given sequence as EI (exon-intron splice), IE (intron-exon splice), or N (no splice). The model is trained on groups of trinucleotides, and its performance is evaluated on separate validation and test data to prevent bias.

Results: Model performance was measured using accuracy and F-score on the test data. The finalized model achieved an average accuracy of 91.34% and an average F-score of 91.36% over 50 runs.

Conclusion: Comparisons show that the model is highly competitive with recent Convolutional Neural Network structures. The proposed LSTM model achieves the highest accuracy and F-score among published alternative LSTM structures.
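The abstract does not specify how sequences are split into trinucleotide groups before they reach the LSTM. As a minimal sketch of one plausible preprocessing step, the Python below splits a sequence into trinucleotides and maps each to an integer index. The A/C/G/T alphabet, the `stride` parameter (3 for non-overlapping groups, 1 for overlapping), the reserved index 0 for ambiguous tokens, and all function names are assumptions made for illustration, not details taken from the paper.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed alphabet; the abstract does not state it

# All 64 possible trinucleotides, indexed from 1.
# Index 0 is reserved (by assumption) for tokens with ambiguous symbols.
TRINUCLEOTIDE_INDEX = {
    "".join(triplet): i + 1
    for i, triplet in enumerate(product(NUCLEOTIDES, repeat=3))
}

# The three classes named in the abstract.
LABELS = ("EI", "IE", "N")


def to_trinucleotides(sequence, stride=3):
    """Split a sequence into 3-letter tokens.

    stride=3 gives non-overlapping groups, stride=1 gives overlapping ones.
    Trailing symbols that do not fill a whole token are dropped.
    """
    seq = sequence.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, stride)]


def encode(sequence, stride=3):
    """Map a sequence to a list of integer trinucleotide indices."""
    return [
        TRINUCLEOTIDE_INDEX.get(token, 0)
        for token in to_trinucleotides(sequence, stride)
    ]
```

Integer sequences of this kind are the usual input to an embedding layer placed in front of an LSTM, and the class labels would be encoded by their position in `LABELS`. Whether the authors used overlapping or non-overlapping groups is not stated here, so `stride` is left as a parameter.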

/content/journals/cg/10.2174/1389202922666211011143008
2021-08-01
2025-06-22
  • Article Type:
    Research Article
Keyword(s): classification; deep learning; LSTM; neural networks; RNA-seq; Splice junction