Skip to content
2000
Volume 17, Issue 1
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background: The CRISPR system can quickly achieve the editing of different gene loci by changing a small sequence on a single guide RNA. But the off-target event limits the further development of the CRISPR system. How to improve the efficiency and specificity of this technology and minimize the risk of off-target have always been a challenge. For genome-wide CRISPR Off-Target Cleavage Sites (OTS) prediction, an important issue is data imbalance, that is, the number of true OTS identified is much less than that of all possible nucleotide mismatch loci. Methods: In this work, based on the sequence-generating adversarial network (SeqGAN), positive offtarget sequences were generated to amplify the off-target gene locus OTS dataset of Cpf1. Then we trained the data by a deep Convolutional Neural Network (CNN) to obtain a predictor with stronger generalization ability and better performance. Results: In 10-fold cross-validation, the AUC value of the CNN classifier after SeqGAN balance was 0.941, which was higher than that of the original 0.863 and over-sampling 0.929. In independence testing, the AUC value of the CNN classifier after SeqGAN balance was 0.841, which was higher than that of the original 0.833 and over-sampling 0.836. The PR value was 0.722 after SeqGAN, which was also about higher 0.16 than the original data and higher about 0.03 than over-sampling. Conclusion: The sequence generation antagonistic network SeqGAN was firstly used to deal with data imbalance processing on CRISPR data. All the results showed that the SeqGAN can effectively generate positive data for CRISPR off-target sites.

Loading

Article metrics loading...

/content/journals/cbio/10.2174/1574893616666210727162650
2022-01-01
2024-12-27
Loading full text...

Full text loading...

/content/journals/cbio/10.2174/1574893616666210727162650
Loading

  • Article Type:
    Research Article
Keyword(s): CNN; CRISPR; data imbalance; off-target; SeqGAN; single-guide RNA
This is a required field
Please enter a valid email address
Approval was a Success
Invalid data
An Error Occurred
Approval was partially successful, following selected items could not be processed due to error
Please enter a valid_number test