Skip to content
2000
Volume 18, Issue 4
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background: Non-model species lacking public genomic resources have an extra handicap in bioinformatics that could be assisted by parameter tuning and the use of alternative software. Indeed, for RNA-seq-based gene differential expression analysis, parameter tuning could have a strong impact on the final results that should be evaluated. However, the lack of gold-standard datasets with known expression patterns hampers robust evaluation of pipelines and parameter combinations. Objective: The aim of the presented workflow is to assess the best differential expression analysis pipeline among several alternatives, in terms of accuracy. To achieve this objective, an automatic procedure of gold-standard construction for simulation-based benchmarking is implemented. Methods: The workflow, which is divided into four steps, simulates read libraries with known expression values to enable the construction of gold-standards for benchmarking pipelines in terms of true and false positives. We validated the workflow with a case study consisting of real RNA-seq libraries of radiata pine, a forest tree species with no publicly available reference genome. Results: The workflow is available as a freeware application (DEGoldS) consisting on sequential Bash and R scripts that can run in any UNIX OS platform. The presented workflow proved to be able to construct a valid gold-standard from real count data. Additionally, benchmarking showed that slight pipeline modifications produced remarkable differences in the outcome of differential expression analysis. Conclusion: The presented workflow solves the issues associated with robust gold-standard construction for benchmarking in differential expression experiments and can accommodate with a wide range of pipelines and parameter combinations.

Loading

Article metrics loading...

/content/journals/cbio/10.2174/1574893618666230222122054
2023-05-01
2025-06-23
Loading full text...

Full text loading...

/content/journals/cbio/10.2174/1574893618666230222122054
Loading

  • Article Type:
    Research Article
Keyword(s): Assembly; benchmarking; non-model species; RNA-seq; simulation; transcriptome
This is a required field
Please enter a valid email address
Approval was a Success
Invalid data
An Error Occurred
Approval was partially successful, following selected items could not be processed due to error
Please enter a valid_number test