Volume 18, Issue 1
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background: In single-cell RNA-seq data, clustering methods are used to identify cell types in order to understand cell differentiation and development. Because clustering methods are sensitive to the high dimensionality of single-cell RNA-seq data, one effective solution is to select a subset of genes to reduce the dimensionality. Numerous methods, with different underlying assumptions, have been proposed for choosing the subset of genes used for clustering.

Objective: To guide users in choosing a suitable gene selection method, we give an overview of different gene selection methods and compare them in terms of the differences between the selected gene sets, clustering performance, running time, and stability.

Results: We first review the data preprocessing strategies and gene selection methods used in analyzing single-cell RNA-seq data. We then analyze the overlaps among the gene sets selected by different methods and compare the clustering performance obtained with the different feature gene sets. The analysis reveals that the gene sets selected by the methods based on highly variable genes and high mean genes are the most similar, and that highly variable genes play an important role in clustering. Additionally, selecting too few genes can compromise clustering performance: for example, SCMarker selected fewer genes than other methods, which led to poorer clustering performance than M3Drop.

Conclusion: Different gene selection methods perform differently in different scenarios. HVG works well on full-transcript sequencing datasets, NBDrop and HMG perform better on 3’ end sequencing datasets, M3Drop and HMG are more suitable for large datasets, and SCMarker is the most consistent across different preprocessing methods.
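
As a rough illustration of the overlap analysis described in the Results, the sketch below ranks genes in a toy count matrix by variance and by mean, then measures how similar the two selected sets are with a Jaccard index. It is not the implementation of HVG or HMG evaluated in the article. The function names, the normalization scale of 10,000, the use of plain variance and mean as ranking scores, and the Jaccard index as the overlap measure are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, pvariance


def log_normalize(counts, scale=10_000):
    """Library-size normalize each cell, then apply log1p.

    counts: list of cells, each a list of raw counts per gene.
    The scale factor is an assumed, commonly used default.
    """
    normalized = []
    for cell in counts:
        total = sum(cell) or 1  # avoid division by zero for empty cells
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized


def top_genes(scores, k):
    """Return the indices of the k highest-scoring genes as a set."""
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard index between two gene sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def compare_selections(counts, k):
    """Select k genes by variance and k genes by mean; report their overlap."""
    expr = log_normalize(counts)
    per_gene = list(zip(*expr))  # transpose to gene-major order
    # Simplified stand-ins for "highly variable" and "high mean" gene scores.
    hvg = top_genes([pvariance(g) for g in per_gene], k)
    hmg = top_genes([mean(g) for g in per_gene], k)
    return hvg, hmg, jaccard(hvg, hmg)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: 50 cells x 200 genes with gene-dependent count ranges.
    toy = [[rng.randint(0, 1 + g % 25) for g in range(200)] for _ in range(50)]
    hvg, hmg, overlap = compare_selections(toy, k=20)
    print(f"shared genes: {len(hvg & hmg)}, Jaccard index: {overlap:.2f}")
```

The same `jaccard` helper could be applied to the gene sets returned by any pair of selection methods, which is one simple way to quantify how much two methods agree.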

/content/journals/cbio/10.2174/1574893618666221103114320
2023-01-01
2025-06-22