
Background: A very popular technique for isolating significant genes from cancerous tissues is the application of clustering algorithms to data obtained from DNA microarray experiments. Aim: The objective of the present work is to take the chromosomal identity of every gene into account before clustering, by creating a three-dimensional structure of the form Chromosomes×Genes×Samples. The k-means algorithm and a triclustering technique called δ-TRIMAX are then applied independently to this structure. Materials and Methods: The algorithm was developed in the Python programming language (v. 3.5.1). We used two distinct public datasets containing healthy control samples and tissue samples from bladder cancer patients. Background correction was performed by subtracting the median global background from the median local background from the signal intensity. Samples were normalized with the quantile normalization method. Three known algorithms were applied to test the “gene cube”: a classical k-means, a transformed 3D k-means and δ-TRIMAX. Results: Our proposed data structure consists of a 3D matrix of the form Chromosomes×Genes×Samples. Clustering analysis of that structure gave very good results: we were able to identify gene expression patterns among samples, genes and chromosomes. Discussion: To the best of our knowledge, this is the first time that such a structure has been reported, and it constitutes a useful tool for gene classification from high-throughput gene expression experiments. Conclusions: Such approaches could prove useful for understanding disease mechanisms, and those of tumors in particular.
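The quantile normalization and the Chromosomes×Genes×Samples arrangement named in Materials and Methods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `quantile_normalize` and `build_gene_cube`, the toy expression values, the chromosome labels and the choice to pad shorter chromosomes with NaN are all assumptions made for illustration, since the abstract does not say how chromosomes with different gene counts were handled.

```python
import numpy as np


def quantile_normalize(expr):
    """Quantile-normalize a genes x samples matrix (assumes no tied values)."""
    expr = np.asarray(expr, dtype=float)
    order = np.argsort(expr, axis=0)      # per-sample sorting order
    ranks = np.argsort(order, axis=0)     # rank of each gene within its sample
    # Reference distribution: mean across samples of the k-th smallest value.
    mean_by_rank = np.sort(expr, axis=0).mean(axis=1)
    return mean_by_rank[ranks]


def build_gene_cube(expr, chromosomes):
    """Arrange a genes x samples matrix as chromosomes x genes x samples.

    Chromosomes carry different numbers of genes, so shorter slices are
    padded with NaN (an assumption of this sketch).
    """
    expr = np.asarray(expr, dtype=float)
    chromosomes = np.asarray(chromosomes)
    labels = sorted(set(chromosomes.tolist()))
    max_genes = max(int((chromosomes == c).sum()) for c in labels)
    cube = np.full((len(labels), max_genes, expr.shape[1]), np.nan)
    for i, c in enumerate(labels):
        rows = expr[chromosomes == c]     # genes located on chromosome c
        cube[i, : rows.shape[0], :] = rows
    return cube, labels


# Hypothetical toy data: 4 genes x 3 samples, with made-up chromosome labels.
expr = [[5.0, 4.0, 3.0],
        [2.0, 1.0, 4.0],
        [3.0, 7.0, 6.0],
        [4.0, 2.0, 8.0]]
chromosomes = ["chr1", "chr2", "chr1", "chr1"]

norm = quantile_normalize(expr)
cube, labels = build_gene_cube(norm, chromosomes)
print(cube.shape)  # (chromosomes, max genes per chromosome, samples)
```

After normalization every sample shares the same value distribution, and each slice `cube[i]` holds the genes of one chromosome. A clustering routine applied to this array would need to skip or mask the NaN padding.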