Current Bioinformatics - Volume 17, Issue 6, 2022
Graph Neural Networks in Biomedical Data: A Review
Authors: You Li, Guiyang Zhang, Pan Wang, Zuo-Guo Yu and Guohua Huang
With the development of sequencing technology, various forms of biomedical data, including genomics, transcriptomics, proteomics, microbiomics, and metabolomics data, are increasingly emerging. These data are an external manifestation of cell activity and mechanism. Analyzing them in depth is critical to uncovering and understanding the nature of life. Because the data are heterogeneous and complex, they are very challenging for traditional machine learning to handle. Over the past ten years, a new machine learning framework called graph neural networks (GNNs) has been proposed. The graph is a very powerful tool for representing a complex system, and GNNs are becoming a key to opening the mysterious door of life. In this paper, we focus on summarizing state-of-the-art GNN algorithms (GraphSAGE, graph convolutional network, graph attention network, graph isomorphism network and graph auto-encoder) and briefly introduce the main principles behind them. We also review some applications of GNNs in biomedicine, and finally discuss possible future directions for GNNs.
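As a rough illustration of the graph convolutional network named in the abstract, the sketch below implements one propagation step in the commonly used form ReLU(D^-1/2 (A + I) D^-1/2 H W). It is a minimal NumPy sketch, not code from the review; the function name `gcn_layer`, the toy three-node graph, and the random weights are all assumptions made for the example.

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                # add self-loops
    deg = a_hat.sum(axis=1)                # node degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization: scale row i and column j by deg^-1/2.
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)  # ReLU

# Hypothetical toy graph: a path of three nodes (0 - 1 - 2).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
features = np.eye(3)                       # one-hot node features
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2))          # illustrative, untrained weights
hidden = gcn_layer(adj, features, weights)
print(hidden.shape)
```

Each output row mixes a node's own features with those of its neighbours, which is the neighbourhood-aggregation idea shared by the GNN variants listed above.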
Structural Biology Meets Biomolecular Networks: The Post-AlphaFold Era
Authors: Wenying Yan and Guang Hu
Background: Recent progress in protein structure prediction by AlphaFold has opened new avenues for deciphering biological functions from the perspective of structural biology at the proteomics level. Methods: To meet these challenges, this perspective introduces three scales of networks for protein structures, namely structural protein-protein networks, protein structural networks, and elastic network models, for high-throughput modeling of protein functional sites and protein dynamics. Conclusion: In the post-AlphaFold era, the integration of biomolecular networks may be leveraged to develop a modeling framework that addresses protein structure-based functions, with applications in drug discovery.
Identifying Biomarkers of Cisplatin Sensitivity in Non-Small Cell Lung Cancer via Comprehensive Integrative Analysis
Authors: Xin-Ping Xie, Wulin Yang, Lei Zhang and Hong-Qiang Wang
Background: Only 30-40% of non-small cell lung cancer (NSCLC) patients are clinically sensitive to cisplatin-based chemotherapy. Thus, it is necessary to identify biomarkers for personalized cisplatin chemotherapy in NSCLC. However, data heterogeneity and low-value density make it challenging to detect reliable cisplatin efficacy biomarkers using traditional analysis methods. Objective: This paper aims to find reliable cisplatin efficacy biomarkers for NSCLC patients using comprehensive integrative analysis. Methods: We searched online resources and collected six NSCLC transcriptomics data sets with responses to cisplatin. The six data sets were divided into two groups: a learning group for biomarker identification and a test group for independent validation. We performed comprehensive integrative analysis under two kinds of frameworks, i.e., one-level and two-level, with three integrative models. Pathway analysis was performed to estimate the biological significance of the resulting biomarkers. For independent validation, the log-rank statistic was used to test the significance of the difference between the Kaplan-Meier (KM) curves of two patient groups, and the Cox proportional-hazards model was used to test how the expression of a gene is associated with patients’ survival time. In addition, a permutation test was performed to verify the predictive power of a biomarker panel for cisplatin efficacy. For comparison, we also analyzed each learning data set individually using three popular differential expression models: Limma, SAM, and RankSum. Results: A total of 318 genes were identified as a core panel of cisplatin efficacy markers for NSCLC patients, exhibiting consistent differential expression between cisplatin-sensitive and cisplatin-resistant groups across studies. A total of 129 of 344 KEGG pathways were found to be enriched in the core panel, giving a picture of the molecular mechanism of cisplatin resistance in NSCLC. By mapping onto the KEGG pathway tree, we found that the KEGG level I pathway module, genetic information processing, was the most active in the core panel, with the highest activity ratio in response to cisplatin in NSCLC, as expected. Related pathways include mismatch repair, nucleotide excision repair, aminoacyl-tRNA biosynthesis, and basal transcription factors, most of which respond to DNA double-strand damage in patients. Evaluation on two independent data sets demonstrated the predictive power of the core marker panel for cisplatin sensitivity in NSCLC. In addition, some single markers, e.g., MST1R, were remarkably predictive of cisplatin resistance in NSCLC. Conclusion: Integrative analysis is more powerful in detecting biomarkers for cisplatin efficacy because it overcomes data heterogeneity and low-value density in data sets, and the identified core panel (318 genes) can help develop personalized cisplatin chemotherapy for NSCLC patients.
Comprehensive Analysis of the Differentially Expressed Transcriptome with ceRNA Networks in a Mouse Model of Liver Cirrhosis
Authors: Yichi Zhang, Xinsheng Nie, Yanan Jiang, Lijuan Wang, Zhuzhi Wan, Hao Jin, Ronghui Pu, Meihui Liang, Hailong Zhang, Qi Liu, Yuan Chang, Yang Gao, Ningning Yang and Shizhu Jin
Background: Hepatic cirrhosis is the consequence of various chronic liver diseases for which there is no curative treatment. In this study, based on RNA sequencing (RNA-seq) and subsequent bioinformatic analysis, we aim to explore the biological function of non-coding RNAs (ncRNAs) in hepatic cirrhosis. Methods: Hepatic cirrhosis models were induced by intraperitoneal injection of carbon tetrachloride (CCl4). The transcriptome profile was acquired by RNA-seq, and the results were verified by quantitative real-time PCR (qRT-PCR). The competing endogenous RNA (ceRNA) networks were visualized with Cytoscape. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted. Results: The differentially expressed transcripts in liver cirrhosis comprised 2369 mRNAs, 374 lncRNAs, 91 circRNAs, and 242 miRNAs (|log2(fold change)|≥1 and P<0.05). The RNA-seq results were highly consistent with qRT-PCR validation of DEGs (four upregulated and four downregulated, including ENSMUSG00000047517, ENSMUST00000217449, novel-circ-001366, miR-383-5p, ENSMUSG00000078683, ENSMUST00000148206, novel-circ-001986 and miR-216a-5p). Based on ceRNA theory, a circRNA-lncRNA co-regulated ceRNA network was established. Enrichment analysis revealed potential key regulatory processes during the progression of liver cirrhosis. Conclusion: The present study comprehensively analyzed differentially expressed transcripts in CCl4-induced liver cirrhosis. Our findings explore gene signatures for the diagnosis and precise treatment of liver cirrhosis.
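The selection rule quoted in the abstract, |log2(fold change)|≥1 and P<0.05, can be written as a small filter. The sketch below is only an illustration of that threshold: the function names and the example records are hypothetical and are not data or code from the study.

```python
def is_differentially_expressed(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Apply the rule |log2(fold change)| >= 1 and P < 0.05."""
    return abs(log2_fc) >= fc_cutoff and p_value < p_cutoff

def split_by_direction(records):
    """Return (upregulated, downregulated) transcript IDs that pass the rule."""
    up, down = [], []
    for transcript_id, log2_fc, p_value in records:
        if is_differentially_expressed(log2_fc, p_value):
            (up if log2_fc > 0 else down).append(transcript_id)
    return up, down

# Hypothetical records: (transcript ID, log2 fold change, P value).
records = [
    ("transcript_a", 2.3, 0.001),   # passes, upregulated
    ("transcript_b", -1.0, 0.04),   # passes at the boundary, downregulated
    ("transcript_c", 0.8, 0.0001),  # fold change too small
    ("transcript_d", -3.1, 0.20),   # not significant
]
up, down = split_by_direction(records)
print(up, down)
```

Note that the fold-change bound is inclusive while the P-value bound is strict, matching the inequality signs in the abstract.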
A Simple and Practical microRNA-based Nomogram to Predict Metastatic HCC
Authors: Yong Zhu, Yusheng Jie, Yuankai Wu, Wenting Tang, Jing Cao, Zhongzhen Su, Zhenjian Zhuo, Jiao Gong and Yutian Chong
Background: Despite the unprecedented scientific progress achieved over the years, there is no established microRNA-based model for predicting hepatocellular carcinoma (HCC) metastasis. To this end, we aimed to develop a simple model based on the expression of miRNAs to identify patients at high risk of metastatic HCC. Methods: HCC datasets with metastasis data were acquired from the Gene Expression Omnibus (GEO) database, and samples were randomly divided into training (n=169) and validation (n=72) groups. Based on the expression of miRNAs in the training group, we developed a predictive nomogram for metastatic HCC. We evaluated its performance using the area under the receiver operating characteristic curve (AUC), calibration curve, decision curve and clinical impact curve analysis. Results: Least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression showed that the expression levels of miR-30c, miR-185, and miR-323 were independent predictors of metastatic HCC. These miRNAs were used to construct a nomogram that performed well in predicting metastasis in the training group (AUC 0.869 [95% CI 0.813-0.925], sensitivity 92.7%, specificity 57.8%) and the validation group (AUC 0.821 [95% CI 0.720-0.923], sensitivity 94.7%, specificity 60%). The calibration curve showed good agreement between actual and predicted outcomes. Decision curve analysis showed a high clinical net benefit of nomogram predictions for our patients. Moreover, higher total nomogram scores were associated with patient death. In addition, functional enrichment analysis showed that the predicted target genes of these 3 miRNAs correlated with tumor metastasis-associated terms, such as filopodium, and identified their related hub genes. Conclusions: Our easy-to-use nomogram could assist in identifying HCC patients at high risk of metastasis, providing valuable information for clinical treatment.
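For readers less familiar with the sensitivity and specificity figures reported above, the following sketch shows how the two quantities are computed from binary labels and predictions. The labels and predictions are made-up toy values, not data from the study, and `sensitivity_specificity` is a hypothetical helper name.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels (1 = metastatic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)   # fraction of true positives detected
    specificity = tn / (tn + fp)   # fraction of true negatives detected
    return sensitivity, specificity

# Hypothetical toy labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

A model with high sensitivity and moderate specificity, as reported for the nomogram, misses few metastatic cases at the cost of more false alarms among non-metastatic patients.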
AthEDL: Identifying Enhancers in Arabidopsis thaliana Using an Attention-based Deep Learning Method
Authors: Yiqiong Chen, Yujia Gao, Hejie Zhou, Yanming Zuo, Youhua Zhang and Zhenyu Yue
Background: Enhancers are key cis-acting DNA elements that are crucial to gene regulation and promoter function in eukaryotic cells. Accurate identification of enhancers would facilitate the understanding of DNA functions and their physiological roles. Previous studies have shown the effectiveness of computational methods for identifying enhancers in other organisms. To date, a huge number of enhancers remain unknown, especially in plant species. Objective: This study aims to build an efficient attention-based neural network model for the identification of Arabidopsis thaliana enhancers. Methods: A sequence-based model using convolutional and recurrent neural networks was proposed for the identification of enhancers. The input DNA sequences are represented as feature vectors using 4-mers. The model uses a CNN and a Bi-RNN as sequence feature extractors, and an attention mechanism is added to improve prediction performance. Results: We performed an ablation study on the validation set to select the model and evaluate its effectiveness. Moreover, our model showed remarkable performance on the test set, achieving an MCC of 0.955, an AUPRC of 0.638, and an AUROC of 0.837, which are significantly higher than those of state-of-the-art methods. Conclusion: The proposed computational framework aims at solving similar problems in non-coding genomic regions, thereby providing valuable insights into the prediction of plant enhancers.
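The abstract states that input sequences are represented using 4-mers but does not give the details. The sketch below shows one common way to do this, assuming overlapping 4-mers with a stride of 1 mapped to integer indices over the alphabet A, C, G, T. The stride, the index ordering, and the names `kmer_tokens` and `encode` are assumptions for illustration and may differ from the authors' implementation.

```python
from itertools import product

K = 4
# Assumed vocabulary: all 4^4 = 256 possible 4-mers in lexicographic order.
KMER_INDEX = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

def kmer_tokens(seq, k=K):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def encode(seq):
    """Map each 4-mer to its integer index, skipping k-mers with other symbols (e.g. N)."""
    return [KMER_INDEX[t] for t in kmer_tokens(seq) if t in KMER_INDEX]

# Hypothetical example sequence.
tokens = kmer_tokens("ACGTA")
indices = encode("ACGTA")
print(tokens, indices)
```

The resulting index sequence could then be passed to an embedding layer ahead of the convolutional and recurrent feature extractors described in the abstract.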
Chronological Order Based Wrapper Technique for Drug-Target Interaction Prediction (CO-WT DTI)
Authors: Kavipriya Gananathan, Manjula Dhanabalachandran and Vijayan Sugumaran
Background: Drug-Target Interactions (DTIs) are used to suggest new medications for diseases or to reuse existing drugs for other diseases, since experimental procedures take years to complete and FDA (Food and Drug Administration) approval is required before drugs can be marketed. Objective: Computational methods are favoured over wet-lab experiments in drug analysis because the latter are tedious, time-consuming, and costly. Interactions between drugs and targets are identified computationally, paving the way for the discovery of unknown drug-target interactions for numerous diseases. Methods: This paper presents a Chronological Order-based Wrapper Technique for Drug-Target Interaction prediction (CO-WT DTI) to discover novel DTIs. In the proposed approach, drug features and protein features are obtained by three feature extraction techniques, and dimensionality reduction is applied to remove unfavourable features. Class imbalance is handled by balancing methods. Results: The proposed approach was validated on four widely used benchmark datasets, namely GPCR (G protein-coupled receptors), enzymes, nuclear receptors, and ion channels. Prediction performance was evaluated with Leave-One-Out Cross-Validation (LOOCV), and the approach outperforms other state-of-the-art methods on the AUC (area under the Receiver Operating Characteristic (ROC) curve) metric. Conclusion: The performance of the feature extraction, balancing, dimensionality reduction, and classification steps suggests ways to contribute data to the development of new drugs. It is anticipated that our model will help refine subsequent explorations, especially in the drug-target interaction domain.