A Method of Enhancing Heterogeneous Graph Representation for Predicting the Associations between lncRNAs and Diseases
- 12 May 2024
- 05 Aug 2024
- 06 Nov 2024
Abstract
Long non-coding RNAs (lncRNAs) are a class of long RNA transcripts that lack protein-coding ability. Although they are not translated into proteins, studies have shown that they perform essential regulatory functions in cells, regulating gene expression and cellular biological processes. However, determining the associations between lncRNAs and diseases through biological experiments is both costly and inefficient. Therefore, there is an urgent need for convenient and fast computational methods that predict lncRNA-disease associations (LDAs) more efficiently.
Predicting disease-associated lncRNAs can help explore the mechanisms by which lncRNAs act in diseases, which is crucial for early intervention and treatment.
In this paper, we propose an enhanced heterogeneous graph representation method for predicting LDAs, named GCGALDA. GCGALDA first obtains the topological structure features of nodes through a biased random walk. Building on these features, an attention mechanism weights each node's neighbors to further mine the semantic associations between nodes in the graph data. Then, a graph convolutional network (GCN) propagates the neighborhood features to the central node and combines them with the node's own features, so that the final node representation contains both structural information and semantic association information. Finally, a multilayer perceptron (MLP) produces the association score between an lncRNA and a disease.
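The first two stages described above can be illustrated with a minimal sketch. It is not the GCGALDA implementation: the toy graph, the node2vec-style bias parameters `p` and `q`, the visit-frequency features, the dot-product attention scores, and the equal-weight combination with the node's own features are all assumptions made for illustration, and the trained GCN weights and MLP scorer are omitted.

```python
import math
import random

# Hypothetical toy heterogeneous graph: "l*" are lncRNAs, "d*" are diseases.
GRAPH = {
    "l1": ["d1", "l2"],
    "l2": ["l1", "d1", "d2"],
    "d1": ["l1", "l2", "d2"],
    "d2": ["l2", "d1"],
}


def biased_walk(graph, start, length, p=1.0, q=0.5, rng=None):
    """Second-order biased walk (node2vec-style; p and q are assumed parameters)."""
    if rng is None:
        rng = random.Random(0)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = graph[cur]
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for n in nbrs:
            if n == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif n in graph[prev]:
                weights.append(1.0)      # stay near the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


def walk_features(graph, walks_per_node=20, length=5, seed=0):
    """Topological features: normalized visit frequencies over walks from each node."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    index = {n: i for i, n in enumerate(nodes)}
    feats = {}
    for n in nodes:
        counts = [0.0] * len(nodes)
        for _ in range(walks_per_node):
            for visited in biased_walk(graph, n, length, rng=rng):
                counts[index[visited]] += 1.0
        total = sum(counts)
        feats[n] = [c / total for c in counts]
    return feats


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(graph, feats, node):
    """Weight neighbors by softmax(dot product), then average with the node's own features."""
    nbrs = graph[node]
    scores = [sum(a * b for a, b in zip(feats[node], feats[n])) for n in nbrs]
    alphas = softmax(scores)
    dim = len(feats[node])
    agg = [sum(al * feats[n][k] for al, n in zip(alphas, nbrs)) for k in range(dim)]
    # Equal-weight combination is an assumption; a GCN would learn this mixing.
    return [0.5 * (own + nb) for own, nb in zip(feats[node], agg)]


features = walk_features(GRAPH)
print(attention_aggregate(GRAPH, features, "l1"))
```

In a full model, the aggregated vectors for an lncRNA and a disease would be passed through learned GCN layers and an MLP to obtain the association score. Here the aggregation simply shows how attention weights shift a node's representation toward its most similar neighbors.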
Experimental results show that GCGALDA achieves higher prediction accuracy than other advanced models on publicly available databases. In addition, case studies on several human diseases further confirm the predictive ability of GCGALDA.
In conclusion, the proposed GCGALDA model extracts multi-perspective features, such as topology, semantic association, and node attributes, obtains high-quality heterogeneous graph node representations, and effectively improves the performance of the LDA prediction model.