Volume 19, Issue 10

Current Bioinformatics - Volume 19, Issue 10, 2024

Life Sciences, Systems Biology & Bioinformatics, Biochemical Research Methods, Mathematical & Computational Biology
- Advances in Deep Learning Assisted Drug Discovery Methods: A Self-review
    
    Authors: Haiping Zhang and Konda Mani Saravanan
    
    https://doi.org/10.2174/0115748936285690240101041704
    
    Artificial Intelligence is a field within computer science that endeavors to replicate the intricate structures and operational mechanisms inherent in the human brain. Machine learning is a subfield of artificial intelligence that focuses on developing models by analyzing training data. Deep learning is a distinct subfield within artificial intelligence, characterized by models that depict geometric transformations across multiple layers. Deep learning has shown significant promise in various domains, including health and life sciences, and has recently been applied successfully in drug discovery. In this self-review, we present recent methods developed with the aid of deep learning. The objective is to give a brief overview of the present cutting-edge advancements in drug discovery from our group. We systematically discuss experimental evidence and proof-of-concept examples for the deep learning-based models we developed, such as DeepBindBC, DeepPep, and DeepBindRG. These developments not only shed light on the existing challenges but also emphasize the achievements and prospects for future progress in drug discovery and development.
    
- A-RFP: An Adaptive Residue Flexibility Prediction Method Improving Protein-ligand Docking Based on Homologous Proteins
  
  Authors: Chuqi Lei, Senbiao Fang, Yaohang Li, Fei Guo and Min Li
  
  https://doi.org/10.2174/0115748936258790240101062642
  
  Background
  Computational molecular docking plays an important role in determining the precise receptor-ligand conformation, making it a powerful tool for drug discovery. In the past 30 years, most computational docking methods have treated the receptor structure as a rigid body, although flexible docking often yields higher accuracy. The main disadvantage of flexible docking is its significantly higher computational cost. Because different protein pocket residues exhibit different degrees of flexibility, semi-flexible docking methods, which balance rigid and flexible docking, have succeeded in predicting highly accurate conformations at a relatively low computational cost.
  Methods
  In our study, the number of flexible pocket residues was assessed by quantitative analysis, and a novel adaptive residue flexibility prediction method, named A-RFP, was proposed to improve the docking performance. Based on the homologous information, a joint strategy is used to predict the pocket residue flexibility by combining RMSD, the distance between the residue sidechain and the ligand, and the sidechain orientation. For each receptor-ligand pair, A-RFP provides a docking conformation with the optimal affinity.
  Results
  By analyzing the docking affinities of 3507 target-ligand pairs at 5 different numbers of flexible residues ranging from 0 to 10, we found a general trend that a larger number of flexible residues tends to improve the docking results obtained with AutoDock Vina. However, a certain number of counterexamples still exist. To validate the effectiveness of A-RFP, we ran a small-scale virtual screening on 5 proteins, which confirmed that A-RFP could enhance the docking performance. In addition, flexible-receptor virtual screening on a low-similarity dataset of 85 receptors validated the accuracy of the comprehensive residue flexibility evaluation. Moreover, we studied three receptors with FDA-approved drugs, which further showed that A-RFP can play a suitable role in ligand discovery.
  Conclusion
  Our analysis confirms that the screening performance obtained with different numbers of flexible residues varies widely across receptors. It suggests that a fine-grained docking method would offset this deficiency. Thus, we presented A-RFP, an adaptive pocket residue flexibility prediction method based on homologous information. When computational resources and time costs are not a constraint, A-RFP provides the optimal docking result.
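
As a rough illustration of one ingredient named in the A-RFP abstract above, the sketch below computes the RMSD between two sets of sidechain atom coordinates, for example the same pocket residue taken from two homologous structures. The function name `sidechain_rmsd`, the coordinate layout, and the assumption that the structures are already superimposed with atoms in matching order are all hypothetical; the abstract does not say how A-RFP computes or combines its RMSD term.

```python
import math


def sidechain_rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) atom coordinates.

    Assumes both structures are already superimposed and that atoms are
    listed in the same order; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates: the second copy is shifted by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
homolog = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
print(sidechain_rmsd(reference, homolog))
```

A larger value would suggest that the residue moves more between homologous structures; how such a value is thresholded or weighted against the other two criteria is left open here.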
  
- STNMDA: A Novel Model for Predicting Potential Microbe-Drug Associations with Structure-Aware Transformer
  
  Authors: Liu Fan, Xiaoyu Yang, Lei Wang and Xianyou Zhu
  
  https://doi.org/10.2174/0115748936272939231212102627
  
  Introduction
  Microbes are intimately involved in the physiological and pathological processes of numerous diseases. There is a critical need for new drugs to combat microbe-induced diseases in clinical settings. Predicting potential microbe-drug associations is, therefore, essential for both disease treatment and novel drug discovery. However, it is costly and time-consuming to verify these relationships through traditional wet lab approaches.
  Methods
  We proposed an efficient computational model, STNMDA, that integrated a Structure-Aware Transformer (SAT) with a Deep Neural Network (DNN) classifier to infer latent microbe-drug associations. STNMDA began with a “random walk with restart” approach to construct a heterogeneous network using Gaussian kernel similarity and functional similarity measures for microorganisms and drugs. This heterogeneous network was then fed into the SAT to extract attribute features and graph structures for each drug and microbe node. Finally, the DNN classifier calculated the probability of associations between microbes and drugs.
  Results
  Extensive experimental results showed that STNMDA surpassed existing state-of-the-art models in performance on the MDAD and aBiofilm databases. In addition, the feasibility of STNMDA in confirming associations between microbes and drugs was demonstrated through case validations.
  Conclusion
  Hence, STNMDA showed promise as a valuable tool for future prediction of microbe-drug associations.
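
The STNMDA abstract above mentions Gaussian kernel similarity without giving a formula. A minimal sketch of one widely used variant, the Gaussian interaction profile kernel computed from binary association profiles, is shown below. The function name, the `bandwidth` parameter, and the choice of normalising by the mean squared profile norm are assumptions for illustration, not details confirmed by the abstract.

```python
import math


def gaussian_kernel_similarity(profiles, bandwidth=1.0):
    """Pairwise Gaussian kernel similarity between association profiles.

    `profiles[i]` is a hypothetical binary vector marking which drugs
    (or microbes) entity i is known to be associated with.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one profile must be non-zero")
    gamma = bandwidth / mean_sq_norm  # kernel width scaled by profile density

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [math.exp(-gamma * squared_distance(profiles[i], profiles[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy example: microbes 0 and 1 share identical profiles, microbe 2 differs.
profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
similarity = gaussian_kernel_similarity(profiles)
print(similarity[0][1], similarity[0][2])
```

Identical profiles receive a similarity of 1, and the value decays toward 0 as profiles diverge, which is what makes the matrix usable as edge weights in a similarity network.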
  
- Genotype and Phenotype Association Analysis Based on Multi-omics Statistical Data
  
  Authors: Xinpeng Guo, Yafei Song, Dongyan Xu, Xueping Jin and Xuequn Shang
  
  https://doi.org/10.2174/0115748936276861240109045208
  
  Background
  Multi-omics analysis of clinical data is hampered by an insufficient number of omics data types and relatively small sample sizes, which result from patient privacy protection and institutional data management requirements, and by the relatively large number of features in each omics dataset. This paper describes the analysis of multi-omics pathway relationships using statistical data in the absence of clinical data.
  Methods
  We proposed a novel approach to exploit easily accessible statistics in public databases. This approach introduces phenotypic associations that are not included in the clinical data and uses these data to build a three-layer heterogeneous network. To simplify the analysis, we decomposed the three-layer network into two two-layer networks to predict the weights of the inter-layer associations. The weights of the two networks were merged using a hyperparameter β, and k-fold cross-validation was then used to evaluate the accuracy of the method. In calculating the weights of the two-layer networks, the RWR with fixed restart probability was combined with PBMDA and CIPHER to generate the PCRWR with biased weights and improved accuracy.
  Results
  The area under the receiver operating characteristic curve was increased by approximately 7% in the case of the RWR with initial weights.
  Conclusion
  Multi-omics statistical data were used to establish genotype and phenotype correlation networks for analysis, which was similar to the effect of clinical multi-omics analysis.
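
For readers unfamiliar with the RWR (random walk with restart) step named in the abstract above, the following is a minimal sketch of the plain algorithm with a fixed restart probability on a small undirected graph. It does not reproduce the PCRWR variant or its combination with PBMDA and CIPHER; the function name, the default `restart` value, and the toy graph are assumptions for illustration.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker that restarts at `seed`.

    `adjacency` is a square list-of-lists weight matrix; `restart` is the
    fixed probability of jumping back to the seed node at every step.
    """
    n = len(adjacency)
    # Column-normalise so that each column holds transition probabilities.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    current = start[:]
    for _ in range(max_iter):
        updated = [
            (1 - restart) * sum(transition[i][j] * current[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        # Stop once the probability vector no longer changes (L1 norm).
        if sum(abs(u - c) for u, c in zip(updated, current)) < tol:
            return updated
        current = updated
    return current


# Toy path graph 0 - 1 - 2 with the walk seeded at node 0.
graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = random_walk_with_restart(graph, seed=0)
print(scores)
```

The resulting scores rank nodes by proximity to the seed, so in a two-layer network they can serve as association weights between the seed and candidate nodes.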
  
- Enhancing Drug-Target Binding Affinity Prediction through Deep Learning and Protein Secondary Structure Integration
  
  Authors: Runhua Zhang, Baozhong Zhu, Tengsheng Jiang, Zhiming Cui and Hongjie Wu
  
  https://doi.org/10.2174/0115748936285519240110070209
  
  Background
  Conventional approaches to drug discovery are often characterized by lengthy and costly processes. To expedite the discovery of new drugs, the integration of artificial intelligence (AI) in predicting drug-target binding affinity (DTA) has emerged as a crucial approach. Despite the proliferation of deep learning methods for DTA prediction, many of these methods concentrate primarily on the amino acid sequence of proteins. Yet the interactions between drug compounds and targets occur in distinct segments of the protein structure, whereas the primary sequence mainly captures global protein features. Consequently, the sequence alone falls short of fully elucidating the intricate relationship between drugs and their respective targets.
  Objective
  This study aims to employ advanced deep-learning techniques to forecast DTA while incorporating information about the secondary structure of proteins.
  Methods
  In our research, both the primary sequence and the secondary structure of the protein were leveraged for protein representation. The primary sequence served as the global feature, while the secondary structure was employed as the localized feature. Convolutional neural networks and graph neural networks were utilized to independently model the intricate features of target proteins and drug compounds. This approach enhanced our ability to capture drug-target interactions more effectively.
  Results
  We have introduced a novel method for predicting DTA. In comparison to DeepDTA, our approach demonstrates significant enhancements, achieving a 3.9% increase in the Concordance Index (CI) and a remarkable 34% reduction in Mean Squared Error (MSE) when evaluated on the KIBA dataset.
  Conclusion
  In conclusion, our results unequivocally demonstrate that augmenting DTA prediction with the inclusion of the protein's secondary structure as a localized feature yields significantly improved accuracy compared to relying solely on the primary structure.
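
The two metrics quoted in the abstract above, the Concordance Index (CI) and Mean Squared Error (MSE), can be sketched in plain Python as follows. This uses the common pairwise definition of CI in which tied predictions count as half; whether the authors used exactly this variant is an assumption, and the affinity values are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; tied predictions score 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(y_true)):
        for j in range(i + 1, len(y_true)):
            if y_true[i] == y_true[j]:
                continue
            comparable += 1
            true_diff = y_true[i] - y_true[j]
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


# Hypothetical affinities: only the last two predictions are in the wrong order.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.0]
print(concordance_index(measured, predicted), mean_squared_error(measured, predicted))
```

CI rewards correct ranking regardless of scale, while MSE penalises absolute error, which is why DTA papers typically report both.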
  
- Sia-m7G: Predicting m7G Sites through the Siamese Neural Network with an Attention Mechanism
  
  Authors: Jia Zheng and Yetong Zhou
  
  https://doi.org/10.2174/0115748936285540240116065719
  
  Background
  The chemical modification of RNA plays a crucial role in many biological processes. N7-methylguanosine (m7G), being one of the most important epigenetic modifications, plays an important role in gene expression, processing metabolism, and protein synthesis. Detecting the exact location of m7G sites in the transcriptome is key to understanding their relevant mechanism in gene expression. On the basis of experimentally validated data, several machine learning or deep learning tools have been designed to identify internal m7G sites and have shown advantages over traditional experimental methods in terms of speed, cost-effectiveness and robustness.
  Aims
  In this study, we aim to develop a computational model to help predict the exact location of m7G sites in humans.
  Objective
  Simple and advanced encoding methods and deep learning networks are designed to achieve excellent m7G prediction efficiently.
  Methods
  Three types of feature extractions and six classification algorithms were tested to identify m7G sites. Our final model, named Sia-m7G, adopts one-hot encoding and a delicate Siamese neural network with an attention mechanism. In addition, multiple 10-fold cross-validation tests were conducted to evaluate our predictor.
  Results
  Sia-m7G achieved the highest sensitivity, specificity and accuracy on 10-fold cross-validation tests compared with the other six m7G predictors. Nucleotide preference and model visualization analyses were conducted to strengthen the interpretability of Sia-m7G and provide a further understanding of m7G site fragments in genomic sequences.
  Conclusion
  Sia-m7G has significant advantages over other classifiers and predictors, which proves the superiority of the Siamese neural network algorithm in identifying m7G sites.
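
The one-hot encoding mentioned in the Sia-m7G abstract above can be sketched as follows. The four-letter alphabet ordering, the conversion of T to U, and the all-zero vector for unknown bases are assumptions chosen for illustration; the abstract does not state how the authors handle these details.

```python
ALPHABET = "ACGU"  # assumed channel order; any fixed order works


def one_hot_encode(sequence):
    """Encode an RNA sequence as a list of 4-element 0/1 vectors.

    DNA-style T is treated as U; bases outside the alphabet (such as N)
    become an all-zero vector.
    """
    normalised = sequence.upper().replace("T", "U")
    return [[1 if base == ref else 0 for ref in ALPHABET] for base in normalised]


# Hypothetical fragment, not taken from any m7G dataset.
print(one_hot_encode("GAUN"))
```

Each row has at most a single 1, so a fragment of length L becomes an L x 4 matrix that a neural network can consume directly.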
  
- Integrated Machine Learning Algorithms for Stratification of Patients with Bladder Cancer
  
  Authors: Yuanyuan He, Haodong Wei, Siqing Liao, Ruiming Ou, Yuqiang Xiong, Yongchun Zuo and Lei Yang
  
  https://doi.org/10.2174/0115748936288453240124082031
  
  Background
  Bladder cancer is a prevalent malignancy globally, characterized by rising incidence and mortality rates. Stratifying bladder cancer patients into different subtypes is crucial for the effective treatment of this form of cancer. Therefore, there is a need to develop a stratification model specific to bladder cancer.
  Purpose
  This study aims to establish a prognostic prediction model for bladder cancer, with the primary goal of accurately predicting prognosis and treatment outcomes.
  Methods
  We collected 10 bladder cancer datasets from the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), and the IMvigor210 dataset. Machine learning methods based on feature selection algorithms were used to generate 96 models for establishing a risk score for each patient. Based on the risk score, all patients were classified into two risk groups.
  Results
  The two groups of bladder cancer patients exhibited significant differences in prognosis, biological functions, and drug sensitivity. A nomogram model demonstrated that the risk score had a robust predictive effect with good clinical utility.
  Conclusion
  The risk score constructed in this study can be utilized to predict the prognosis, response to drug treatment, and immunotherapy of bladder cancer patients, providing assistance for personalized clinical treatment of bladder cancer.
  
- CFCN: An HLA-peptide Prediction Model based on Taylor Extension Theory and Multi-view Learning
  
  Authors: Bing Rao, Bing Han, Leyi Wei, Zeyu Zhang, Xinbo Jiang and Balachandran Manavalan
  
  https://doi.org/10.2174/0115748936299044240202100019
  
  Background
  With the continuing development of biotechnology, many approaches to treating cancer have been proposed. In recent years, neo-peptide-based methods have made significant contributions; an essential prerequisite for them is binding between peptides and HLA molecules. However, this binding is hard to predict, and prediction accuracy still needs to improve.
  Methods
  Therefore, we propose the Crossed Feature Correction Network (CFCN), a deep learning method that automatically extracts and adaptively learns the discriminative features in HLA-peptide binding in order to make more accurate predictions on HLA-peptide binding tasks. With its elaborate encoding and feature extraction process for peptides, as well as feature fusion between the fine-grained and coarse-grained levels, it shows many advantages on the given tasks.
  Results
  The experiments show that CFCN achieves better overall performance than other advanced models in many respects.
  Conclusion
  In addition, we consider using multi-view learning methods for the feature fusion process in order to uncover further relations among binding features. Finally, we package our model as a tool for further research on binding tasks.
  
Most Cited

- A Review of Ensemble Methods in Bioinformatics
  
  Authors: Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert Y. Zomaya
- Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis
  
  Authors: Masahiro Sugimoto, Masato Kawakami, Martin Robert, Tomoyoshi Soga and Masaru Tomita
- Distance-based Support Vector Machine to Predict DNA N6-methyladenine Modification
  
  Authors: Haoyu Zhang, Quan Zou, Ying Ju, Chenggang Song and Dong Chen
- A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods
  
  Authors: Jun Zhang and Bin Liu
- Molecular Genetic Markers: Discovery, Applications, Data Storage and Visualisation
  
  Authors: Chris Duran, Nikki Appleby, David Edwards and Jacqueline Batley
- A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization
  
  Authors: Wuritu Yang, Xiao-Juan Zhu, Jian Huang, Hui Ding and Hao Lin
- Cancer Diagnosis Through IsomiR Expression with Machine Learning Method
  
  Authors: Zhijun Liao, Dapeng Li, Xinrui Wang, Lisheng Li and Quan Zou
- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
- The Advances and Challenges of Deep Learning Application in Biological Big Data Processing
  
  Authors: Li Peng, Manman Peng, Bo Liao, Guohua Huang, Weibiao Li and Dingfeng Xie
- Gene Expression Profile Classification: A Review
  
  Authors: Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan