Volume 18, Issue 7

Current Bioinformatics - Volume 18, Issue 7, 2023

- Convolutional Neural Networks: A Promising Deep Learning Architecture for Biological Sequence Analysis
  
  Authors: Chinju John, Jayakrushna Sahoo, Manu Madhavan and Oommen K. Mathew
  
  https://doi.org/10.2174/1574893618666230320103421
  
  Deep learning is opening up capabilities once considered beyond the reach of human intelligence. Recently, it has made inroads into biological data analysis to handle the diverse patterns of data derived from biomolecules. Convolutional neural networks, among the most widely used and effective deep learning architectures, can uncover hidden patterns in these data, especially in biological sequences. These neural network variants outperform traditional bioinformatics tools on long-standing tasks associated with such sequences. This work introduces the basics of convolutional neural network architecture and how it can be applied to biological sequence analysis. The approach followed in this paper gives the reader a clearer view of convolutional neural networks, their basic working principles, and how they apply to biological sequences. A detailed view of the critical steps involved in deep learning, from data preprocessing, architecture design, model training, and hyperparameter tuning to evaluation metrics, is presented. A comparative analysis of convolutional neural network architectures developed for protein family classification is also discussed. This review contributes to understanding the concepts behind deep learning architectures and their applications in biological sequence analysis. It can substantially lower the barrier of limited knowledge of deep learning concepts and their implementation, especially for researchers working in pure biology.
  

- Advances in Peptide/Protein Structure Prediction Tools and their Relevance for Structural Biology in the Last Decade
  
  Authors: Samilla B. Rezende, Lucas R. Lima, Maria L. R. Macedo, Octávio L. Franco and Marlon H. Cardoso
  
  https://doi.org/10.2174/1574893618666230412080702
  
  Peptides and proteins are involved in several biological processes at the molecular level. In this context, three-dimensional structure characterization and determination of peptides and proteins have helped researchers unravel the chemical and biological roles of these macromolecules. For over 50 years, peptide and protein structures have been determined by experimental methods, including nuclear magnetic resonance (NMR), X-ray crystallography, and cryo-electron microscopy (cryo-EM). Consequently, an increasing number of atomic coordinates for peptides and proteins have been deposited in public databases, assisting the development of computational tools for predicting unknown 3D structures. In the last decade, a race for innovative methods has arisen in the computational sciences, including more complex biological activity and structure prediction algorithms. As a result, peptide/protein theoretical models have achieved a new level of structure prediction accuracy compared with experimentally determined structures. Machine learning and deep learning approaches, for instance, incorporate fundamental aspects of peptide/protein geometry and include physical/biological knowledge about these macromolecules' experimental structures to build more precise computational models. Additionally, computational strategies have helped structural biology, including comparative, threading, and ab initio modeling and, more recently, prediction tools based on machine learning and deep learning. Bearing this in mind, here we provide a retrospective of protein and peptide structure prediction tools, highlighting their advances and obstacles and how they have assisted researchers in answering crucial biological questions.
  

- Machine Learning Applications in the Study of Parkinson’s Disease: A Systematic Review
  
  Authors: Jordi Martorell-Marugán, Marco Chierici, Sara Bandres-Ciga, Giuseppe Jurman and Pedro Carmona-Sáez
  
  https://doi.org/10.2174/1574893618666230406085947
  
  Background: Parkinson’s disease is a common neurodegenerative disorder that has been studied from multiple perspectives using several data modalities. Given the size and complexity of these data, machine learning has emerged as a useful approach to analyze them for different purposes. These methods have been successfully applied in a broad range of applications, including the diagnosis of Parkinson’s disease and the assessment of its severity. In recent years, the number of published articles that used machine learning methodologies to analyze data derived from Parkinson’s disease patients has grown substantially. Objective: Our goal was to perform a comprehensive systematic review of the studies that applied machine learning to Parkinson’s disease data. Methods: We extracted published articles in PubMed, SCOPUS and Web of Science until March 15, 2022. After selection, we included 255 articles in this review. Results: We classified the articles by data type and summarized their characteristics, such as outcomes of interest, main algorithms, sample size, sources of data and model performance. Conclusion: This review summarizes the main advances in the use of machine learning methodologies for the study of Parkinson’s disease, as well as the increasing interest of the research community in this area.
  

- EpiSemble: A Novel Ensemble-based Machine-learning Framework for Prediction of DNA N6-methyladenine Sites Using Hybrid Features Selection Approach for Crops
  
  Authors: Dipro Sinha, Tanwy Dasmandal, Md Yeasin, Dwijesh C. Mishra, Anil Rai and Sunil Archak
  
  https://doi.org/10.2174/1574893618666230316151648
  
  Aim: The study aimed to develop a robust and more precise 6mA methylation prediction tool that assists researchers in studying the epigenetic behaviour of crop plants. Background: N6-methyladenine (6mA) is one of the predominant epigenetic modifications involved in a variety of biological processes in all three kingdoms of life. While in vitro approaches are more precise in detecting epigenetic alterations, they are resource-intensive and time-consuming. Artificial intelligence-based in silico methods have helped overcome these bottlenecks. Methods: A novel machine learning framework was developed by combining four techniques: ensemble machine learning, a hybrid approach for feature selection, the addition of features such as the Average Mutual Information Profile (AMIP), and bootstrap samples. In this study, four different feature sets, namely di-nucleotide frequency, GC content, AMIP, and nucleotide chemical properties, were chosen for the vectorization of DNA sequences. Nine machine learning models, including support vector machine, random forest, k-nearest neighbor, artificial neural network, multiple logistic regression, decision tree, naïve Bayes, AdaBoost, and gradient boosting, were trained on the relevant features extracted through the feature selection module. The three best-performing models were selected and a robust ensemble model was developed to predict sequences with 6mA sites. Results: EpiSemble, a novel ensemble model, was developed for the prediction of 6mA methylation sites. Using the new model, an improvement in accuracy of 7.0%, 3.74%, and 6.65% was achieved over existing models for the RiceChen, RiceLv, and Arabidopsis datasets, respectively. An R package, EpiSemble, based on the new model was developed and made available at https://cran.r-project.org/web/packages/EpiSemble/index.html. Conclusion: The EpiSemble model added AMIP as a novel feature and integrated feature selection modules, bootstrapping of samples, and an ensemble technique to achieve more accurate prediction of 6mA sites in plants. To our knowledge, this is the first R package developed for predicting epigenetic sites of genomes in crop plants, which is expected to help plant researchers in their future explorations.
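
Two of the four feature sets named in the abstract above, di-nucleotide frequency and GC content, are simple enough to illustrate. The following is a minimal Python sketch of how a DNA sequence might be turned into such a feature vector. It is not the EpiSemble R package's code or API: the function names (`gc_content`, `dinucleotide_frequencies`, `vectorize`) are hypothetical, and the handling of ambiguous bases is an assumption. AMIP and nucleotide chemical properties are omitted.

```python
from itertools import product
from typing import List

# The 16 possible dinucleotides in a fixed order (AA, AC, AG, AT, CA, ...).
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def gc_content(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_frequencies(seq: str) -> List[float]:
    """Relative frequency of each overlapping dinucleotide.

    Assumption: windows containing ambiguous bases (e.g. N) are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def vectorize(seq: str) -> List[float]:
    """Concatenate the 16 dinucleotide frequencies with GC content."""
    return dinucleotide_frequencies(seq) + [gc_content(seq)]


# Toy sequence for illustration only (not taken from any dataset in the paper).
example = vectorize("ACGT")
```

Each sequence maps to a fixed-length vector of 17 numbers here, so sequences of different lengths can be fed to the same classifier.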
  

- Survival Prediction of Esophageal Squamous Cell Carcinoma Based on the Prognostic Index and Sparrow Search Algorithm-Support Vector Machine
  
  Authors: Yanfeng Wang, Wenhao Zhang, Yuli Yang, Junwei Sun and Lidong Wang
  
  https://doi.org/10.2174/1574893618666230419084754
  
  Aim: Esophageal squamous cell carcinoma (ESCC) is among the cancers with the highest incidence and mortality worldwide; recent studies show that its incidence is rising while its mortality rate remains high. An effective survival prediction model can assist physicians in treatment decisions and improve patients' quality of survival. Introduction: In this study, an ESCC prognostic index and a survival prediction model based on blood indicators and TNM staging information are developed, and their effectiveness is analyzed. Methods: Kaplan-Meier survival analysis and Cox regression analysis are used to find factors that are significantly associated with patient survival. Binary logistic regression is used to construct a prognostic index (PI) for ESCC. Based on the sparrow search algorithm (SSA) and support vector machine (SVM), a survival prediction model for patients with ESCC is established. Results: Eight factors significantly associated with patient survival are selected by Kaplan-Meier survival analysis and Cox regression analysis. The PI is divided into four stages, which reasonably reflect the survival condition of different patients. Compared with four existing models, the sparrow search algorithm-support vector machine (SSA-SVM) proposed in this paper has higher prediction accuracy. Conclusion: To predict the five-year survival rate of patients with ESCC accurately and effectively, a survival prediction model based on Kaplan-Meier survival analysis, Cox regression analysis, binary logistic regression and support vector machine is proposed in this paper. The results show that the proposed method can accurately predict the five-year survival rate of ESCC patients.
  

- Predicting Herb-disease Associations Through Graph Convolutional Network
  
  Authors: Xuan Hu, You Lu, Geng Tian, Pingping Bing, Bing Wang and Binsheng He
  
  https://doi.org/10.2174/1574893618666230504143647
  
  Background: In recent years, herbs have become very popular worldwide as a form of complementary and alternative medicine (CAM). However, there are many types of herbs and diseases, and their associations are impossible to reveal fully. Identifying new therapeutic indications of herbs, that is, drug repositioning, is a critical supplement to new drug development. Because exploring the associations between herbs and diseases by wet-lab techniques is time-consuming and laborious, there is an urgent need for reliable computational methods to fill this gap. In this study, we first preprocessed the herbs and their indications in the TCM-Suit database, a comprehensive, accurate, and integrated traditional Chinese medicine database, to obtain the herb-disease association network. We then proposed a novel model based on a graph convolutional network (GCN) to infer potential new associations between herbs and diseases. Methods: In our method, the effective features of herbs and diseases were extracted through a multi-layer GCN; a layer attention mechanism was then introduced to combine the features learned from multiple GCN layers, and jump connections were added to reduce the over-smoothing caused by stacking multiple GCN layers. Finally, the recovered herb-disease association network was generated by a bilinear decoder. We applied our model together with four other methods (SCMFDD, BNNR, LRMCMDA, and DRHGCN) to predict herb-disease associations. Compared with all other methods, our model showed the highest area under the receiver operating characteristic curve (AUROC), the highest area under the precision-recall curve (AUPRC), and the highest recall in five-fold cross-validation. Conclusion: We further used our model to predict candidate herbs for Alzheimer's disease and identified the compounds mediating herbs and diseases through the herb-compound-gene-disease network. The relevant literature also confirmed our findings.
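
To make the pipeline described in the abstract above more concrete, the following Python sketch shows one GCN propagation step followed by a bilinear decoder on a toy herb-disease graph. It is an illustration under stated assumptions, not the authors' model: the symmetric normalization with self-loops is the commonly used GCN formulation and is not specified in the abstract, the graph, weights, and all function names are hypothetical, and the layer attention mechanism and jump connections are omitted.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Return D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def gcn_layer(a_norm: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One propagation step: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


def bilinear_score(h_herb: List[float], relation: Matrix, h_disease: List[float]) -> float:
    """Association probability: sigmoid(h_herb^T R h_disease)."""
    left = matmul([h_herb], relation)[0]
    logit = sum(x * y for x, y in zip(left, h_disease))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy graph (hypothetical): nodes 0-1 are herbs, nodes 2-3 are diseases.
# Known associations: herb 0-disease 2, herb 1-disease 2, herb 1-disease 3.
adj = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
features = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # one-hot inputs
weights = [[0.5, 0.1], [0.2, 0.4], [0.3, 0.3], [0.1, 0.6]]  # arbitrary, untrained
relation = [[1.0, 0.0], [0.0, 1.0]]  # identity stands in for a learned decoder matrix

a_norm = normalize_adjacency(adj)
embeddings = gcn_layer(a_norm, features, weights)
# Score the unobserved pair (herb 0, disease 3).
score = bilinear_score(embeddings[0], relation, embeddings[3])
```

In a trained model, `weights` and `relation` would be learned from the known associations, and high scores on unobserved pairs would be read as candidate herb-disease associations.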
  

Most Cited

- A Review of Ensemble Methods in Bioinformatics
  
  Authors: Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert Y. Zomaya
- Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis
  
  Authors: Masahiro Sugimoto, Masato Kawakami, Martin Robert, Tomoyoshi Soga and Masaru Tomita
- Distance-based Support Vector Machine to Predict DNA N6-methyladenine Modification
  
  Authors: Haoyu Zhang, Quan Zou, Ying Ju, Chenggang Song and Dong Chen
- A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods
  
  Authors: Jun Zhang and Bin Liu
- Molecular Genetic Markers: Discovery, Applications, Data Storage and Visualisation
  
  Authors: Chris Duran, Nikki Appleby, David Edwards and Jacqueline Batley
- A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization
  
  Authors: Wuritu Yang, Xiao-Juan Zhu, Jian Huang, Hui Ding and Hao Lin
- Cancer Diagnosis Through IsomiR Expression with Machine Learning Method
  
  Authors: Zhijun Liao, Dapeng Li, Xinrui Wang, Lisheng Li and Quan Zou
- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
- The Advances and Challenges of Deep Learning Application in Biological Big Data Processing
  
  Authors: Li Peng, Manman Peng, Bo Liao, Guohua Huang, Weibiao Li and Dingfeng Xie
- Gene Expression Profile Classification: A Review
  
  Authors: Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan