Volume 15, Issue 4

Current Bioinformatics - Volume 15, Issue 4, 2020


- Meet Our Editorial Board Member
  
  By Bing Niu
  
  https://doi.org/10.2174/157489361504200406092656

- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
  
  https://doi.org/10.2174/1574893615666191219094216
  
  Molecular docking positions computer-generated 3D structures of small ligands within a receptor structure in a variety of orientations, conformations and positions. The method is useful in drug discovery and medicinal chemistry because it provides insights into molecular recognition, and it has become an integral part of Computer-Aided Drug Design and Discovery (CADDD). Traditional docking methods are limited by their semi-flexible or static treatment of targets and ligands. Over the last decade, advances in computational methods, proteomics and genomics have led to docking methods that incorporate protein-ligand flexibility and the different binding conformations. Accounting for receptor flexibility gives more accurate binding pose predictions and a more rational depiction of the protein's binding interactions with the ligand. Protein flexibility has been included by generating protein ensembles or by dynamic docking methods. Dynamic docking considers solvation and entropic effects and fully explores drug-receptor binding and recognition from both energetic and mechanistic points of view. Although dynamic docking is computationally expensive for a fast-paced drug discovery program, it is increasingly used to screen large compound libraries and identify potential drugs. This review gives a quick introduction to the available docking methods and to their applications and limitations in drug discovery.
  

- Analysis and Comparison of RNA Pseudouridine Site Prediction Tools
  
  Authors: Wei Chen and Kewei Liu
  
  https://doi.org/10.2174/1574893614666191018171521
  
  Background: Pseudouridine (Ψ) is the most abundant RNA modification and has important functions in a series of biological and cellular processes. Although experimental techniques have contributed greatly to the identification of Ψ sites, they remain labor-intensive and cost-ineffective. In the past few years, a series of computational approaches have been developed that identify Ψ sites rapidly and efficiently. Results: To give readers a clear picture of recent developments in this important area, this review summarizes and compares representative computational approaches for identifying Ψ sites. Future directions in the computational identification of Ψ sites are also discussed. Conclusion: We anticipate that this review will provide novel insights into research on pseudouridine modification.
  

- Finding Community of Brain Networks Based on Neighbor Index and DPSO with Dynamic Crossover
  
  Authors: Jie Zhang, Junhong Feng and Fang-Xiang Wu
  
  https://doi.org/10.2174/1574893614666191017100657
  
  Background: Brain networks provide an effective way to analyze brain function and detect brain disease. Brain networks contain important neural unit modules that carry meaningful biological insights. Objective: We therefore need to find the optimal neural unit modules effectively and efficiently. Method: In this study, we propose a novel algorithm, abbreviated NIDPSO, that finds community modules of brain networks by combining a Neighbor Index with Discrete Particle Swarm Optimization (DPSO) and dynamic crossover. This study differs from existing ones in that NIDPSO is the first such method proposed for finding community modules of brain networks, and it does not need the number of communities to be predefined or estimated in advance. Results: We generate a neighbor index table to reduce ineffective searches and design a novel coding by which the community can be determined without computing the distances among vertices in brain networks. Furthermore, dynamic crossover and mutation operators are designed to modify NIDPSO so as to alleviate the premature convergence of DPSO. Conclusion: Numerical results on several resting-state functional MRI brain networks demonstrate that NIDPSO outperforms or is comparable with competing methods in terms of modularity, coverage and conductance.
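
Two of the evaluation metrics named in the abstract above, modularity and coverage, have standard definitions that can be sketched in a few lines. This is a minimal illustration for an undirected, unweighted graph, not the authors' NIDPSO code; the function names and the toy network are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of (L_c / m) - (d_c / (2m))**2, where m is the
    number of edges, L_c the number of edges inside c, and d_c the summed
    degree of the vertices in c.
    """
    m = len(edges)
    internal = defaultdict(int)  # L_c: edges with both ends in community c
    degree = defaultdict(int)    # d_c: summed degree of community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def coverage(edges, community_of):
    """Fraction of edges whose two endpoints lie in the same community."""
    inside = sum(1 for u, v in edges if community_of[u] == community_of[v])
    return inside / len(edges)


# Toy network (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, partition)
cov = coverage(edges, partition)
```

Putting every vertex in a single community gives a modularity of zero, which is why the metric rewards partitions with dense modules and few edges between them.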
  

- Predicting Protein Phosphorylation Sites Based on Deep Learning
  
  Authors: Haixia Long, Zhao Sun, Manzhi Li, Hai Y. Fu and Ming Cai Lin
  
  https://doi.org/10.2174/1574893614666190902154332
  
  Background: Protein phosphorylation is one of the most important Post-translational Modifications (PTMs) and occurs at the amino acid residues serine (S), threonine (T), and tyrosine (Y). It plays critical roles in protein structure and function prediction. With the development of novel high-throughput sequencing technologies, a huge number of protein sequences are being generated and stored in databases. Objective: It is of great importance in both basic research and drug development to predict quickly and accurately which S, T, or Y residues can be phosphorylated. Methods: To solve this problem, a novel hybrid deep learning model combining a convolutional neural network with a bi-directional long short-term memory recurrent neural network (CNN+BLSTM) is proposed for predicting phosphorylation sites in proteins. The model consists of a sequence of layers that transform the input data into an output class: the convolution layer captures higher-level abstract features of amino acids, while the recurrent layer captures long-term dependencies between amino acids to improve predictions. The joint model learns interactions between higher-level features derived from the protein sequence to predict the phosphorylated sites. Results: We compared our model with two canonical methods, iPhos-PseEn and MusiteDeep. Five-fold cross-validation indicated that CNN+BLSTM outperforms the two competitors on evaluation metrics including the areas under the receiver operating characteristic and precision-recall curves, the Matthews correlation coefficient, F-measure, and accuracy. Conclusion: CNN+BLSTM is promising for identifying potential protein phosphorylation sites for further experimental validation.
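
The prediction task in the abstract above starts from the S, T and Y residues of a protein sequence. The sketch below shows one common way to turn a sequence into per-site inputs by cutting a fixed-width window around each candidate residue. The abstract does not state how the CNN+BLSTM inputs are built, so the window approach, the `flank` width, the padding character and the example sequence are all assumptions made for illustration.

```python
def candidate_windows(sequence, flank=7, pad="X"):
    """Yield (position, residue, window) for every S, T or Y in the sequence.

    `flank` residues are taken on each side of the candidate; positions that
    fall outside the sequence are filled with `pad`. Positions are 1-based.
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue in "STY":
            # Index i in `sequence` sits at index i + flank in `padded`,
            # so this slice is centred on the candidate residue.
            yield i + 1, residue, padded[i:i + 2 * flank + 1]


# Hypothetical five-residue sequence with a short flank for readability.
windows = list(candidate_windows("MKSAY", flank=2))
```

Each window has the same length, which makes it straightforward to encode the windows as equally sized inputs for a sequence model.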
  

- A Sequential Ensemble Model for Communicable Disease Forecasting
  
  Authors: Nashreen Sultana, Nonita Sharma, Krishna P. Sharma and Shobhit Verma
  
  https://doi.org/10.2174/1574893614666191202153824
  
  Background: Ensemble building is a popular method for improving model accuracy in both classification and regression problems. Objective: In this work, we propose a sequential ensemble model to predict the incidence of communicable diseases such as influenza, hand, foot and mouth disease (HFMD), and diarrhea, and compare it with existing prediction models. Methods: Weekly datasets for the three diseases, namely influenza, HFMD, and diarrhea, were collected from the official government site of Hong Kong for the years 2010 to 2018. The data were preprocessed with a log transformation followed by a z-score transformation. The proposed sequential ensemble model was applied to the processed dataset to predict future occurrences. Results: The proposed ensemble model was compared against standard support vector regression (SVR) using error metrics such as root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). On all three disease datasets, the proposed ensemble model gives better results than the standard SVR model. Conclusion: The main objective of this work is to minimize the prediction error, and the proposed sequential ensemble model shows a significant reduction in prediction errors.
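
The preprocessing steps and error metrics named in the abstract above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' pipeline: the `+ 1` inside the logarithm (to guard against zero counts), the use of the population standard deviation, and the example numbers are all choices made here.

```python
import math


def log_zscore(values):
    """Log-transform counts, then standardise to zero mean and unit variance."""
    logged = [math.log(v + 1) for v in values]  # + 1 is an assumption for zero counts
    mean = sum(logged) / len(logged)
    # Population standard deviation; a constant series would give std == 0.
    std = math.sqrt(sum((x - mean) ** 2 for x in logged) / len(logged))
    return [(x - mean) / std for x in logged]


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up weekly counts and forecasts, for illustration only.
actual = [100, 200, 400]
predicted = [110, 190, 380]
scaled = log_zscore(actual)
errors = (rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE penalises large misses more heavily than MAE, while MAPE is scale-free, which is one reason to report all three when comparing forecasts across diseases with very different case counts.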
  

- SimExact – An Efficient Method to Compute Function Similarity Between Proteins Using Gene Ontology
  
  Authors: Najmul Ikram, Muhammad A. Qadir and Muhammad Tanvir Afzal
  
  https://doi.org/10.2174/1574893614666191017092842
  
  Background: The rapidly growing protein and annotation databases necessitate the development of efficient tools to process this valuable information. Biologists frequently need to find proteins similar to a given protein, for which BLAST tools are commonly used. With the development of biomedical ontologies such as the Gene Ontology, methods were designed to measure function (semantic) similarity between two proteins. These methods work well on protein pairs but are not suitable for protein query processing. Objective: Our aim is to make it possible to search for similar proteins in an acceptable time. Methods: A novel method, SimExact, is proposed for high-speed searching of functionally similar proteins. Results: The experiments in this study show that SimExact gives the correct results required for protein searching. A fully functional prototype of an online tool (www.datafurnish.com/protsem.php) is provided that generates a ranked list of the proteins similar to a query protein, with a response time of less than 20 seconds in our setup. SimExact was also used to search for protein pairs with a high disparity between function similarity and sequence similarity. Conclusion: SimExact makes such searches practical; they would otherwise not be possible in a reasonable time.
  

- Identification of Novel Key Targets and Candidate Drugs in Oral Squamous Cell Carcinoma
  
  Authors: Juan Liu, Xinjie Lian, Feng Liu, Xueling Yan, Chunyan Cheng, Lijia Cheng, Xiaolin Sun and Zheng Shi
  
  https://doi.org/10.2174/1574893614666191127101836
  
  Background: Oral Squamous Cell Carcinoma (OSCC) is the most common malignant epithelial neoplasm. It ranks among the 10 most frequent cancers and has a poor prognosis and low survival rates. New therapeutic strategies are therefore needed to improve the survival of patients with OSCC. Objective: Since targeted therapy is considered the most promising therapeutic strategy in cancer, it is of great significance to identify novel targets and drugs for the treatment of OSCC. Methods: A series of bioinformatics approaches was used to identify hub proteins and potential agents against them. Microarray analysis and several online functional network analyses were first used to recognize drug targets in OSCC. Subsequently, molecular docking was used to screen potential drugs for these targets from the Specs chemistry database. A ligand-based virtual screening model was also assessed. Results: Two microarray datasets (GSE31056 and GSE23558) were first selected and analyzed to obtain 681 consensus candidate genes. We then selected 33 candidate genes that encode kinases or transcription factors and clustered the candidate hub targets by function and signaling pathway, using enrichment analysis with the DAVID and STRING online databases. Next, the core PPI network was identified, and we manually selected GRB2 and IGF1 as the key drug targets according to the network analysis and previous references. Lastly, virtual screening was performed to identify small molecules that could target these two proteins; such molecules can serve as promising candidate agents for future drug development. Conclusion: In summary, our study may provide novel insights into the underlying molecular events of OSCC, and the candidate targets and agents identified here could be used as promising therapeutic strategies for the treatment of OSCC.
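
The first two filtering steps in the abstract above (consensus genes across two datasets, then a shortlist of kinases and transcription factors) can be pictured as set operations. The sketch assumes that "consensus" means the intersection of the two candidate lists, which the abstract does not state; the function name and the gene identifiers are placeholders, not results from GSE31056 or GSE23558.

```python
def consensus_candidates(genes_a, genes_b, regulators):
    """Return (consensus, shortlisted) gene lists, sorted for reproducibility.

    consensus:   genes reported as candidates in both datasets (assumed rule).
    shortlisted: consensus genes that also appear in `regulators`, e.g. a
                 combined list of kinases and transcription factors.
    """
    consensus = set(genes_a) & set(genes_b)
    shortlisted = consensus & set(regulators)
    return sorted(consensus), sorted(shortlisted)


# Placeholder identifiers only.
dataset_a = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
dataset_b = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]
kinases_and_tfs = ["GENE_C", "GENE_E"]
consensus, shortlisted = consensus_candidates(dataset_a, dataset_b, kinases_and_tfs)
```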
  

- A Novel Integrative Approach for Non-coding RNA Classification Based on Deep Learning
  
  Authors: Abdelbasset Boukelia, Anouar Boucheham, Meriem Belguidoum, Mohamed Batouche, Farida Zehraoui and Fariza Tahi
  
  https://doi.org/10.2174/1574893614666191105160633
  
  Background: Molecular biomarkers offer new ways to understand many disease processes. Non-coding RNAs (ncRNAs) play a crucial role as biomarkers in several cellular activities that are highly correlated with many human diseases, especially cancer. The classification and identification of ncRNAs have become a critical issue because of their applications, for example as biomarkers in many human diseases. Objective: Most existing computational tools for ncRNA classification handle only one type of ncRNA. They are based on structural information or specific known features, and they suffer from a lack of significant and validated features. The performance of these methods is therefore not always satisfactory. Methods: We propose a novel approach named imCnC for ncRNA classification based on multisource deep learning, which integrates several data sources, such as genomic and epigenomic data, to identify several ncRNA types. We also propose an optimization technique that visualizes the feature patterns extracted by the multisource CNN model in order to measure the epigenomic features of each ncRNA type. Results: Computational results on a dataset of 16 human ncRNA classes downloaded from RFAM show that imCnC outperforms the existing tools, achieving an accuracy of 94.18%. In addition, our method enables the discovery of new ncRNA features by using an optimization technique to measure and visualize the feature patterns of the imCnC classifier.
  

- A Machine Learning-based Diagnosis of Thyroid Cancer Using Thyroid Nodules Ultrasound Images
  
  Authors: Xuesi Ma, Baohang Xi, Yi Zhang, Lijuan Zhu, Xin Sui, Geng Tian and Jialiang Yang
  
  https://doi.org/10.2174/1574893614666191017091959
  
  Background: Ultrasound is one of the routine tests for diagnosing thyroid cancer. Diagnostic accuracy depends largely on the correct interpretation of ultrasound images of thyroid nodules. However, image recognition by the human eye is usually subjective and sometimes error-prone, especially for less experienced doctors, which creates a need for computer-aided diagnostic systems. Objective: To the best of our knowledge, there is no well-maintained ultrasound image database for the Chinese population. In addition, although several computational methods exist for image-based thyroid cancer detection, a comparison among them is missing. Finally, the effects of factors such as the choice of distance measure have not been assessed. The study aims to address these limitations and proposes a highly accurate image-based thyroid cancer diagnosis system that can better assist doctors. Methods: We first establish a novel thyroid nodule ultrasound image database consisting of 508 images collected from the Third Hospital of Hebei Medical University in China. Clinical information for the patients is also collected from the hospital; 415 patients were diagnosed as benign and 93 as malignant by doctors following a standard diagnosis procedure. We develop and apply five machine learning methods to the dataset: deep neural network, support vector machine, the center clustering method, k-nearest neighbor, and logistic regression. Results: Experimental results show that the deep neural network outperforms the other diagnosis methods, with an average cross-validation accuracy of 0.87 over 10 runs. We also explore the performance of four image distance measures, namely the Euclidean, Manhattan, Chebyshev and Minkowski distances, among which the Chebyshev distance performs best. The resource can be used directly to aid doctors in thyroid cancer diagnosis and treatment. Conclusions: The paper establishes a novel thyroid nodule ultrasound image database and develops a highly accurate image-based thyroid cancer diagnosis system that can better assist doctors in the diagnosis of thyroid cancer.
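
The four distance measures compared in the abstract above have simple textbook definitions, sketched here for two equal-length feature vectors. How the paper turns an ultrasound image into a vector is not stated, so treating images as plain lists of numbers is an assumption, as are the function names and the example vectors.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p (p >= 1) between two vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan(a, b):
    """Manhattan (L1) distance: Minkowski with p = 1."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean (L2) distance: Minkowski with p = 2."""
    return minkowski(a, b, 2)


def chebyshev(a, b):
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


# Two made-up feature vectors.
u, v = [0.0, 0.0], [3.0, 4.0]
distances = {
    "manhattan": manhattan(u, v),
    "euclidean": euclidean(u, v),
    "chebyshev": chebyshev(u, v),
    "minkowski_p3": minkowski(u, v, 3),
}
```

The Chebyshev distance is the limit of the Minkowski distance as p grows, so for any pair of vectors the values shrink from Manhattan through Euclidean towards Chebyshev.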
  

- Application of a Deep Matrix Factorization Model on Integrated Gene Expression Data
  
  Authors: Yong-Jing Hao, Mi-Xiao Hou, Ying-Lian Gao, Jin-Xing Liu and Xiang-Zhen Kong
  
  https://doi.org/10.2174/1574893614666191017094331
  
  Background: Non-negative Matrix Factorization (NMF) has been used extensively on gene expression data. However, most NMF-based methods have single-layer structures, which may perform poorly on complex data. Deep learning, with its carefully designed hierarchical structure, has shown significant advantages in learning data features. Objective: The aims are to discover differentially expressed genes in gene expression data and to obtain better sample clustering results, which can provide reference value for the prevention and treatment of cancer. Method: In this paper, we apply a deep NMF method called Deep Semi-NMF to integrated gene expression data. In each layer, the coefficient matrix is directly decomposed into the basis and coefficient matrices of the next layer. We apply this factorization model to The Cancer Genome Atlas (TCGA) genomic data. Results: The experimental results demonstrate the superiority of the Deep Semi-NMF method in identifying differentially expressed genes and clustering samples. Conclusion: The Deep Semi-NMF model decomposes a matrix into multiple factor matrices whose product approximates the original matrix. It can improve the clustering performance of samples while uncovering more accurate key genes for disease treatment.
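
The layer-by-layer decomposition described in the abstract above can be written out as follows. The symbols are chosen here for illustration and do not come from the abstract: $X$ is the data matrix, $Z_i$ the basis matrix of layer $i$, $H_i$ its coefficient matrix, and $m$ the number of layers. The non-negativity constraint on the coefficient matrices follows the usual Semi-NMF convention and is an assumption about the authors' formulation.

```latex
\begin{aligned}
X       &\approx Z_1 H_1, \\
H_1     &\approx Z_2 H_2, \\
        &\;\;\vdots \\
H_{m-1} &\approx Z_m H_m, \\
\text{so}\quad X &\approx Z_1 Z_2 \cdots Z_m H_m, \qquad H_i \ge 0 \ \text{for } i = 1, \dots, m .
\end{aligned}
```

Under this reading, the final coefficient matrix $H_m$ is what would be used to cluster samples, while the product of the basis matrices maps that low-dimensional representation back to gene space.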
  

- ConvsPPIS: Identifying Protein-protein Interaction Sites by an Ensemble Convolutional Neural Network with Feature Graph
  
  Authors: Huaixu Zhu, Xiuquan Du and Yu Yao
  
  https://doi.org/10.2174/1574893614666191105155713
  
  Background/Objective: Protein-protein interactions are essential for most cellular processes, so unveiling how proteins interact is a crucial question, and it can be better understood by recognizing which residues participate in the interaction. Although many computational approaches have been proposed to predict interface residues, their feature perspective and model learning ability are not sufficient to achieve ideal results. Our objective is therefore to improve predictive performance by considering a new feature perspective and a new learning algorithm. Method: In this study, we propose an ensemble deep convolutional neural network that explores the context and positional context of consecutive residues within a protein sub-sequence. Specifically, unlike the feature view of previous methods, ConvsPPIS constructs a separate feature graph for each of the evolutionary, physicochemical, and structural protein characteristics. Three independent deep convolutional neural networks are then trained, one on each type of feature graph, to learn the underlying patterns in the sub-sequence. Lastly, we integrate the three deep networks into an ensemble predictor that leverages the complementary information of those features to predict potential interface residues. Results: Comparative experiments were conducted using 10-fold cross-validation. The results indicate that ConvsPPIS achieves superior performance on the DBv5-Sel dataset, with an accuracy of 88%. Additional experiments on the CAPRI-Alone dataset demonstrate that ConvsPPIS also has better prediction performance there. Conclusion: The ConvsPPIS method provides a new perspective for capturing protein feature expression when identifying protein-protein interaction sites. The results demonstrate the superiority of this method.
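
The last step in the abstract above combines three per-view networks into one ensemble predictor. The abstract does not say how the outputs are combined, so the sketch below uses plain averaging of per-residue probabilities (soft voting) as a stand-in; the function names, the 0.5 threshold and the scores are hypothetical.

```python
def ensemble_scores(per_view_scores):
    """Average per-residue probabilities across several predictors (soft voting)."""
    n_views = len(per_view_scores)
    return [sum(column) / n_views for column in zip(*per_view_scores)]


def predict_interface(per_view_scores, threshold=0.5):
    """Label a residue as interface when its averaged score reaches the threshold."""
    return [score >= threshold for score in ensemble_scores(per_view_scores)]


# Made-up probabilities for three residues from three feature views.
evolutionary = [0.9, 0.2, 0.6]
physicochemical = [0.7, 0.4, 0.4]
structural = [0.8, 0.3, 0.8]
views = [evolutionary, physicochemical, structural]
labels = predict_interface(views)
```

Averaging lets a confident prediction from one feature view offset a weak signal from another, which is the sense in which the views can be complementary.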
  

Most Cited

- A Review of Ensemble Methods in Bioinformatics
  
  Authors: Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert Y. Zomaya
- Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis
  
  Authors: Masahiro Sugimoto, Masato Kawakami, Martin Robert, Tomoyoshi Soga and Masaru Tomita
- Distance-based Support Vector Machine to Predict DNA N6-methyladenine Modification
  
  Authors: Haoyu Zhang, Quan Zou, Ying Ju, Chenggang Song and Dong Chen
- A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods
  
  Authors: Jun Zhang and Bin Liu
- Molecular Genetic Markers: Discovery, Applications, Data Storage and Visualisation
  
  Authors: Chris Duran, Nikki Appleby, David Edwards and Jacqueline Batley
- A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization
  
  Authors: Wuritu Yang, Xiao-Juan Zhu, Jian Huang, Hui Ding and Hao Lin
- Cancer Diagnosis Through IsomiR Expression with Machine Learning Method
  
  Authors: Zhijun Liao, Dapeng Li, Xinrui Wang, Lisheng Li and Quan Zou
- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
- The Advances and Challenges of Deep Learning Application in Biological Big Data Processing
  
  Authors: Li Peng, Manman Peng, Bo Liao, Guohua Huang, Weibiao Li and Dingfeng Xie
- Gene Expression Profile Classification: A Review
  
  Authors: Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan