Current Bioinformatics - Volume 15, Issue 1, 2020
Computational Approaches for Transcriptome Assembly Based on Sequencing Technologies
Authors: Yuwen Luo, Xingyu Liao, Fang-Xiang Wu and Jianxin Wang
Transcriptome assembly plays a critical role in studying biological properties and examining the expression levels of genomes in specific cells. It is also the basis of many downstream analyses. As sequencing becomes faster and cheaper, massive amounts of sequencing data continue to accumulate. A large number of assembly strategies based on different computational methods and experiments have been developed. How to perform transcriptome assembly efficiently, with high sensitivity and accuracy, has become a key issue. In this work, the issues with transcriptome assembly are explored based on different sequencing technologies. Specifically, transcriptome assemblies with next-generation sequencing reads are divided into reference-based assemblies and de novo assemblies. Examples from different species are used to illustrate that long reads produced by third-generation sequencing technologies can cover full-length transcripts without assembly. In addition, different transcriptome assemblies using Hybrid-seq methods and other tools are also summarized. Finally, we discuss future directions for transcriptome assembly.
-
-
-
Analysis of Germin-like Protein Genes (OsGLPs) Family in Rice Using Various In silico Approaches
Authors: Muhammad Ilyas, Muhammad Irfan, Tariq Mahmood, Hazrat Hussain, Latif-ur-Rehman, Ijaz Naeem and Khaliq-ur-Rahman
Background: Germin-like Proteins (GLPs) play an important role in responses to various stresses. Rice contains 43 GLPs, many of which remain functionally unexplored. Computational analysis can provide significant insight into their function.
Objective: To determine the structural properties, functional importance, phylogeny and expression patterns of all OsGLPs using various bioinformatics tools.
Methods: Physicochemical properties, sub-cellular localization, domain composition, N-glycosylation and phosphorylation sites, and 3D structural models of the OsGLPs were predicted using various bioinformatics tools. Functional analysis was carried out with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and Blast2GO servers. The expression profile of the OsGLPs was predicted by retrieving expression values from the tissue-specific and hormone-stressed array libraries of RiceXPro. Their phylogenetic relationship was computed using the Molecular Evolutionary Genetics Analysis (MEGA6) tool.
Results: Most of the OsGLPs are stable in the cellular environment, with prominent expression in the extracellular region (57%) and plasma membrane (33%). Besides the 3 basic cupin domains, 7 more were reported, among which NTTNKVGSNVTLINV, FLLAALLALASWQAI, and MASSSF were common to 99% of the sequences and were related to bacterial pathogenicity, peroxidase activity, and peptide signal activity, respectively. Structurally, OsGLPs are similar, but functionally they are diverse, with novel enzymatic activities of oxalate decarboxylase, lyase, peroxidase, and oxidoreductase. Expression analysis revealed prominent activities in the root, endosperm, and leaves. OsGLPs were strongly expressed in response to abscisic acid, auxin, gibberellin, cytokinin, and brassinosteroid. Phylogenetically, they showed a polyphyletic origin with a narrow genetic background of 0.05%. OsGLPs on chromosomes 3, 8, and 12 are functionally more important owing to their defensive role against various stresses through a co-expression strategy.
Conclusion: The analysis will help to utilize OsGLPs in future food programs.
-
-
-
ESDA: An Improved Approach to Accurately Identify Human snoRNAs for Precision Cancer Therapy
Authors: Yan-mei Dong, Jia-hao Bi, Qi-en He and Kai Song
Background: SnoRNAs (small nucleolar RNAs) are small RNA molecules approximately 60-300 nucleotides in length. They have been shown to play important roles in cancer occurrence and progression. It is of great clinical importance to identify new snoRNAs as quickly and accurately as possible.
Objective: A novel algorithm, ESDA (Elastically Sparse Partial Least Squares Discriminant Analysis), was proposed to improve the speed and performance of distinguishing snoRNAs from other RNAs in the human genome.
Methods: In the ESDA algorithm, to optimize the extracted information, kernel features were selected from the variables extracted from both primary sequences and secondary structures. They were then used by the SPLSDA (sparse partial least squares discriminant analysis) algorithm as input variables for training the final classification model, which distinguishes snoRNA sequences from other human RNAs. Because no prior biological knowledge is required to optimize the classification model, ESDA is a very practical method, especially for completely new sequences.
Results: 89 human H/ACA snoRNAs and 269 human C/D snoRNAs were used as positive samples and 3403 non-snoRNAs as negative samples to test the identification performance of the proposed ESDA. For H/ACA snoRNA identification, the sensitivity and specificity were as high as 99.6% and 98.8%, respectively. For C/D snoRNAs, they were 96.1% and 98.3%, respectively. Furthermore, we compared ESDA with other widely used algorithms and classifiers: SnoReport, RF (Random Forest), DWD (Distance Weighted Discrimination) and SVM (Support Vector Machine). The highest improvement in accuracy obtained by ESDA was 25.1%.
Conclusion: These results demonstrate the superior performance of ESDA and make it promising for identifying snoRNAs in the further development of precision medicine for cancers.
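The sensitivity and specificity reported for ESDA follow the usual confusion-matrix definitions. As a minimal sketch in Python, with hypothetical counts that are purely illustrative and are not taken from the paper, the two measures can be computed as follows:

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives (e.g. real snoRNAs) that are detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives (e.g. non-snoRNAs) that are rejected."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, fn = 95, 5      # positives: correctly found / missed
tn, fp = 980, 20    # negatives: correctly rejected / falsely accepted

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

With heavily imbalanced data such as the 3403 negatives used here, reporting both values matters, since high specificity alone can hide a classifier that misses most positives.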
-
-
-
Integration and Querying of Heterogeneous Omics Semantic Annotations for Biomedical and Biomolecular Knowledge Discovery
Authors: Omer Irshad and Muhammad U. G. Khan
Background: Exploring the functional aspects of biological cell systems has been a focus of research for many decades. Biologists, scientists and researchers continuously strive to unveil the mysteries of these functional aspects in order to improve health standards. To gain such understanding, rapidly growing, heterogeneous and geographically dispersed omics data need to be critically analyzed. Currently, omics data are available in different types and formats through various data access interfaces. Applications that require offline, integrated data encounter many data heterogeneity and global dispersion issues.
Objective: To facilitate such applications in particular, heterogeneous data must be collected, integrated and warehoused in a loosely coupled way, so that each molecular entity can be computationally understood independently or in association with other entities within or across the various cellular aspects.
Methods: In this paper, we propose an omics data integration schema and its corresponding data warehouse system for integrating, warehousing and presenting heterogeneous and geographically dispersed omics entities according to the cellular functional aspects.
Results & Conclusion: Such aspect-oriented data integration, warehousing and data access through graphical search, web services and application programming interfaces make our proposed integrated data schema and warehouse system better and more useful than other contemporary ones.
-
-
-
MSIT: Malonylation Sites Identification Tree
Authors: Wenzheng Bao, De-Shuang Huang and Yue-Hui Chen
Aims: Post-Translational Modifications (PTMs), which include more than 450 types, can be regarded as a fundamental mechanism of cellular regulation.
Background: Recent experiments have demonstrated that lysine malonylation is a significant process in several organisms and cells. Malonylation also plays an important role in the regulation of protein subcellular localization, stability, translocation to lipid rafts and many other protein functions.
Objective: Identifying malonylation sites will contribute to understanding the underlying molecular mechanisms. Nevertheless, existing experimental approaches are expensive and time-consuming, and can hardly keep pace with high-speed data generation. Moreover, some machine learning methods can hardly meet the accuracy required for this problem.
Methods: In this study, we propose a method named MSIT (malonylation sites identification tree), which uses amino acid residues and profile information to identify lysine malonylation sites with a tree-structured neural network at the peptide sequence level.
Results: The proposed algorithm achieves an F1 score of 0.8699 and a true positive ratio of 89.34% in E. coli. MSIT outperformed existing malonylation site identification methods and features on datasets from different species.
Conclusion: Based on these measures, MSIT can be expected to be helpful in identifying candidate malonylation sites.
-
-
-
Predicting Drug-target Interactions via FM-DNN Learning
Authors: Jihong Wang, Hao Wang, Xiaodan Wang and Huiyou Chang
Background: Identifying Drug-Target Interactions (DTIs) is a major challenge for current drug discovery and drug repositioning. Compared to traditional experimental approaches, in silico methods are fast and inexpensive. With the increase in open-access experimental data, numerous computational methods have been applied to predict DTIs.
Methods: In this study, we propose an end-to-end learning model combining a Factorization Machine and a Deep Neural Network (FM-DNN), which emphasizes both low-order (first or second order) and high-order (higher than second order) feature interactions without any feature engineering beyond the raw features. This approach combines the power of FM and DNN for feature learning in a new neural network architecture.
Results: The experimental DTI basic features include drug characteristics (609), target characteristics (1819), plus drug ID and target ID, 2430 in total. We compare 8 models, such as SVM, GBDT and WIDE-DEEP; the FM-DNN model obtains the best results, with an AUC of 0.8866 and an AUPR of 0.8281.
Conclusion: Feature engineering requires expert knowledge, and it is often difficult and time-consuming to achieve good results. FM-DNN can automatically learn a low-order expression with FM and a high-order expression with DNN. The FM-DNN model has outstanding advantages over other commonly used models.
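The low-order part of such a model is a factorization machine, which scores first-order and pairwise feature interactions. The sketch below assumes NumPy and uses small, randomly initialised parameters (`w0`, `w`, `V`, and the sizes `n` and `k` are hypothetical); it shows the standard FM score with the pairwise term computed in linear time and checks it against the explicit double loop. It is an illustration of the general FM formula, not the authors' FM-DNN implementation.

```python
import numpy as np


def fm_score(x, w0, w, V):
    """Standard factorization machine score.

    x: feature vector (n,), w0: bias, w: linear weights (n,),
    V: latent factors (n, k). The pairwise term uses the identity
    sum_{i<j} <V_i, V_j> x_i x_j
        = 0.5 * sum_f [(sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2].
    """
    linear = w0 + w @ x
    xv = x @ V                     # (k,): sum_i V_if * x_i
    x2v2 = (x ** 2) @ (V ** 2)     # (k,): sum_i V_if^2 * x_i^2
    return linear + 0.5 * np.sum(xv ** 2 - x2v2)


def fm_score_naive(x, w0, w, V):
    """Same score via the explicit O(n^2) double loop, for checking."""
    total = w0 + w @ x
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total


# Hypothetical toy sizes; the paper's feature vector has 2430 entries.
rng = np.random.default_rng(0)
n, k = 6, 3
x = rng.random(n)
w0 = 0.1
w = rng.normal(size=n)
V = rng.normal(size=(n, k))

fast = fm_score(x, w0, w, V)
slow = fm_score_naive(x, w0, w, V)
print(np.isclose(fast, slow))
```

In a combined FM and DNN architecture, a score of this kind would typically be added to the output of a deep network that shares the same input features, which is how low-order and high-order interactions end up in one prediction.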
-
-
-
The Expression Profiles of lncRNAs and Their Regulatory Network During Smek1/2 Knockout Mouse Neural Stem Cells Differentiation
Authors: Qichang Yang, Jing Wu, Jian Zhao, Tianyi Xu, Ping Han and Xiaofeng Song
Background: Previous studies indicated that the cell fate of neural stem cells (NSCs) after differentiation is determined by Smek1, one isoform of suppressor of Mek null (Smek). Smek deficiency prevents NSCs from differentiating and thus affects the development of the nervous system. In recent years, lncRNAs have been found to participate in numerous developmental and biological pathways. However, the effects of knocking out Smek on the expression profiles of lncRNAs during differentiation remain unknown.
Objective: This study explores the expression profiles of lncRNAs and their possible functions during the differentiation of Smek1/2 knockout NSCs.
Methods: We obtained NSCs from the C57BL/6J mouse fetal cerebral cortex. One group of NSCs was from wild-type mice (WT group), while the other group was from Smek1/2 knockout mice (KO group).
Results: By analyzing the RNA-Seq data, we found that the expression profiles of mRNAs and lncRNAs changed significantly after knocking out Smek1/2. Analyses indicated that the affected mRNAs are connected with the pathway network for the differentiation and proliferation of NSCs. Furthermore, we performed a co-expression network analysis on the differentially expressed mRNAs and lncRNAs, which helped reveal the possible regulatory roles of lncRNAs during differentiation after knocking out Smek1/2.
Conclusion: By comparing the WT and KO groups, we found 366 differentially expressed mRNAs and 12 differentially expressed lncRNAs. GO and KEGG enrichment analysis of these mRNAs suggested their relationship with the differentiation and proliferation of NSCs. Some of these mRNAs and lncRNAs have been verified to play regulatory roles in the nervous system. Analyses of the co-expression network also indicated the possible functions of the affected mRNAs and lncRNAs during NSC differentiation after knocking out Smek1/2.