An Alignment-Free Method for Classification of Protein Sequences

Sandeep Deshmukh; Sanjeet Khaitan; Debasish Das; Manish Gupta; Pramod P. Wangikar

doi:10.2174/092986607781483804

ISSN: 0929-8665
E-ISSN: 1875-5305

Source: Protein and Peptide Letters, Volume 14, Issue 7, Jul 2007, p. 647 - 657
DOI: https://doi.org/10.2174/092986607781483804
Available online: 01 Jul 2007

Abstract

Protein sequences vary in length and are not readily amenable to conventional data mining techniques, which require mapping into a fixed-dimensional space. Thus, the majority of current methods for protein sequence classification are based on alignment of the query sequence with either a single sequence or a profile of the sequence family. We present a method for mapping protein sequences into a fixed-dimensional descriptor space. Descriptors such as amino acid content and amino acid pair association rules were used along with routinely available classification methods. An experiment on one hundred Pfam families showed a classification accuracy of 98% with a support vector machine classifier. Information gain-based feature selection helped simplify the model and improve accuracy. Interestingly, a large number of the selected features were based on the association rules of glycine or aspartic acid residues, suggesting their role in the conserved loops among evolutionarily related proteins. In a further experiment, the approach was tested for classification of proteins from 39 Pfam families of protein kinases, where the support vector machine classifier provided an accuracy of approximately 96%. The method provides an alternative to conventional profile-based methods for protein sequence classification.
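
As an illustration of what mapping variable-length sequences into a fixed-dimensional space can look like, the following is a minimal sketch and not the authors' implementation. It assumes a simplified descriptor set: the 20 amino acid fractions plus the 400 adjacent dipeptide frequencies (dipeptide frequency is listed among the article's keywords). The abstract does not specify how the pair association rules are computed, so they are not reproduced here. The names `AMINO_ACIDS`, `DIPEPTIDES`, and `descriptor` are hypothetical.

```python
from itertools import product

# Assumption: only the 20 standard residues are counted; anything else is skipped.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def descriptor(seq):
    """Map a protein sequence of any length to a 420-dimensional vector.

    Positions 0-19 hold amino acid fractions; positions 20-419 hold
    frequencies of adjacent residue pairs. This is a stand-in for the
    descriptors named in the abstract, not the published method.
    """
    seq = seq.upper()
    single = dict.fromkeys(AMINO_ACIDS, 0)
    pair = dict.fromkeys(DIPEPTIDES, 0)

    # Count single residues.
    for residue in seq:
        if residue in single:
            single[residue] += 1

    # Count adjacent pairs in which both residues are standard.
    for a, b in zip(seq, seq[1:]):
        if a + b in pair:
            pair[a + b] += 1

    # Normalise so sequences of different lengths are comparable;
    # "or 1" avoids division by zero for empty or very short input.
    n_single = sum(single.values()) or 1
    n_pair = sum(pair.values()) or 1
    return ([single[a] / n_single for a in AMINO_ACIDS]
            + [pair[d] / n_pair for d in DIPEPTIDES])
```

Every sequence, whatever its length, yields a vector of the same size, so the output could in principle be passed to a standard classifier such as a support vector machine after feature selection.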

Article Type: Research Article

Keyword(s): Alignment free classification; amino acid association rules; dipeptide frequency; protein sequence classification; remote homology detection
