Comparing Global and Local Likelihood Score Thresholds in Multiclass Laplacian-Modified Naive Bayes Protein Target Prediction

Georgios Drakakis; Alexios Koutsoukas; Suzanne C. Brewerton; Michael J. Bodkin; David A. Evans; Andreas Bender

doi:10.2174/1386207318666150305145012

ISSN: 1386-2073
E-ISSN: 1875-5402

Source: Combinatorial Chemistry & High Throughput Screening, Volume 18, Issue 3, Jan 2015, p. 323 - 330
DOI: https://doi.org/10.2174/1386207318666150305145012
Available online: 01 Jan 2015

Abstract

The increase in publicly available bioactivity data has led to the extensive development and use of in silico bioactivity prediction algorithms. A particularly popular approach for such analyses is multiclass Naïve Bayes, whose output is commonly processed by applying empirically derived likelihood score thresholds. In this work, we describe a systematic way of deriving score cut-offs on a per-protein-target basis and compare their performance with that of global thresholds on a large scale, using both 5-fold cross-validation (ChEMBL 14, 189k ligand-protein pairs over 477 protein targets) and external validation (WOMBAT, 63k pairs, 421 targets). The individual protein target cut-offs were compared to global cut-offs ranging from -10 to 40 in score steps of 2.5. The results indicate that, in every comparison with a global threshold, individual thresholds performed equally well or better for the majority of protein targets, with the proportion of such targets ranging from 57.96% to 95%. Local thresholds are shown to behave differently for particular families of targets (CYPs, GPCRs, kinases and TFs). Furthermore, we demonstrate the discrepancy in performance that arises when moving away from the chemical space of the training dataset, using Tanimoto similarity as a metric (from 0 to 1 in steps of 0.2). Finally, the individual protein score cut-offs derived for the in silico bioactivity application used in this work are released, together with the reproducible and transferable KNIME workflows used to carry out the analysis.
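The comparison described above can be illustrated with a minimal Python sketch. The abstract does not state which performance measure was used to derive or compare cut-offs, so the F1 score here is an assumption, as are all function names, the data layout and the toy scores; the authors' released KNIME workflows remain the authoritative implementation. Only the global grid (-10 to 40 in steps of 2.5) is taken from the text.

```python
# Global cut-off grid from the abstract: -10 to 40 in steps of 2.5 (21 values).
GLOBAL_THRESHOLDS = [-10 + 2.5 * i for i in range(21)]

def f1_at(scores, labels, threshold):
    """F1 when compounds scoring >= threshold are predicted active.
    Using F1 is an assumption; the abstract does not name the metric."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def local_threshold(scores, labels):
    """Hypothetical per-target cut-off: the observed score that maximises F1
    on the data used for derivation (ties resolved towards the lowest score)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(scores, labels, t))

def fraction_local_at_least_as_good(data, global_t):
    """Fraction of targets whose local cut-off, derived on the first split,
    matches or beats the global cut-off on the held-out second split."""
    wins = 0
    for (fit_scores, fit_labels), (eval_scores, eval_labels) in data.values():
        t_local = local_threshold(fit_scores, fit_labels)
        local_f1 = f1_at(eval_scores, eval_labels, t_local)
        global_f1 = f1_at(eval_scores, eval_labels, global_t)
        if local_f1 >= global_f1:
            wins += 1
    return wins / len(data)

# Toy, made-up scores for two hypothetical targets: (derivation split, held-out split).
toy = {
    "target_A": (([1, 5, 12, 20], [0, 0, 1, 1]), ([3, 11, 14, 25], [0, 0, 1, 1])),
    "target_B": (([30, 35, 2, 4], [1, 1, 0, 0]), ([28, 33, 1], [1, 1, 0])),
}
result = fraction_local_at_least_as_good(toy, global_t=10.0)
```

Deriving the cut-off on one split and scoring it on another matters: if both steps used the same data, a local cut-off chosen this way could never lose to a global one, and the comparison would be uninformative. Repeating the last call for every value in `GLOBAL_THRESHOLDS` gives one fraction per global cut-off, which is the kind of per-threshold comparison the abstract reports.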


Article Type: Research Article

Keyword(s): Cheminformatics; in silico bioactivity prediction; likelihood score thresholds
