Decision Trees for Continuous Data and Conditional Mutual Information as a Criterion for Splitting Instances

Georgios Drakakis; Saadiq Moledina; Charalampos Chomenidis; Philip Doganis; Haralambos Sarimveis

doi:10.2174/1386207319666160414105217

ISSN: 1386-2073
E-ISSN: 1875-5402

Source: Combinatorial Chemistry & High Throughput Screening, Volume 19, Issue 5, Jun 2016, p. 423 - 428
Available online: 01 Jun 2016

Abstract

Decision trees are valued in the computational chemistry and machine learning communities for their interpretability. Their capacity and usage are somewhat limited by the fact that they normally work on categorical data. Improvements to known decision tree algorithms are usually made by adding and tuning parameters, or by post-processing the class assignment. In this work we attempted to tackle both of these issues. First, conditional mutual information was used as the criterion for selecting the attribute on which to split instances. The algorithm’s performance was compared with that of C4.5 (WEKA’s J48) using default parameters and no restrictions. Two datasets were used for this purpose: DrugBank compounds for HRH1 binding prediction, and predicted bioactivities of Traditional Chinese Medicine formulations for therapeutic class annotation. Second, an automated binning method for continuous data, Scott’s normal reference rule, was evaluated as a way to let any decision tree handle continuous data. It was applied to all approved drugs in DrugBank to predict the RDKit SLogP property, using the remaining RDKit physicochemical attributes as input.
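The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `scott_discretize` and `conditional_mutual_information` are hypothetical, the binning relies on NumPy's built-in `bins="scott"` estimator as a stand-in for Scott's normal reference rule (bin width of roughly 3.49 σ n^(-1/3)), and the conditioning variable is assumed to be a single, previously selected attribute, which the abstract does not specify.

```python
import math
from collections import Counter

import numpy as np


def scott_discretize(values):
    """Bin a continuous attribute with Scott's normal reference rule.

    Assumption: NumPy's "scott" estimator stands in for whatever
    binning code the paper actually used.
    Returns (labels, edges), with labels in 0 .. len(edges) - 2.
    """
    x = np.asarray(values, dtype=float)
    edges = np.histogram_bin_edges(x, bins="scott")
    # Digitize against the interior edges only, so the minimum maps to
    # bin 0 and the maximum maps to the last bin.
    labels = np.digitize(x, edges[1:-1])
    return labels, edges


def conditional_mutual_information(x, y, z):
    """Estimate I(X; Y | Z) in bits from three equal-length discrete sequences.

    Uses plug-in (frequency) estimates:
    sum over (x, y, z) of p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ).
    """
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # The 1/n normalisers cancel inside the logarithm.
        ratio = n_xyz * c_z[zi] / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        total += (n_xyz / n) * math.log2(ratio)
    return total
```

Under these assumptions, a tree builder would discretize each continuous attribute once with `scott_discretize`, then at each node score every candidate attribute `x` against the class label `y`, given an already chosen attribute `z`, and split on the highest-scoring one.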


Article Type: Research Article

Keyword(s): Cheminformatics; Chinese Medicine; Computational chemistry; Conditional Mutual Information; Decision Trees; DrugBank
