
Computer-assisted drug design is used to increase the chances of finding valuable drug candidates by applying a wide range of computational methods, such as machine learning, structure-activity relationships, quantitative structure-activity relationships, molecular mechanics, quantum mechanics, molecular dynamics, and drug-protein docking. Machine learning is an important field of artificial intelligence and includes a diverse set of methods and algorithms that extract rules and functions from large datasets. The most important algorithms are linear discriminant analysis, artificial neural networks, decision trees, lazy learning, k-nearest neighbors, Bayesian methods, Gaussian processes, support vector machines, and kernel algorithms. This special issue presents a representative selection of machine learning applications for the virtual screening of chemical libraries. In the opening paper, Melville, Burke and Hirst review recent applications of machine learning techniques in ranking chemical libraries based on their biological activity against a particular protein target. Applications of ligand-based similarity searching and structure-based docking are critically evaluated, with an emphasis on the major algorithms, such as decision trees, naïve Bayesian classifiers, artificial neural networks, and support vector machines. Chen et al. examine the technical aspects of ligand-based virtual screening, such as available software, molecular descriptors, and performance measures. The procedures reviewed include binary kernel discrimination, k-nearest neighbors, linear discriminant analysis, logistic regression, and probabilistic neural networks. The detailed comparison of various studies is especially valuable in providing an estimate of the level of success that may be expected in virtual screening. The comparison of various machine learning techniques is further explored by Plewczynski, Spieser and Koch in a large-scale evaluation of screening success.
Based on the biological targets explored in the literature, no machine learning approach was found to provide the best results consistently. Through careful tuning of parameters, most chemical libraries may be modeled with existing algorithms. The study found that a promising class of methods is represented by fusion (or ensemble) classifiers, which combine predictions from several models and are thus able to outperform single classifiers.
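
The fusion idea can be illustrated with a minimal sketch of majority voting, one of the simplest ways to combine predictions from several models. This is an illustration only and not the procedure used in any of the papers discussed here; the three toy classifiers, their thresholds, and the descriptor layout (molecular weight, logP, hydrogen-bond donors) are all hypothetical.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical classifier type: maps a descriptor vector to a label
# (1 = predicted active, 0 = predicted inactive).
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: Sequence[Classifier], x: Sequence[float]) -> int:
    """Return the label chosen by most classifiers; ties go to inactive (0)."""
    votes = Counter(clf(x) for clf in classifiers)
    return 1 if votes[1] > votes[0] else 0


# Three toy single classifiers, each thresholding one made-up descriptor.
def by_weight(x: Sequence[float]) -> int:
    return 1 if x[0] < 500.0 else 0


def by_logp(x: Sequence[float]) -> int:
    return 1 if x[1] < 5.0 else 0


def by_donors(x: Sequence[float]) -> int:
    return 1 if x[2] <= 5 else 0


ensemble = [by_weight, by_logp, by_donors]

# Hypothetical compound: (molecular weight, logP, H-bond donors).
compound = [420.0, 5.6, 3]
prediction = majority_vote(ensemble, compound)
```

Here one toy model votes "inactive" while the other two vote "active", so the fused prediction follows the majority. Practical ensemble methods usually weight or average model scores rather than counting raw votes, but the principle of pooling several imperfect models is the same.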