Research Topics

Computational Chemistry & Drug Design

Our research area encompasses biomolecular simulation and modelling, and rational drug design. Our group develops and applies computational methods to study molecular interactions in biological systems, and uses this knowledge to design molecules which modulate targets of pharmaceutical relevance, in close multi-disciplinary collaboration with experimental researchers. We are especially interested in accounting for receptor flexibility in computer-aided drug discovery, and quantum mechanical calculations in biomolecular systems. To achieve our goals we use molecular dynamics simulations, binding free energy calculation, quantum mechanical methods, comparative modelling, high-throughput docking and chemoinformatics tools. We benefit highly from our computational resources which include high-performance computing, GPU clusters, and state-of-the-art visualization tools.

I. QM simulations in biomacromolecules
II. Biomolecular modelling
III. Computer-aided drug discovery

I. QM simulations in biomacromolecules

QM calculation of binding free energy: The MM-QMSA method

The prediction of binding free-energies is an area of active research and still a challenge in computational biophysics (Cavasotto, 2012, in press). Accurate yet efficient quantitative calculations are critical in computer-aided drug design, to study protein-protein association, and in understanding molecular recognition at an atomic level. Our Lab has introduced a QM-based end-point method for calculating binding free energy in ligand-protein systems termed MM-QMSA (originally MM/QM-COSMO), which provides an enhanced description of electrostatic interactions while retaining conceptual simplicity and computational efficiency (Anisimov and Cavasotto, 2011b). After extensive conformational sampling through MD simulations, the total energy is re-evaluated using a semi-empirical Hamiltonian within a linear-scaling framework, which allows it to perform consistent QM calculations on the whole system in an affordable time frame. Translational and rotational entropy contributions are calculated through their corresponding configurational integrals, and the internal (vibrational) entropy is evaluated through normal mode analysis. We have shown the feasibility of this approach in studying the binding of phosphopeptides to the human SH2 domain of LCK (Anisimov and Cavasotto, 2011b), and to the C-terminal domain of BRCA1 (Anisimov et al., 2011). We hypothesize that classical end-point methods could be sensitive to the lack of force-field charge transferability, since ligands and proteins each experience very different environments in the unbound and bound conditions, and thus one would need to use a different set of charges to account for this change. In contrast, QM provides a better balance between the electrostatic intermolecular interaction energy in the complex and the solvation energy calculated using a continuum solvent model.

Hydration free energies using semiempirical hamiltonians: Optimization of COSMO radii for AM1, RM1, PM3 and PM5.

Motivated by the long-term goal of simulating biomacromolecules in an aqueous environment using quantum mechanical (QM) methods, we undertook the optimization of the COnductor-like Screening MOdel (COSMO) atomic radii and atomic surface tension coefficients for different semi-empirical Hamiltonians adhering to the same computational conditions recently followed in the simulation of biomolecular systems. This optimization was achieved by reproducing experimental hydration free energies of a set consisting of 507 neutral and 99 ionic molecules (Anisimov and Cavasotto, 2011a). The calculated hydration free energies were significantly improved by introducing a multiple atomic-type scheme that reflects different chemical environments in the calculation of the non-polar contribution according to the scaled particle Claverie-Pierotti formalism. Separate radii and surface tension coefficients sets have been developed for AM1, PM3, PM5, and RM1 semi-empirical Hamiltonians. Confirmed by cross-validation, the developed implicit solvent model is characterized by an average unsigned error of ~1 kcal/mol (~0.65 for neutral molecules) in reproducing hydration free energy for the entire set of 606 small molecules including 507 neutral and 99 ionic molecules for which experimental data are available. This performance is better than our earlier calculation using force-fields (Bordner, Cavasotto and Abagyan, 2003). Hydration free energy calculation of each molecule took on average 0.5 sec on a single processor. The obtained level of accuracy of the implicit model COSMO at semi-empirical quantum mechanical level of theory is comparable to other implicit solvent models at similar or higher QM levels of theory. This enables a consistent application of a quantum mechanical level of theory from small molecules to large biological macromolecules, what expands the range of chemical and biological applications in condensed phase.

QM dynamics of charge transfer

Classical mechanics-based methods are the major theoretical tool in modern biomolecular simulations. However, their application is limited to classical systems that do not exhibit strong quantum mechanical (QM) effects, such as intermolecular charge transfer (CT). Since biological macromolecules are ionized at physiological conditions, a proper understanding of the CT effect is of the utmost importance to understand biomolecular systems in solution (Anisimov and Cavasotto, 2010). Although earlier QM studies showed the importance of QM effects, they were based on “a posteriori” QM calculations on classical trajectories. To take a further step in the application of more accurate methods to study CT effects, we performed 20 ps of full QM molecular dynamics (QM MD) of ubiquitin in an explicit solvent water droplet (~12,200 atoms), using semi-empirical PM3 and PM5 Hamiltonians (Anisimov, Bugaenko and Cavasotto, 2009). The QM MD simulation revealed dynamic CT channels in salt bridges, the ability of neutral amino acids to hold large excess charge, and the increase of CT with the number of ionized units present in the system. The ionized nature of biological systems and the resulting dynamic exchange of significant amounts of electric charge drastically differentiate biological systems from neat liquids, thus creating a new previously unaccounted dimension that requires application of QM MD to access atomic-level resolution.

Related Publications

Anisimov, V. M., Bugaenko, V. L., and Cavasotto, C. N. (2009). Quantum mechanical dynamics of charge transfer in ubiquitin in aqueous solution. ChemPhysChem 10, 3194-3196.

Anisimov, V. M., and Cavasotto, C. N. (2010). Quantum-Mechanical Molecular Dynamics of Charge Transfer. In Kinetics and Dynamics, P. Paneth, and A. Dybala-Defratyka (eds), 247-266: Springer Netherlands.

Anisimov, V. M., and Cavasotto, C. N. (2011a). Hydration Free Energies Using Semiempirical Quantum Mechanical Hamiltonians and a Continuum Solvent Model with Multiple Atomic-Type Parameters. J. Phys. Chem. B 115, 7896-7905.

Anisimov, V. M., and Cavasotto, C. N. (2011b). Quantum Mechanical Binding Free-energy Calculation for Phosphopeptide Inhibitors of the Lck SH2 Domain. J. Comput. Chem. 32, 2254-2263.

Anisimov, V. M., Ziemys, A., Kizhake, S., Yuan, Z., Natarajan, A., and Cavasotto, C. N. (2011). Computational and experimental studies of the interaction between phospho-peptides and the C-terminal domain of BRCA1. J Comput Aided Mol Des 25, 1071-1084.

Bordner, A. J., Cavasotto, C. N., and Abagyan, R. A. (2003). Direct derivation of van der Waals force field parameters from quantum mechanical interaction energies. J. Phys. Chem. B 107, 9601-9609.

Cavasotto, C. N. (2012, in press). Binding free energy calculations and scoring in small-molecule docking. In Physico-Chemical and Computational Approaches to Drug Discovery, F. J. Luque, and X. Barril (eds). London: Royal Society of Chemistry.

II. Biomolecular modelling

Modelling ligand-protein interaction

Characterizing ligand-protein interaction is important to relate structure and dynamics with biological function. Using Molecular Dynamics (MD) simulations, we assessed the molecular basis for the agonicity and antagonicity in the androgen receptor (AR) (Bisson, Abagyan and Cavasotto, 2008). A complementary approach is the use of Monte Carlo sampling in the torsional space, what we use for full flexible ligand docking (i.e., flexible ligand and receptor). This approach was used to study several problems: i) redocking of retinal into the binding site of bovine rhodopsin (Cavasotto, Orry and Abagyan, 2003); ii) structural characterization of ligand-protein complexes between non-native ligands and protein kinases PKA, p38, CDK-2 and LCK (Cavasotto and Abagyan, 2004); iii) determination of the binding mode of scalaradial to phospholipase A(2) (PLA2) (Monti et al., 2007); iv) assessment of the structural determinants of RXR antagonism (Cavasotto et al., 2004). The full-flexible docking is the foundation of the ligand-steered modeling method (see below). In studying the binding of petrosaspongiolide to phospholipase A(2), we also employed a mixed approach of Monte Carlo sampling followed by MD simulation for refinement (Monti et al., 2009). At a methodological level, current efforts are also geared towards efficient simulations with constraints (Echenique et al., 2011a; Echenique, Cavasotto and García-Risueño, 2011b).

The ligand-steered homology method (LSHM)

Homology modelling methods rely on sequence similarity to a structural template, from which the backbone structure is inherited. The quality of homology models depends on target-template sequence identity, the accuracy of the alignment, the choice and quality of the template, and the structural refinement methods (Cavasotto, 2011; Cavasotto and Phatak, 2009). It would be natural to incorporate information about known ligands in the process of modelling binding sites. In the ligand-steered homology method (LSHM) (Cavasotto et al., 2008; Orry and Cavasotto, 2006), existing ligands are used to shape and optimize the binding site during the modelling process, through a full flexible docking protocol. A ligand-steered model of the class A GPCR Melanin Concentrating Hormone receptor 1 (MCHR1) was used for high-throughput docking, and after experimental evaluation of the top-ranking molecules, 6 low-micromolar novel chemotype antagonists were found (Cavasotto et al., 2008). Later on, the LSHM was systematically benchmarked on existing GPCR crystal structures. It was found that that ligand-steered models reproduced the experimental binding modes very well, and that those models perform consistently better in HTD than unrefined homology models (Phatak, Gatica and Cavasotto, 2010). Moreover, it was found that the performance of top ligand-steered models in selectivity prediction was comparable to crystal structures (Phatak et al., 2010). The LSHM protocol was also used to rationalize the binding of antagonists (Diaz et al., 2009a) and agonists (Diaz et al., 2009b) to the cannabinoid 2 receptor.

Related Publications

Bisson, W. H., Abagyan, R., and Cavasotto, C. N. (2008). Molecular basis of agonicity and antagonicity in the androgen receptor studied by molecular dynamics simulations. J. Mol. Graphics Modell. 27, 452-458.

Cavasotto, C. N. (2011). Homology models in docking and high-throughput docking. Curr. Top. Med. Chem. 11, 1528-1534.

Cavasotto, C. N., and Abagyan, R. A. (2004). Protein flexibility in ligand docking and virtual screening to protein kinases. J. Mol. Biol. 337, 209-225.

Cavasotto, C. N., Liu, G., James, S. Y., et al. (2004). Determinants of retinoid X receptor transcriptional antagonism. J. Med. Chem. 47, 4360-4372.

Cavasotto, C. N., Orry, A. J., and Abagyan, R. A. (2003). Structure-based identification of binding sites, native ligands and potential inhibitors for G-protein coupled receptors. Proteins 51, 423-433.

Cavasotto, C. N., Orry, A. J., Murgolo, N. J., et al. (2008). Discovery of novel chemotypes to a G-protein-coupled receptor through ligand-steered homology modeling and structure-based virtual screening. J. Med. Chem. 51, 581-588.

Cavasotto, C. N., and Phatak, S. S. (2009). Homology modeling in drug discovery: current trends and applications. Drug Discovery Today 14, 676-683.

Diaz, P., Phatak, S. S., Xu, J., Astruc-Diaz, F., Cavasotto, C. N., and Naguib, M. (2009a). 6-Methoxy-N-alkyl isatin acylhydrazone derivatives as a novel series of potent selective cannabinoid receptor 2 inverse agonists: Design, Synthesis and Binding Mode Prediction. J. Med. Chem. 52, 433-444.

Diaz, P., Phatak, S. S., Xu, J., et al. (2009b). 2,3-Dihydro-1-benzofuran derivatives as a series of potent selective cannabinoid receptor 2 agonists: design, synthesis, and binding mode prediction through ligand-steered modeling. ChemMedChem 4, 1615-1629.

Echenique, P., Cavasotto, C. N., De Marco, M., Garca-Risueno, P., and Alonso, J. L. (2011a). An exact expression to calculate the derivatives of position-dependent observables in molecular simulations with flexible constraints. PLoS One 6, e24563.

Echenique, P., Cavasotto, C. N., and García-Risueño, P. (2011b). The canonical equilibrium of constrained molecular models. Eur. Phys. J. Special Topics 200, 5-54.

Monti, M. C., Casapullo, A., Cavasotto, C. N., Napolitano, A., and Riccio, R. (2007). Scalaradial, a Dialdehyde-Containing Marine Metabolite That Causes an Unexpected Noncovalent PLA(2) Inactivation. ChemBioChem 8, 1585-1591.

Monti, M. C., Casapullo, A., Cavasotto, C. N., et al. (2009). The binding mode of petrosaspongiolide M to the human group IIA phospholipase A(2): exploring the role of covalent and noncovalent interactions in the inhibition process. Chem.-Eur. J. 15, 1155-1163.

Orry, A. J. W., and Cavasotto, C. N. (2006). Ligand-docking-based homology model of the Melanin-Concentrating Hormone 1 receptor. In 231st Meeting of the American Chemical Society. Atlanta, GA.

Phatak, S. S., Gatica, E. A., and Cavasotto, C. N. (2010). Ligand-steered modeling and docking: A benchmarking study in Class A G-Protein-Coupled Receptors. J. Chem. Inf. Model. 50, 2119-2128.

III. Computer-aided Drug Discovery

Ligand and decoy sets for docking to GPCRs

Although considerable efforts are devoted to the design of chemical libraries for actual drug discovery campaigns (Cavasotto and Phatak, 2011; Orry, Abagyan and Cavasotto, 2006), the development and benchmarking of HTD protocols using either experimental structures or homology models also requires an adequate set of ligands and decoys. The necessity of the latter is dictated by the need to avoid artificial enrichment by generating sets of decoys with similar physical properties to the ligands, but chemically diverse. The first condition avoids the artificially high enrichment caused by the separation of simple physical properties, whereas the second reduces the likelihood of the decoy sets containing actual binders (which would artificially lower the enrichment). To complement existing decoy libraries, such as DUD and VDS, we compiled a GPCR ligand library (GLL) of 25,145 ligands for 147 GPCR targets, generating for each of them an associated decoy set with a 39 molecules per ligand ratio, collected in the GPCR Decoy Database (GDD) containing ~1 million molecules (Gatica and Cavasotto, 2011). Decoys were generated ensuring a ligand-decoy similarity of six physical properties, while enforcing ligand-decoy chemical dissimilarity. The performance in docking of the GDD was evaluated on 19 GPCR targets and compared to other decoy sets, finding a significant decrease in enrichment when bias-corrected decoy sets from GDD are used. The use of different charge-related physical properties in decoy set generation was also investigated. Both the GLL and GDD are freely available for the scientific community (Go to the GLL/GDD webpage).

Conformational flexibility in docking

Explicitly accounting for target flexibility in docking still constitutes a difficult challenge due to the high dimensionality of the conformational space to be sampled. This especially applies to the high-throughput scenario, where the screening of hundreds of thousands compounds takes place. The use of Receptor Ensemble Docking (RED) (Cavasotto, 2011a; Cavasotto and Singh, 2008), where multiple target conformations are used to perform docking in a sequential fashion, is a simple but powerful approach that allows to incorporate binding site structural diversity in the docking process. We have successfully used this methodology for docking onto protein kinases (Cavasotto and Abagyan, 2004) and GPCRs (Vilar et al., 2011). Whenever enough experimental structures to build a diverse ensemble are not available, normal mode analysis provides an appealing and efficient approach to in silico generate multiple receptor conformations by distortion along a combination of few low-frequency modes (relevant modes) that represent collective mid- and large-scale displacements (Cavasotto, 2012; Cavasotto, Kovacs and Abagyan, 2005; Kovacs, Cavasotto and Abagyan, 2005). In this way, the dimension of the conformational space to be sampled is heavily reduced. This methodology is especially suited to incorporate target flexibility at the backbone level.

Structure-based Drug Discovery

Although high-throughput screening is the predominant starting point for discovery programs, in silico methods have gradually made inroads by their more rational approach, to expedite the drug discovery and devel¬opment process (Phatak, Stephan and Cavasotto, 2009). In docking-based virtual screening, a chemical library of existing ligands is docked into the target binding site in a high-throughput fashion and scored with the aim to generate a sub-library rich in potential binders (Cavasotto and Orry, 2007). While high-throughput docking is plagued with both false-positives and false-negatives, there are many reasons that offset its limitations: low set-up cost, computational speed, no need of pre-existing ligand library, potential novelty of discovered ligands, possibility to incorporate at any stage of the process filters accounting for drug-likeness. Among other targets that currently are being investigated, we were successful in finding novel ligand chemotypes in the low micromolar range for protein kinase EGFR (Cavasotto et al., 2006), and the class A GPCR Melanin Concentrating Hormone 1 receptor (Cavasotto et al., 2008).

Related Publications

Cavasotto, C. N. (2011a). Handling protein flexibility in docking and high-throughput docking. In Virtual Screening. Principles, Challenges and Practical Guidelines, C. Sotriffer (ed), 245-262: Wiley-VCH Verlag.

Cavasotto, C. N. (2012). Normal mode-based approaches in receptor ensemble docking. Methods Mol. Biol. 819,157-168 (2012).

Cavasotto, C. N., and Abagyan, R. A. (2004). Protein flexibility in ligand docking and virtual screening to protein kinases. J. Mol. Biol. 337, 209-225.

Cavasotto, C. N., Kovacs, J. A., and Abagyan, R. A. (2005). Representing Receptor Flexibility in Ligand Docking through Relevant Normal Modes. J. Am. Chem. Soc. 127, 9632-9640.

Cavasotto, C. N., and Orry, A. J. (2007). Ligand Docking and Structure-based Virtual Screening in Drug Discovery. Curr. Top. Med. Chem. 7, 1006-1014.

Cavasotto, C. N., Orry, A. J., Murgolo, N. J., et al. (2008). Discovery of novel chemotypes to a G-protein-coupled receptor through ligand-steered homology modeling and structure-based virtual screening. J. Med. Chem. 51, 581-588.

Cavasotto, C. N., Ortiz, M. A., Abagyan, R. A., and Piedrafita, F. J. (2006). In silico identification of novel EGFR inhibitors with antiproliferative activity against cancer cells. Bioorg. Med. Chem. Lett. 16, 1969-1974.

Cavasotto, C. N., and Phatak, S. S. (2011). Docking methods for structure-based library design. Methods Mol. Biol. 685, 155-174.

Cavasotto, C. N., and Singh, N. (2008). Docking and High Throughput Docking: Successes and the Challenge of Protein Flexibility Curr. Comput.-Aided Drug Design 4, 221-234.

Gatica, E. A., and Cavasotto, C. N. (2011). Ligand and Decoy Sets for Docking to G Protein-Coupled Receptors. J Chem Inf Model.

Kovacs, J. A., Cavasotto, C. N., and Abagyan, R. A. (2005). Conformational Sampling of Protein Flexibility in Generalized Coordinates: Application to ligand docking. J. Comp. Theor. Nanosci. 2, 354-361.

Orry, A. J., Abagyan, R. A., and Cavasotto, C. N. (2006). Structure-based development of target-specific compound libraries. Drug Discovery Today 11, 261-266.

Phatak, S. S., Stephan, C. C., and Cavasotto, C. N. (2009). High-throughput and in silico screenings in drug discovery. Exp. Opin. Drug Discov. 4, 947-959.

Vilar, S., Ferino, G., Phatak, S. S., Berk, B., Cavasotto, C. N., and Costanzi, S. (2011). Docking-based virtual screening for GPCRs ligands: not only crystal structures but also in silico models. J. Mol. Graphics Modell. 29, 614-623.