SAMPL6: calculation of macroscopic pK_a values from ab initio quantum mechanical free energies

Edithe Selwa; Ian M. Kenney; Oliver Beckstein; Bogdan I. Iorga

doi:10.1007/s10822-018-0138-6

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Macroscopic pK_a values were calculated for all compounds in the SAMPL6 blind prediction challenge, based on quantum chemical calculations with a continuum solvation model and a linear correction derived from a small training set. Microscopic pK_a values were derived from the gas-phase free energy difference between protonated and deprotonated forms together with the Conductor-like Polarizable Continuum Solvation Model and the experimental solvation free energy of the proton. pH-dependent microstate free energies were obtained from the microscopic pK_a values with a maximum likelihood estimator and appropriately summed to yield macroscopic pK_a values or microstate populations as a function of pH. We assessed the accuracy of three approaches to calculate the microscopic pK_a values: direct use of the quantum mechanical free energy differences, and correction of the direct values for shortcomings in the QM solvation model with either of two linear models that we independently derived from a small training set of 38 compounds with known pK_a. The predictions that were corrected with the linear models had much better accuracy [root-mean-square error (RMSE) 2.04 and 1.95 pK_a units] than the direct calculation (RMSE 3.74). Statistical measures indicate that some systematic errors remain, likely due to differences between the SAMPL6 data set and the small training set with respect to their interactions with water. Overall, the current approach provides a viable physics-based route to estimate macroscopic pK_a values for novel compounds with reasonable accuracy.
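The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the slope and intercept of the linear correction are placeholders (the fitted parameters are not given in this record), and the maximum likelihood estimation of microstate free energies is not reproduced. The sketch only uses the standard relations pK_a = ΔG / (RT ln 10) and, for one protonated microstate that can lose a proton to several deprotonated microstates, K_macro = Σ K_i.

```python
import math

# Gas constant in kcal/(mol K); the unit choice is an assumption of this sketch.
R_KCAL = 1.987204e-3


def micro_pka(delta_g_kcal, temperature=298.15):
    """Microscopic pKa from an aqueous deprotonation free energy in kcal/mol."""
    return delta_g_kcal / (R_KCAL * temperature * math.log(10))


def linear_correction(pka_direct, slope, intercept):
    """Apply a linear model to a directly computed pKa.

    slope and intercept are placeholders; the paper's fitted values
    are not stated in this record.
    """
    return slope * pka_direct + intercept


def macro_pka(micro_pkas):
    """Macroscopic pKa when one protonated microstate deprotonates into
    several alternative microstates: K_macro is the sum of the micro K_i."""
    return -math.log10(sum(10.0 ** (-p) for p in micro_pkas))


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference pKa values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, two equivalent deprotonation sites with microscopic pK_a 5.0 each give `macro_pka([5.0, 5.0])`, which is 5.0 minus log10(2), so the macroscopic value is lower than either microscopic one.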

Original language	English (US)
Pages (from-to)	1203-1216
Number of pages	14
Journal	Journal of Computer-Aided Molecular Design
Volume	32
Issue number	10
DOIs	https://doi.org/10.1007/s10822-018-0138-6
State	Published - Oct 1 2018

Keywords

Quantum chemistry
SAMPL challenge
pH
pK

ASJC Scopus subject areas

Drug Discovery
Computer Science Applications
Physical and Theoretical Chemistry

Cite this

@article{a1f4b2a7ae064154a87bbd734b698e77,

title = "SAMPL6: calculation of macroscopic pKa values from ab initio quantum mechanical free energies",

abstract = "Macroscopic pKa values were calculated for all compounds in the SAMPL6 blind prediction challenge, based on quantum chemical calculations with a continuum solvation model and a linear correction derived from a small training set. Microscopic pKa values were derived from the gas-phase free energy difference between protonated and deprotonated forms together with the Conductor-like Polarizable Continuum Solvation Model and the experimental solvation free energy of the proton. pH-dependent microstate free energies were obtained from the microscopic pKas with a maximum likelihood estimator and appropriately summed to yield macroscopic pKa values or microstate populations as function of pH. We assessed the accuracy of three approaches to calculate the microscopic pKas: direct use of the quantum mechanical free energy differences and correction of the direct values for short-comings in the QM solvation model with two different linear models that we independently derived from a small training set of 38 compounds with known pKa. The predictions that were corrected with the linear models had much better accuracy [root-mean-square error (RMSE) 2.04 and 1.95 pKa units] than the direct calculation (RMSE 3.74). Statistical measures indicate that some systematic errors remain, likely due to differences in the SAMPL6 data set and the small training set with respect to their interactions with water. Overall, the current approach provides a viable physics-based route to estimate macroscopic pKa values for novel compounds with reasonable accuracy.",

keywords = "Quantum chemistry, SAMPL challenge, pH, pK",

author = "Selwa, Edithe and Kenney, {Ian M.} and Beckstein, Oliver and Iorga, {Bogdan I.}",

note = "Publisher Copyright: {\textcopyright} 2018, Springer Nature Switzerland AG.",

year = "2018",

month = oct,

day = "1",

doi = "10.1007/s10822-018-0138-6",

language = "English (US)",

volume = "32",

pages = "1203--1216",

journal = "Journal of Computer-Aided Molecular Design",

issn = "0920-654X",

publisher = "Springer Netherlands",

number = "10",

}

TY - JOUR

T1 - SAMPL6

T2 - calculation of macroscopic pKa values from ab initio quantum mechanical free energies

AU - Selwa, Edithe

AU - Kenney, Ian M.

AU - Beckstein, Oliver

AU - Iorga, Bogdan I.

PY - 2018/10/1

Y1 - 2018/10/1

N2 - Macroscopic pKa values were calculated for all compounds in the SAMPL6 blind prediction challenge, based on quantum chemical calculations with a continuum solvation model and a linear correction derived from a small training set. Microscopic pKa values were derived from the gas-phase free energy difference between protonated and deprotonated forms together with the Conductor-like Polarizable Continuum Solvation Model and the experimental solvation free energy of the proton. pH-dependent microstate free energies were obtained from the microscopic pKas with a maximum likelihood estimator and appropriately summed to yield macroscopic pKa values or microstate populations as function of pH. We assessed the accuracy of three approaches to calculate the microscopic pKas: direct use of the quantum mechanical free energy differences and correction of the direct values for short-comings in the QM solvation model with two different linear models that we independently derived from a small training set of 38 compounds with known pKa. The predictions that were corrected with the linear models had much better accuracy [root-mean-square error (RMSE) 2.04 and 1.95 pKa units] than the direct calculation (RMSE 3.74). Statistical measures indicate that some systematic errors remain, likely due to differences in the SAMPL6 data set and the small training set with respect to their interactions with water. Overall, the current approach provides a viable physics-based route to estimate macroscopic pKa values for novel compounds with reasonable accuracy.

KW - Quantum chemistry

KW - SAMPL challenge

KW - pH

KW - pK

UR - http://www.scopus.com/inward/record.url?scp=85051703600&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85051703600&partnerID=8YFLogxK

U2 - 10.1007/s10822-018-0138-6

DO - 10.1007/s10822-018-0138-6

M3 - Article

C2 - 30084080

AN - SCOPUS:85051703600

SN - 0920-654X

VL - 32

SP - 1203

EP - 1216

JO - Journal of Computer-Aided Molecular Design

JF - Journal of Computer-Aided Molecular Design

IS - 10

ER -