MutaBind2 Method

MutaBind2 evaluates the change in binding affinity between proteins (or protein chains) caused by single-site and multiple-site mutations in their sequences. The predictions are based on the structure of the protein-protein complex.

The MutaBind2 models combine molecular mechanics force fields, statistical potentials and fast side-chain optimization algorithms, and are built with the random forest (RF) method. The training sets used to develop the single-mutation and multiple-mutation models include 4191 single mutations from 265 protein complexes and 1707 multiple mutations from 120 protein complexes, respectively.

The MutaBind2 structure optimization protocol. For the forward mutation datasets (ΔΔGwt→mut), the structure optimization protocol was the same as the one used in MutaBind (1). We used the BuildModel module of FoldX (2) to introduce single or multiple point mutations into the wild-type crystal structure obtained from the Protein Data Bank (PDB) (3). Next, we added missing heavy side-chain atoms and hydrogen atoms with the VMD program (4) using the topology parameters of the CHARMM36 force field (5). We then performed a 100-step energy minimization in the gas phase for both wild-type and mutant complex structures, applying harmonic restraints (with a force constant of 5 kcal mol⁻¹ Å⁻²) to the backbone atoms of all residues. The energy minimization was carried out with NAMD version 2.9 (6) using the CHARMM36 force field (5). A 12 Å cutoff distance was applied for nonbonded interactions. Lengths of hydrogen-containing bonds were constrained with the SHAKE algorithm (7).

For the reverse mutation datasets, we modeled the mutant structures with the Modeller software (8) using wild-type crystal structures as templates. To minimize the error introduced by structural modeling, only the mutated protein chain was modeled for single mutations, whereas for multiple mutations on different protein chains the whole complex was modeled. A model was discarded if the root-mean-square deviation of all aligned Cα atoms between any of the modeled chains and the template was larger than 2 Å. Then the RepairPDB module was applied to further optimize the structure, and mutations were introduced using the BuildModel module of FoldX. After that, a 1000-step energy minimization in the gas phase was carried out with NAMD for both wild-type and mutant models, with harmonic restraints (force constant of 5 kcal mol⁻¹ Å⁻²) applied to the backbone atoms of all residues. Minimization was done for the whole protein complex.
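The model-rejection rule for the reverse mutation datasets can be illustrated with a minimal Python sketch. It assumes the Cα coordinates of each modeled chain have already been paired and superposed onto the template; the function names `ca_rmsd` and `keep_model` and the input layout are hypothetical and are not part of MutaBind2 or Modeller.

```python
import math

RMSD_CUTOFF = 2.0  # Å, the threshold stated in the protocol


def ca_rmsd(model_ca, template_ca):
    """RMSD over paired Cα coordinates (assumed already aligned and superposed)."""
    if len(model_ca) != len(template_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model_ca, template_ca)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model_ca))


def keep_model(chains, cutoff=RMSD_CUTOFF):
    """chains maps a chain ID to (model_ca, template_ca).

    The model is kept only if no chain deviates from the template
    by more than the cutoff; otherwise it is discarded.
    """
    return all(ca_rmsd(m, t) <= cutoff for m, t in chains.values())


# Illustrative coordinates only, not taken from any real structure.
template = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
close_model = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0)]
far_model = [(3.0, 0.0, 0.0), (6.8, 0.0, 0.0)]
print(keep_model({"A": (close_model, template)}))
print(keep_model({"A": (close_model, template), "B": (far_model, template)}))
```

Because the rule applies to any of the modeled chains, a single chain above the cutoff is enough to reject the whole model.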

The MutaBind2 energy function includes seven distinct features for single mutations and seven for multiple mutations, and the contribution of each term to the RF model is shown in the table.

[Table: contribution of each feature to the RF model; columns: Single Mutation, Multiple Mutation]
  • ΔΔEvdw is the change in van der Waals interaction energy upon single or multiple mutation(s) (ΔΔEvdw = ΔEvdw(mut) - ΔEvdw(wt)). ΔEvdw is calculated as the difference between the van der Waals energy of the complex and that of each interacting partner using the ENERGY module of CHARMM (9). The minimized structure of the wild-type or mutant complex was used for the calculation.
  • ΔΔGsolv approximates the change in polar solvation energy upon mutation(s) (ΔΔGsolv = ΔGsolv(mut) - ΔGsolv(wt)). ΔGsolv is obtained by numerically solving the Poisson-Boltzmann (PB) equation with the PBEQ module (10) of the CHARMM program using the minimized structure of the wild-type or mutant complex. For the PB calculation, dielectric constants of ε = 2 for the protein interior and ε = 80 for the exterior aqueous environment were used.
  • ΔΔGfold is the change in stability of the protein complex upon mutation(s) (ΔΔGfold = ΔGfold(mut) - ΔGfold(wt)), where ΔGfold(mut) and ΔGfold(wt) are the unfolding free energies of the mutant and wild-type protein complexes. It is calculated with the BuildModel module of FoldX (2), which uses an empirical force field. This term may account for cases where the mutated proteins are unfolded in the unbound state and can fold only upon binding to their partner.
  • SAcom(wt) and SApart(wt) are the solvent accessible surface areas of the mutated residues in the complex and in the unbound partner, respectively. These terms are calculated with the DSSP program (11) using the crystal structure of the wild-type complex. For multiple mutations, each term is calculated as the sum of the solvent accessible surface areas of all mutated residues.
  • CS is the change in evolutionary conservation of a mutated site upon introducing mutations, calculated using the PROVEAN program (12). It accounts for the fact that a site can be evolutionarily conserved because it is important for interactions with other proteins, so that any change at this site may affect its function in a detrimental way. For multiple mutations, this term is calculated by summing CS over all mutations.
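The arithmetic behind the ΔΔ terms above can be written out as a short Python sketch. It assumes a two-partner complex and reads "difference between the energies of a complex and each interacting partner" as the complex energy minus the sum of the two partner energies; the function names and all numerical values are illustrative placeholders, not MutaBind2 code or output.

```python
def binding_term(e_complex, e_partner1, e_partner2):
    """ΔE: energy of the complex minus the energies of its two partners (assumed form)."""
    return e_complex - (e_partner1 + e_partner2)


def delta_delta(term_mut, term_wt):
    """ΔΔ term: mutant value minus wild-type value, as in ΔΔEvdw or ΔΔGsolv."""
    return term_mut - term_wt


def summed_over_sites(per_site_values):
    """For multiple mutations, per-residue terms (SA, CS) are summed over all mutated sites."""
    return sum(per_site_values)


# Placeholder energies in kcal/mol, chosen only to show the bookkeeping.
de_vdw_wt = binding_term(-120.0, -50.0, -40.0)
de_vdw_mut = binding_term(-115.0, -50.0, -40.0)
dde_vdw = delta_delta(de_vdw_mut, de_vdw_wt)
sa_part_sum = summed_over_sites([10.5, 20.0])
print(de_vdw_wt, de_vdw_mut, dde_vdw, sa_part_sum)
```

With this sign convention, a positive ΔΔEvdw means the van der Waals interaction between the partners is less favorable in the mutant than in the wild type.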

The scoring function for single mutations included an additional term, Ncont(wt), the number of contact residues between the partner in which the mutation was introduced and the other partner. A residue was defined as a contact residue if any of its heavy atoms is located within 10 Å of any heavy atom of the other partner. The scoring function for multiple mutations included an additional term, ΔEvdw(wt), calculated as the difference between the van der Waals energy of the complex and that of each interacting partner for the wild-type structure, as described above.
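The contact-residue definition can be sketched in a few lines of Python. This is an illustration under assumptions: it counts residues of the mutated partner only, treats "within 10 Å" as a distance less than or equal to 10 Å, and uses a hypothetical input layout (a dictionary of heavy-atom coordinates per residue); the source does not specify these details.

```python
import math

CONTACT_CUTOFF = 10.0  # Å, the distance stated in the text


def count_contact_residues(mutated_partner, other_partner_atoms, cutoff=CONTACT_CUTOFF):
    """Count residues with at least one heavy atom within the cutoff of the other partner.

    mutated_partner: dict mapping a residue ID to a list of heavy-atom (x, y, z) tuples.
    other_partner_atoms: list of heavy-atom (x, y, z) tuples of the other partner.
    """
    count = 0
    for atoms in mutated_partner.values():
        # A single atom pair within the cutoff makes the residue a contact residue.
        if any(math.dist(a, b) <= cutoff for a in atoms for b in other_partner_atoms):
            count += 1
    return count


# Illustrative coordinates only.
partner_a = {"A:10": [(0.0, 0.0, 0.0)], "A:50": [(30.0, 0.0, 0.0)]}
partner_b = [(5.0, 0.0, 0.0)]
print(count_contact_residues(partner_a, partner_b))
```

The brute-force pairwise loop is adequate for a sketch; for large complexes a spatial index such as a k-d tree would avoid comparing every atom pair.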

1. Li, M., Simonetti, F.L., Goncearenco, A. and Panchenko, A.R. (2016) MutaBind estimates and interprets the effects of sequence variants on protein-protein interactions. Nucleic Acids Res, 44, W494-501.
2. Guerois, R., Nielsen, J.E. and Serrano, L. (2002) Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol, 320, 369-387.
3. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N. and Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Res, 28, 235-242.
4. Humphrey, W., Dalke, A. and Schulten, K. (1996) VMD: visual molecular dynamics. Journal of Molecular Graphics, 14, 33-38, 27-38.
5. MacKerell, A.D., Bashford, D., Bellott, M., Dunbrack, R.L., Evanseck, J.D., Field, M.J., Fischer, S., Gao, J., Guo, H., Ha, S. et al. (1998) All-atom empirical potential for molecular modeling and dynamics studies of proteins. The Journal of Physical Chemistry B, 102, 3586-3616.
6. Phillips, J.C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R.D., Kale, L. and Schulten, K. (2005) Scalable molecular dynamics with NAMD. Journal of Computational Chemistry, 26, 1781-1802.
7. Hoover, W.G. (1985) Canonical dynamics: Equilibrium phase-space distributions. Phys Rev A, 31, 1695-1697.
8. Sali, A. and Blundell, T.L. (1993) Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol, 234, 779-815.
9. Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S. and Karplus, M. (1983) CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry, 4, 187-217.
10. Im, W., Beglov, D. and Roux, B. (1998) Continuum Solvation Model: computation of electrostatic forces from numerical solutions to the Poisson-Boltzmann equation. Computer Physics Communications, 111, 59-75.
11. Joosten, R.P., te Beek, T.A., Krieger, E., Hekkelman, M.L., Hooft, R.W., Schneider, R., Sander, C. and Vriend, G. (2011) A series of PDB related databases for everyday needs. Nucleic Acids Res, 39, D411-419.
12. Choi, Y., Sims, G.E., Murphy, S., Miller, J.R. and Chan, A.P. (2012) Predicting the functional effect of amino acid substitutions and indels. PLoS One, 7, e46688.

More details can be found in the paper.

Zhang N, Chen Y, Lu H, Zhao F, Alvarez RV, Goncearenco A, Panchenko AR, Li M. MutaBind2: Predicting the Impacts of Single and Multiple Mutations on Protein-Protein Interactions. iScience. PMID: 32169820

School of Biology & Basic Medical Sciences, Soochow University
199 Ren-Ai Road, Suzhou, Jiangsu, 215123 P.R. China