Charge constraints

The charge constraints used to fit charges to the ESPs can have a substantial effect on the resulting charges. You can both add manual charge constraints yourself (described in the section below, “Custom charge constraints”) and set general options.

The key options are:

  • symmetric_methyls (default: True)

  • symmetric_methylenes (default: True)

  • symmetric_atoms_are_equivalent (default: False)

  • split_conformers (default: False)

  • constrain_methyl_hydrogens_between_conformers (default: False)

The first three options deal with symmetry. They constrain methyl hydrogens, methylene hydrogens, and symmetric atoms to have the same charge, respectively. Symmetric atoms are determined from the graph representation only, rather than from 3D coordinates. It is generally a good idea to have them all as True, although in some cases where the 3D coordinates differentiate symmetric groups, you may want to keep symmetric_atoms_are_equivalent=False.

split_conformers affects how the constraint matrices are constructed. When split_conformers=False (the default), conformers are treated separately, and equivalence constraints ensure that atoms are given the same charge across multiple molecules. When split_conformers=True, all conformers are merged into one matrix so that they are essentially averaged, without equivalence constraints needed.

When conformers are split, constrain_methyl_hydrogens_between_conformers affects whether equivalence constraints are set between conformers for methyl and methylene hydrogen atoms, for the first stage of a two-stage fit. All atoms are always given equivalence constraints between conformers during the last stage of the fit. Therefore, this option is only relevant if split_conformers=True. Using split_conformers=True, constrain_methyl_hydrogens_between_conformers=False is the approach first proposed by Cornell et al., 1993 in one of the original RESP papers, where the charges methyl/ene hydrogens are free to vary in the first stage.

However, using split_conformers=False generates charges more in line with other packages, and for this reason that is the recommended default. It also saves considerably on memory because the matrices are smaller.

Custom charge constraints

PsiRESP offers two forms of constraints: a ChargeSumConstraint and a ChargeEquivalenceConstraint.

A ChargeSumConstraint constrains one or a group of atoms to a single charge. For example, this is very helpful if you are parametrizing a residue in a large molecule, and you need the cap atoms to sum to 0.

A ChargeEquivalenceConstraint constrains every atom specified to the same charge. For example, you could manually constrain all hydrogens around a single carbon to the same charge. For this reason, a ChargeEquivalenceConstraint must contain at least two atoms.

All of these are contained in ChargeConstraintOptions. ChargeConstraintOptions are passed to a Job. As ChargeConstraintOptions may contain charge constraints for molecules that are not considered in the Job, the job will create MoleculeChargeConstraints to work on. Users should typically not interact with MoleculeChargeConstraints themselves, but let the Job handle it. MoleculeChargeConstraints contains helper methods such as adding symmetry constraints for sp3 hydrogens.

Creating a constraint requires specifying the atoms that the constraint applies to. This can be done in multiple ways. For example, intra-molecular constraints can be created from the molecule and atom indices:

In [1]: import psiresp

In [2]: nme2ala2 = psiresp.Molecule.from_smiles("CC(=O)NC(C)(C)C(NC)=O")

In [3]: constraints = psiresp.ChargeConstraintOptions()

In [4]: constraints.add_charge_sum_constraint_for_molecule(nme2ala2,
   ...:                                                    charge=0,
   ...:                                                    indices=[0, 1])
   ...: 

In [5]: print(constraints.charge_sum_constraints)
[ChargeSumConstraint(atoms={Atom(index=0, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=1, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers)}, charge=0.0)]

These indices can be obtained through SMARTS matching:

In [6]: nme_smiles = "CC(=O)NC(C)(C)C([N:1]([H:2])[C:3]([H:4])([H:5])([H:6]))=O"

In [7]: nme_indices = nme2ala2.get_smarts_matches(nme_smiles)

In [8]: print(nme_indices)
[(8, 21, 9, 22, 23, 24)]

In [9]: constraints.add_charge_equivalence_constraint_for_molecule(nme2ala2,
   ...:                                                            indices=nme_indices[0])
   ...: 

Alternatively, you can pass a list of atoms. This is especially useful for inter-molecular constraints, that involve multiple molecules:

In [10]: methylammonium = psiresp.Molecule.from_smiles("C[NH3+]")

In [11]: methyl_atoms = methylammonium.get_atoms_from_smarts("C([H])([H])([H])")

In [12]: ace_atoms = nme2ala2.get_atoms_from_smarts("C([H])([H])([H])C(=O)N([H])")

In [13]: constraint_atoms = methyl_atoms[0] + ace_atoms[0]

In [14]: constraints.add_charge_sum_constraint(charge=0, atoms=constraint_atoms)

In [15]: constraints.charge_sum_constraints[-1]
Out[15]: ChargeSumConstraint(atoms={Atom(index=14, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=3, molecule=Molecule(name=CH6N, charge=1) with 0 conformers), Atom(index=0, molecule=Molecule(name=CH6N, charge=1) with 0 conformers), Atom(index=1, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=0, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=13, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=2, molecule=Molecule(name=CH6N, charge=1) with 0 conformers), Atom(index=3, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=12, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=2, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers), Atom(index=4, molecule=Molecule(name=CH6N, charge=1) with 0 conformers), Atom(index=11, molecule=Molecule(name=C7H14N2O2, charge=0) with 0 conformers)}, charge=0.0)

You can also indirectly add constraints with the symmetric_methylenes and symmetric_methyls terms. These add a ChargeEquivalenceConstraint for the appropriate hydrogens.

Note

For now, detecting sp3 carbons requires accurate chemical perception. For reliable symmetry detection, it is highly advisable to create Molecules from SMILES, RDKit molecules, or QCElemental molecules with the connectivity specified.

While the actual constraints are not created in ChargeConstraintOptions, they are specified in MoleculeChargeConstraints. MoleculeChargeConstraints are created by a Job; users should not typically create their own or interact with it. They contain methods for detecting and merging redundant constraints. For example, we create constraint options where a constraint for nme2ala2 is added twice, and a constraint is added that includes atoms from both nme2ala2 and methylammonium:

In [16]: constraints = psiresp.ChargeConstraintOptions(symmetric_methyls=True,
   ....:                                               symmetric_methylenes=True)
   ....: 

# add this constraint twice
In [17]: constraints.add_charge_sum_constraint_for_molecule(nme2ala2,
   ....:                                                    indices=nme_indices[0])
   ....: 

In [18]: constraints.add_charge_sum_constraint_for_molecule(nme2ala2,
   ....:                                                    indices=nme_indices[0])
   ....: 

# add constraint with both nme2ala2 and methylammonium
In [19]: constraints.add_charge_sum_constraint(charge=0, atoms=constraint_atoms)

In [20]: print(len(constraints.charge_sum_constraints))
3

In [21]: print(len(constraints.charge_equivalence_constraints))
0

When we create MoleculeChargeConstraints with only the nme2ala2 molecule, the redundant constraint is removed:

In [22]: mol_constraints = psiresp.charge.MoleculeChargeConstraints.from_charge_constraints(
   ....:                     constraints,
   ....:                     molecules=[nme2ala2],
   ....:                     )
   ....: 
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [22], in <cell line: 1>()
----> 1 mol_constraints = psiresp.charge.MoleculeChargeConstraints.from_charge_constraints(
      2                     constraints,
      3                     molecules=[nme2ala2],
      4                     )

File ~/checkouts/readthedocs.org/user_builds/psiresp/conda/v0.4.2/lib/python3.8/site-packages/psiresp-0.4.2+0.gd7d2919.dirty-py3.8.egg/psiresp/charge.py:363, in MoleculeChargeConstraints.from_charge_constraints(cls, charge_constraints, molecules)
    351 eqvs = [constr.copy(deep=True)
    352         for constr in charge_constraints.charge_equivalence_constraints
    353         if constr.molecule_set & molecule_set]
    355 dct = {
    356     k: getattr(charge_constraints, k)
    357     for k in [
   (...)
    360     ]
    361 }
--> 363 constraints = cls(
    364     charge_sum_constraints=sums,
    365     charge_equivalence_constraints=eqvs,
    366     molecules=molecules,
    367     **dct
    368 )
    370 accepted = []
    371 if charge_constraints.symmetric_methyls:

File ~/checkouts/readthedocs.org/user_builds/psiresp/conda/v0.4.2/lib/python3.8/site-packages/psiresp-0.4.2+0.gd7d2919.dirty-py3.8.egg/psiresp/base.py:42, in Model.__init__(__pydantic_self__, **data)
     40 __pydantic_self__.__pre_init__(**data)  # lgtm[py/init-calls-subclass]
     41 super().__init__(**data)
---> 42 __pydantic_self__.__post_init__(**data)  # lgtm[py/init-calls-subclass]

File ~/checkouts/readthedocs.org/user_builds/psiresp/conda/v0.4.2/lib/python3.8/site-packages/psiresp-0.4.2+0.gd7d2919.dirty-py3.8.egg/psiresp/charge.py:308, in MoleculeChargeConstraints.__post_init__(self, **kwargs)
    306 self.clean_charge_sum_constraints()
    307 self.clean_charge_equivalence_constraints()
--> 308 self._generate_molecule_increments()

File ~/checkouts/readthedocs.org/user_builds/psiresp/conda/v0.4.2/lib/python3.8/site-packages/psiresp-0.4.2+0.gd7d2919.dirty-py3.8.egg/psiresp/charge.py:324, in MoleculeChargeConstraints._generate_molecule_increments(self)
    322     self._molecule_increments[hash(mol)] = inc_list
    323 self._n_total_atoms = increment
--> 324 self._generate_edges()

File ~/checkouts/readthedocs.org/user_builds/psiresp/conda/v0.4.2/lib/python3.8/site-packages/psiresp-0.4.2+0.gd7d2919.dirty-py3.8.egg/psiresp/charge.py:328, in MoleculeChargeConstraints._generate_edges(self)
    326 def _generate_edges(self):
    327     increments = list(self._molecule_increments.values())
--> 328     self._n_molecule_atoms = np.array([x[0] for x in increments] + [self._n_total_atoms])
    329     self._edges = []
    330     for mol in self.molecules:

File ~/checkouts/readthedocs.org/user_builds/psiresp/conda/v0.4.2/lib/python3.8/site-packages/psiresp-0.4.2+0.gd7d2919.dirty-py3.8.egg/psiresp/charge.py:328, in <listcomp>(.0)
    326 def _generate_edges(self):
    327     increments = list(self._molecule_increments.values())
--> 328     self._n_molecule_atoms = np.array([x[0] for x in increments] + [self._n_total_atoms])
    329     self._edges = []
    330     for mol in self.molecules:

IndexError: list index out of range

In [23]: print(len(mol_constraints.charge_sum_constraints))
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [23], in <cell line: 1>()
----> 1 print(len(mol_constraints.charge_sum_constraints))

NameError: name 'mol_constraints' is not defined

And the sp3 equivalences are added:

In [24]: print(len(mol_constraints.charge_equivalence_constraints))
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [24], in <cell line: 1>()
----> 1 print(len(mol_constraints.charge_equivalence_constraints))

NameError: name 'mol_constraints' is not defined