Add support for N-terminal, protonated, disulfide-bonded cysteine #2194
Merged
j-wags merged 6 commits into openforcefield:fix-nterm-disulfide-cys on Jun 8, 2026
j-wags (Member) reviewed on Jun 8, 2026 and left a comment:
Thanks @joelaforet! This looks quite good to me, but while this is just a fork the tests won't run. I'm going to retarget this PR to put the changes from your fork into a branch, then merge it, then I'll open a second PR to merge the changes from that branch into main (which will actually run the tests).
Sorry for the runaround in this process, the tests-fail-on-forks issue is downstream of us needing to test against OpenEye.
  import requests

- r = requests.get("https://ftp.wwpdb.org/pub/pdb/data/monomers/aa-variants-v1.cif")
+ r = requests.get("https://files.wwpdb.org/pub/pdb/data/monomers/aa-variants-v1.cif")
Member
(not blocking) Oh nice, good catch!
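For anyone hitting the same resolution problem, here is a minimal sketch of trying the two wwPDB hosts in order and keeping the first one that answers. It is not the code in the utility script (which uses `requests`); it uses only the standard library, and the names `candidate_urls` and `fetch_first_reachable` are hypothetical helpers introduced for illustration.

```python
import urllib.request

# Path taken from the diff above; the host order is an assumption of this sketch.
CIF_PATH = "/pub/pdb/data/monomers/aa-variants-v1.cif"
HOSTS = ["files.wwpdb.org", "ftp.wwpdb.org"]


def candidate_urls(hosts=HOSTS, path=CIF_PATH):
    """Build one HTTPS URL per host, preserving the given order."""
    return [f"https://{host}{path}" for host in hosts]


def fetch_first_reachable(urls, opener=urllib.request.urlopen, timeout=30):
    """Return (url, body_bytes) for the first URL that can be read.

    `opener` is injectable so the function can be exercised without network
    access. DNS failures, timeouts, and HTTP errors all derive from OSError.
    """
    errors = {}
    for url in urls:
        try:
            with opener(url, timeout=timeout) as response:
                return url, response.read()
        except OSError as exc:
            errors[url] = exc
    raise RuntimeError(f"no candidate URL was reachable: {errors}")
```

Calling `fetch_first_reachable(candidate_urls())` performs a real download, so it is best kept out of unit tests; passing a fake `opener` covers the fallback logic offline.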
Merged commit 71d4f4d into openforcefield:fix-nterm-disulfide-cys
5 of 22 checks passed
Contributor, Author
Thank you @j-wags!
j-wags added a commit that referenced this pull request on Jun 8, 2026:
) (#2195)
* fix(proteins): generate n-terminal disulfide cysteine templates
* chore(proteins): update cysteine substructure data
* test(proteins): cover n-terminal cysteine disulfide ingestion
* docs: note n-terminal cysteine disulfide support
* fix(utils): use reachable wwPDB CIF endpoint
* chore(proteins): align cysteine data with full regeneration
Co-authored-by: Joe Laforet Jr. <76799675+joelaforet@users.noreply.github.com>
Fixes #2191.
This PR adds support for loading N-terminal, protonated, disulfide-bonded cysteine residues from PDB files.
Specifically, this covers a CYS residue with:
- NH3+
- SG, HG
- SG-SG disulfide connectivity

Changes made:
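- Updated template generation in utilities/make_substructure_dict/_cif_to_substructure_dict.py.
- Regenerated the cysteine substructure data in aa_residues_substructures_with_caps.json, aa_residues_substructures_explicit_bond_orders_with_caps.json, and aa_residues_substructures_explicit_bond_orders_with_caps_explicit_connectivity.json.
- Added tests covering N-terminal cysteine disulfide ingestion through Molecule.from_polymer_pdb.
- Noted the new support in docs/releasehistory.md.
- Switched the wwPDB CIF download from ftp.wwpdb.org to files.wwpdb.org, since ftp.wwpdb.org did not resolve locally but the same wwPDB file was available from files.wwpdb.org.

Validation run locally: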
mamba run -n openff-toolkit-ncy pytest openff/toolkit/_tests/test_molecule.py -k "cys_disulfide or mainchain_cyx_dipeptide or mainchain_cys_dipeptide or mainchain_cym_dipeptide"

Result:
4 passed
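The same selection can also be driven from Python, which may be handy in a notebook or a helper script. This is only a sketch: `build_k_expression` and `run_selected` are hypothetical names, and it assumes `pytest` is installed in the active environment.

```python
# Keywords copied from the -k expression in the command above.
KEYWORDS = [
    "cys_disulfide",
    "mainchain_cyx_dipeptide",
    "mainchain_cys_dipeptide",
    "mainchain_cym_dipeptide",
]


def build_k_expression(keywords):
    """Join keywords with 'or' so a test matching any one of them is selected."""
    return " or ".join(keywords)


def run_selected(test_file, keywords=KEYWORDS):
    """Run pytest on test_file with the -k filter and return its exit code."""
    import pytest  # imported lazily so the helpers above work without pytest

    return pytest.main([test_file, "-k", build_k_expression(keywords)])
```

For example, `run_selected("openff/toolkit/_tests/test_molecule.py")` from the repository root should select the same tests as the command line above.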
Additional checks:
The regenerated explicit-connectivity SMARTS library validates successfully.
The motivating protein (PDB ID 4CHA) now loads with Topology.from_pdb, giving 3519 atoms and 1 molecule.
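A rough sketch of how the 4CHA check could be reproduced. It assumes an OpenFF Toolkit build that includes this change and a locally downloaded copy of the structure; the file name `4cha.pdb` and the helper names `summarize_pdb` and `matches_expected` are hypothetical.

```python
def summarize_pdb(path):
    """Load a PDB file with Topology.from_pdb and report its atom and molecule counts."""
    # Imported lazily so the comparison helper below works without the toolkit.
    from openff.toolkit import Topology

    topology = Topology.from_pdb(path)
    return {"n_atoms": topology.n_atoms, "n_molecules": topology.n_molecules}


def matches_expected(summary, n_atoms=3519, n_molecules=1):
    """Compare a summary against the counts reported in this PR for 4CHA."""
    return summary["n_atoms"] == n_atoms and summary["n_molecules"] == n_molecules
```

With the file in place, `matches_expected(summarize_pdb("4cha.pdb"))` should be true if the local result agrees with the numbers quoted above.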