Use throughput data for Conformer RMSD benchmark #187
Open
scal444 wants to merge 7 commits into
Replace the per-mol, hardcoded-SMILES benchmark with a batch-mode bench that:

* Loads a slice of SMILES from a file (via bench_utils.load_smiles).
* Embeds one base conformer per mol in parallel and jitters via the shared bench_utils.embed_and_jitter (with add_hs so ETKDG sees a chemically reasonable graph).
* Times a single GetConformerRMSMatrixBatch call vs a serial RDKit loop.
* Sweeps confs_per_mol with a single embed run plus _slice_to_confs reuse, so every row sees the same molecule selection.
* Validates GPU output against RDKit (per-pair, with a tolerance) before timing and aborts on mismatch.
* Honors --rdkit_max_seconds for the RDKit comparison and --no-rdkit / --no-nvmolkit for mode selection.
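For readers unfamiliar with the early-exit behavior behind --rdkit_max_seconds, here is a minimal sketch of a time-budgeted serial loop. The name `run_with_budget` and its signature are hypothetical and are not taken from this PR; the only thing assumed from the diff is that the timed function returns an (elapsed, done) pair.

```python
import time


def run_with_budget(items, work, max_seconds):
    """Process items serially and stop once max_seconds has elapsed.

    Hypothetical helper for illustration only.
    Returns (elapsed_seconds, items_done).
    """
    start = time.perf_counter()
    done = 0
    for item in items:
        work(item)
        done += 1
        # The budget is checked after each item, so at least one item
        # always completes and a long item can overshoot the budget.
        if time.perf_counter() - start > max_seconds:
            break
    return time.perf_counter() - start, done


# Usage: a stand-in workload that sleeps instead of computing RMSDs.
elapsed_s, n_done = run_with_budget(range(100), lambda _: time.sleep(0.001), 0.02)
print(f"finished {n_done} items in {elapsed_s:.3f} s")
```

Because the loop may stop early, the elapsed time is only comparable across runs once it is divided by the number of items completed.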
Cosmetic only; matches the formatting style used in adjacent benches.
evasnow1992
reviewed
Jun 1, 2026
    samples = [bench_rdkit_batch(payloads, rdkit_max_seconds) for _ in range(3)]
    samples.sort(key=lambda pair: pair[0] / max(pair[1], 1))
    rdkit_time_s, rdkit_done = samples[len(samples) // 2]
    rdkit_std_s = statistics.stdev([elapsed for elapsed, _ in samples]) if len(samples) > 1 else 0.0
Collaborator
Looks like there may be an inconsistency between the median and std calculations. The median sample (rdkit_time_s) is selected after normalizing pair[0] by pair[1], so it is chosen by per-molecule time, but rdkit_std_s is still computed directly from the raw elapsed times. Should both be based on the same quantity?
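One possible way to make the two statistics consistent, sketched under assumptions: normalize each sample to per-molecule time first, then take both the median and the standard deviation from that same list. The helper name `summarize` and the sample values are hypothetical; only the (elapsed, done) pair shape comes from the diff.

```python
import statistics


def summarize(samples):
    """samples: list of (elapsed_seconds, molecules_done) pairs.

    Hypothetical helper: returns (median, stdev) of per-molecule time,
    both computed from the same normalized values.
    """
    per_mol = [elapsed / max(done, 1) for elapsed, done in samples]
    median = statistics.median(per_mol)
    std = statistics.stdev(per_mol) if len(per_mol) > 1 else 0.0
    return median, std


# Made-up samples for illustration; not measurements from this PR.
samples = [(2.0, 10), (3.0, 10), (1.0, 10)]
median_s, std_s = summarize(samples)
print(f"per-molecule median {median_s:.3f} s, std {std_s:.3f} s")
```

Note that `statistics.median` averages the two middle values for an even number of samples, whereas indexing with `len(samples) // 2` picks the upper one; with three samples the two agree.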
Before, it had been based on single SMILES. Modified to be more like our other benchmarks, and added an RDKit early exit. The early exit is a bit annoying right now because we can't use time_it; added a tracking bug.