Use throughput data for Conformer RMSD benchmark by scal444 · Pull Request #187 · NVIDIA-BioNeMo/nvMolKit

scal444 · 2026-06-01T13:14:52Z

Previously, this benchmark was based on single SMILES strings. It now follows the structure of our other benchmarks and adds an RDKit early exit. The early exit is a bit awkward right now because we can't use time_it; I filed a tracking bug for that.

Replace the per-mol, hardcoded-SMILES benchmark with a batch-mode bench that:

* Loads a slice of SMILES from a file (via bench_utils.load_smiles).
* Embeds one base conformer per mol in parallel and jitters via the shared bench_utils.embed_and_jitter (with add_hs so ETKDG sees a chemically reasonable graph).
* Times a single GetConformerRMSMatrixBatch call vs a serial RDKit loop.
* Sweeps confs_per_mol with a single embed run plus _slice_to_confs reuse, so every row sees the same molecule selection.
* Validates GPU output against RDKit (per-pair, with a tolerance) before timing and aborts on mismatch.
* Honors --rdkit_max_seconds for the RDKit comparison and --no-rdkit / --no-nvmolkit for mode selection.
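For readers unfamiliar with the early-exit pattern behind --rdkit_max_seconds, here is a minimal sketch of a time-budgeted serial loop. It is not the code in this PR: `run_with_time_budget`, `throughput`, and the `work` callback are hypothetical names, and the only assumption taken from the diff is that a run yields an (elapsed seconds, items completed) pair.

```python
import time
from typing import Callable, Sequence, Tuple


def run_with_time_budget(
    items: Sequence[object],
    work: Callable[[object], object],
    max_seconds: float,
) -> Tuple[float, int]:
    """Run work(item) serially and stop early once max_seconds has elapsed.

    Returns (elapsed_seconds, items_completed). Hypothetical helper, used
    here only to illustrate the early-exit idea.
    """
    start = time.perf_counter()
    done = 0
    for item in items:
        # Check the budget before starting the next item, so an item that
        # has been started is always finished and counted.
        if time.perf_counter() - start >= max_seconds:
            break
        work(item)
        done += 1
    return time.perf_counter() - start, done


def throughput(elapsed_s: float, done: int) -> float:
    """Items per second; 0.0 when nothing was timed."""
    return done / elapsed_s if elapsed_s > 0 else 0.0
```

Because a run may stop before all items are processed, comparisons are only meaningful as throughput (items per second) or time per item, not as raw elapsed time.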


evasnow1992 · 2026-06-01T17:08:13Z

+            samples = [bench_rdkit_batch(payloads, rdkit_max_seconds) for _ in range(3)]
+            samples.sort(key=lambda pair: pair[0] / max(pair[1], 1))
+            rdkit_time_s, rdkit_done = samples[len(samples) // 2]
+            rdkit_std_s = statistics.stdev([elapsed for elapsed, _ in samples]) if len(samples) > 1 else 0.0


There appears to be an inconsistency between the median and standard-deviation calculations. The median sample (rdkit_time_s) is selected by pair[0] normalized by pair[1], i.e. by per-molecule time, but rdkit_std_s is still computed directly from the raw elapsed times. Is that intended?
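One way to make the two statistics agree is to normalize every sample first and compute both the median and the standard deviation on the normalized values. The sketch below assumes each sample is an (elapsed_seconds, molecules_done) pair, as the diff suggests; `summarize_samples` is a hypothetical name, not a function in this PR.

```python
import statistics
from typing import List, Tuple


def summarize_samples(samples: List[Tuple[float, int]]) -> Tuple[float, float]:
    """Return (median, stdev) of per-molecule time over repeated runs.

    Each sample is assumed to be (elapsed_seconds, molecules_done). Both
    statistics are computed on the same normalized quantity.
    """
    # Guard against done == 0 the same way the diff does, with max(done, 1).
    per_mol = [elapsed / max(done, 1) for elapsed, done in samples]
    median = statistics.median(per_mol)
    # stdev needs at least two data points.
    std = statistics.stdev(per_mol) if len(per_mol) > 1 else 0.0
    return median, std
```

With runs that exit early at different molecule counts, raw elapsed times all cluster near the time budget, so a standard deviation over raw times would understate the spread in per-molecule cost.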

scal444 added 7 commits May 29, 2026 14:24

conformer_rmsd: trim docstrings to drop cross-bench rationale and nar…

390f9ff

…ration

conformer_rmsd: reflow long argparse calls onto single lines

a80a9d1

Cosmetic only; matches the formatting style used in adjacent benches.

Fix some conf benchmark things

5ba61fc

Remove some stuff

786d47e

Fix minor things

34a284b

Add validation back in

3700a3c

scal444 requested a review from evasnow1992 June 1, 2026 13:14

evasnow1992 reviewed Jun 1, 2026

scal444 wants to merge 7 commits into NVIDIA-BioNeMo:main from scal444:split/conformer-rmsd-batch
