Lmdb test fix #876

Open

Aske-Rosted wants to merge 2 commits into graphnet-team:main from Aske-Rosted:lmdb_test_fix
Conversation

@Aske-Rosted
Collaborator

The current checks are running into an error. The initial merge of #852 did not seem to have the test fail, but all subsequent PRs have had the test failing. (I also reran the test for #852 and the test fails; it is beyond me how it passed initially...)

The error comes from opening multiple connections to the same LMDB database, which does not seem to be allowed. This raises an error at

env = lmdb.open(lmdb_path, readonly=True, lock=False, subdir=True)

which is caught, so None is returned from

def get_serialization_method_name(lmdb_path: str) -> Optional[str]:

My impression is that

def _close_connection(self) -> "LMDBDataset":

is properly handled by the dataloaders, which is why this has not shown up as an issue in real use.
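The error-masking described above can be sketched as follows. Note the body is an assumption for illustration only: in graphnet the helper calls lmdb.open(lmdb_path, readonly=True, lock=False, subdir=True) directly, whereas here an opener callable is injected so the sketch runs without lmdb installed.

```python
from typing import Callable, Optional


def get_serialization_method_name(
    lmdb_path: str,
    opener: Callable[[str], dict],
) -> Optional[str]:
    """Hedged sketch of the helper described above (body is an assumption)."""
    try:
        env = opener(lmdb_path)
    except Exception:
        # A second concurrent open of the same environment lands here,
        # silently turning the error into a None return value.
        return None
    return env.get("serialization_method")


def failing_open(path: str) -> dict:
    """Simulates an open that fails because the environment is already held."""
    raise RuntimeError("environment already open")


# A failing open yields None instead of raising:
assert get_serialization_method_name("/tmp/db", failing_open) is None
# A successful open returns the stored method name:
assert (
    get_serialization_method_name(
        "/tmp/db", lambda p: {"serialization_method": "pickle"}
    )
    == "pickle"
)
```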

The fix I propose is to add a public close() method (which just calls the private _close_connection(); this is more a matter of convention than anything else) and then call close() in the test before testing the precomputed dataset version.

Another minor adjustment I made is to move the _close_connection() call to the end of the _post_init() call; this is just a small refactoring.


Copilot AI left a comment

Pull request overview

Fixes failing LMDB-related CI tests by ensuring LMDB database connections are explicitly closed between test phases, and by slightly refactoring LMDBDataset’s post-init connection handling.

Changes:

  • Add a public LMDBDataset.close() method that closes the underlying LMDB environment.
  • Update the LMDB converter/dataset test to close LMDB datasets before re-opening the same database for the precomputed-representation checks.
  • Move the _close_connection() call to the end of LMDBDataset._post_init().

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
tests/data/test_dataconverters_and_datasets.py Explicitly closes LMDB datasets during the LMDB converter test to avoid multiple concurrent opens of the same LMDB path.
src/graphnet/data/dataset/lmdb/lmdb_dataset.py Refactors post-init connection closing order and introduces a public close() method.


Comment on lines 299 to 313
@@ -310,6 +309,7 @@
dataset_from_lmdb_raw[ix].x, dataset_sqlite[ix].x
)

dataset_from_lmdb_raw.close() # Close connection
# Test 2: Check that pre-computed representation matches real-time computed

Copilot AI Mar 30, 2026

dataset_from_lmdb_raw.close() is only reached if all assertions above pass. If an assertion fails, the LMDB environment can remain open and may cause later LMDB tests in the same session to error when creating a new dataset for the same path. Consider wrapping the dataset usage in a try/finally (or using contextlib.closing) so close() is always executed.
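The reviewer's suggestion can be illustrated with contextlib.closing; the Dataset class below is a toy stand-in for LMDBDataset, used only to show that close() runs even when an assertion inside the block fails.

```python
from contextlib import closing


class Dataset:
    """Toy dataset with a close() method, standing in for LMDBDataset."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


ds = Dataset()
try:
    # contextlib.closing calls ds.close() when the block exits,
    # whether it exits normally or via an exception.
    with closing(ds):
        assert 1 == 2  # a failing assertion, as in the test loop
except AssertionError:
    pass

assert ds.closed  # close() ran despite the failed assertion
```

The same guarantee can be had with an explicit try/finally; closing() is just the more compact spelling when the object exposes a close() method.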

Comment on lines 324 to +364
dataset_from_lmdb_precomputed = LMDBDataset(
path, **opt_precomputed # type: ignore
)

assert len(dataset_from_lmdb_precomputed) == len(dataset_sqlite)

# Compare pre-computed representation with real-time computed from SQLite
for ix in range(len(dataset_sqlite)):
precomputed_event = dataset_from_lmdb_precomputed[ix]
realtime_event = dataset_sqlite[ix]

# Compare node features
assert torch.allclose(precomputed_event.x, realtime_event.x)

# Compare edge indices if they exist
if hasattr(precomputed_event, "edge_index") and hasattr(
realtime_event, "edge_index"
):
assert torch.equal(
precomputed_event.edge_index, realtime_event.edge_index
)

# Compare number of pulses
assert precomputed_event.n_pulses == realtime_event.n_pulses

# Compare truth labels
for truth_key in TRUTH.DEEPCORE:
if hasattr(precomputed_event, truth_key) and hasattr(
realtime_event, truth_key
):
precomputed_truth = getattr(precomputed_event, truth_key)
realtime_truth = getattr(realtime_event, truth_key)
if isinstance(precomputed_truth, torch.Tensor):

assert torch.allclose(
precomputed_truth,
realtime_truth.to(precomputed_truth.dtype),
)
else:
assert precomputed_truth == realtime_truth
dataset_from_lmdb_precomputed.close() # Close connection

Copilot AI Mar 30, 2026

dataset_from_lmdb_precomputed.close() is not guaranteed to run if any of the comparisons/assertions in the loop fail. To avoid leaking an open LMDB environment (which can make subsequent tests flaky), wrap the dataset lifetime in a try/finally (or contextlib.closing) so the connection is always closed.

