
Fix: edge_feat_embedding crashes on empty edge_index when normalized_features=True #50

Open
atharrva01 wants to merge 2 commits into DevoLearn:main from atharrva01:fix/edge-feat-empty-edge-index


@atharrva01
Contributor

What I found broken

Commit 6562728 fixed create_graph to produce torch.empty((2, 0)) for edge-free windows, but edge_feat_embedding was never updated to handle that. With normalized_features=True (the default), I kept hitting this:

ValueError: Found array with 0 sample(s) (shape=(0, 7))
while a minimum of 1 is required by MinMaxScaler.

The scaler blows up on a zero-row array and the entire process() call aborts. One sparse window is enough to kill the whole dataset build.
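The failure can be reproduced outside the dataset code. This is a minimal sketch, assuming scikit-learn's MinMaxScaler is what normalize_array ends up calling (the error message above suggests so); the 7-column shape is taken from that message, and the variable names are only illustrative:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# A zero-row edge-feature array, shaped like the one in the traceback above.
empty_edge_feats = np.empty((0, 7), dtype=np.float32)

raised = False
message = ""
try:
    # Fitting requires at least one sample, so this raises ValueError.
    MinMaxScaler().fit_transform(empty_edge_feats)
except ValueError as err:
    raised = True
    message = str(err)

print(raised, message)
```

Any window with no edges reaches the scaler with an array like this, which is why a single sparse window aborts the whole run.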


Why I think this matters

Biologically sparse windows (early embryonic stages, filtered subregions, transient lineage gaps) are completely normal in developmental datasets. This isn't some weird edge case; it's expected data. The prior fix gave the impression that empty edges were handled end-to-end, but the invariant broke at the very next function call. On top of that, a partial process() run leaves a corrupt .pt file on disk, which quietly wrecks experiment reproducibility.


What I changed

I added a 2-line early return at the top of edge_feat_embedding:

def edge_feat_embedding(self, x, edge_index):
    if edge_index.shape[1] == 0:
        return np.empty((0, x.shape[1]), dtype=np.float32)
    # ... rest unchanged

torch.FloatTensor handles the empty (0, F) array fine, the nan check passes cleanly on an empty tensor, and Data(edge_feat=...) accepts it without issue. Nothing changes for non-empty graphs; I made sure of that.
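A quick way to check the first two claims in isolation is sketched below. It assumes F = 7 purely for illustration and uses torch.isnan(...).any() as a stand-in for whatever nan check the dataset code performs; the Data(...) step is left out because it needs torch_geometric:

```python
import numpy as np
import torch

num_features = 7  # assumed width, for illustration only

# What the early return produces for a zero-edge window.
empty_edge_feats = np.empty((0, num_features), dtype=np.float32)

# Convert to a tensor the same way the description mentions.
edge_feat = torch.FloatTensor(empty_edge_feats)

# A nan check over an empty tensor has nothing to flag.
has_nan = bool(torch.isnan(edge_feat).any())

print(tuple(edge_feat.shape), edge_feat.dtype, has_nan)
```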

Guard against zero-edge windows before calling normalize_array,
which fails with ValueError when fit on 0-sample arrays.

Signed-off-by: atharrva01 <atharvaborade568@gmail.com>
@atharrva01
Contributor Author

Hi @devoworm, this PR fixes a crash in edge_feat_embedding where I hit a ValueError from MinMaxScaler on an empty edge_index when normalized_features=True.

Signed-off-by: atharrva01 <atharvaborade568@gmail.com>
@atharrva01
Contributor Author


Added unit tests for edge_feat_embedding in test/test_datasets1.py: one for the empty edge_index crash case (a regression guard for the fix) and one for the normal non-empty path to make sure I didn't break anything.
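For readers without the repository at hand, the shape of those two tests can be sketched roughly as follows. This is not the code in test/test_datasets1.py: the standalone edge_feat_embedding, the minmax_normalize helper, and the "absolute difference of endpoint features" edge feature are all hypothetical stand-ins, used only so the example runs with NumPy alone.

```python
import numpy as np

def minmax_normalize(arr):
    """Hypothetical stand-in for normalize_array: column-wise min-max scaling."""
    if arr.shape[0] == 0:
        # Mirror the scaler's behaviour of rejecting zero-sample input.
        raise ValueError("Found array with 0 sample(s)")
    lo = arr.min(axis=0)
    span = arr.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on constant columns
    return (arr - lo) / span

def edge_feat_embedding(x, edge_index):
    """Hypothetical standalone version of the method, including the guard."""
    if edge_index.shape[1] == 0:
        return np.empty((0, x.shape[1]), dtype=np.float32)
    src, dst = edge_index
    # Assumed edge feature: absolute difference of endpoint node features.
    feats = np.abs(x[src] - x[dst]).astype(np.float32)
    return minmax_normalize(feats)

def test_empty_edge_index():
    # Regression guard: zero edges must not reach the normalizer.
    x = np.random.rand(4, 3).astype(np.float32)
    out = edge_feat_embedding(x, np.empty((2, 0), dtype=np.int64))
    assert out.shape == (0, 3)
    assert out.dtype == np.float32

def test_non_empty_edge_index():
    # Normal path: one feature row per edge, no nans, values in [0, 1].
    x = np.array([[0, 0, 0], [1, 2, 3], [2, 4, 6]], dtype=np.float32)
    edge_index = np.array([[0, 1], [1, 2]], dtype=np.int64)
    out = edge_feat_embedding(x, edge_index)
    assert out.shape == (2, 3)
    assert not np.isnan(out).any()
    assert out.min() >= 0.0 and out.max() <= 1.0
```

Both functions follow pytest's naming convention, so saving the block to a file and running pytest on it would collect them.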
