Add `whitening_seed` parameter to MountainSort5 #4600
Merged
MountainSort5 whitens the recording before sorting, and the whitening matrix is estimated from randomly selected data chunks (`spikeinterface.preprocessing.whiten` -> `get_random_data_chunks`). Without control of the seed, sorting isn't reproducible, even on identical input.
Added a `whitening_seed` parameter (default `None`, which preserves current behavior) that is forwarded to `whiten(seed=...)`.
There is still some very small residual indeterminism between runs with identical input (on my test recording: 100% of units match, but ~0.04% of spikes differ). Not sure where it comes from -- maybe BLAS multithreading?
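To illustrate why an unseeded chunk selection makes the whitening estimate vary between runs, here is a minimal pure-Python sketch. It is a toy stand-in, not the spikeinterface implementation: `pick_chunk_starts` and `estimate_covariance` are hypothetical helpers, and the two-channel Gaussian data is made up for the example.

```python
import random


def pick_chunk_starts(num_samples, chunk_size, num_chunks, seed=None):
    """Pick random chunk start indices (hypothetical helper).

    With seed=None the generator is seeded from OS entropy, so the chunks
    change on every call; with a fixed seed they are always the same.
    """
    rng = random.Random(seed)
    return [rng.randrange(0, num_samples - chunk_size + 1) for _ in range(num_chunks)]


def estimate_covariance(traces, starts, chunk_size):
    """Channel covariance estimated only from the selected chunks."""
    samples = [traces[i] for s in starts for i in range(s, s + chunk_size)]
    n = len(samples)
    n_ch = len(samples[0])
    means = [sum(row[c] for row in samples) / n for c in range(n_ch)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in samples) / n
            for b in range(n_ch)
        ]
        for a in range(n_ch)
    ]


# Toy two-channel "recording" (made-up data, fixed so the example is repeatable).
data_rng = random.Random(123)
traces = [[data_rng.gauss(0, 1), data_rng.gauss(0, 2)] for _ in range(10_000)]

# Same seed -> same chunks -> identical covariance (and hence whitening matrix).
cov_a = estimate_covariance(traces, pick_chunk_starts(len(traces), 100, 5, seed=0), 100)
cov_b = estimate_covariance(traces, pick_chunk_starts(len(traces), 100, 5, seed=0), 100)

# Different seed -> different chunks -> slightly different covariance.
cov_c = estimate_covariance(traces, pick_chunk_starts(len(traces), 100, 5, seed=1), 100)
```

With this PR, the seed would presumably be supplied like any other sorter parameter, for example `run_sorter("mountainsort5", recording, whitening_seed=0)`; that call shape is an assumption based on how sorter parameters are usually passed, not something shown in this thread.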
Member
Looks good, thanks!
Not worried about the residual indeterminism.
Looks okay, @magland?
(The reason I'm trying to get deterministic output is so that I can evaluate whether the lossy `WavPack(bps=2.25)` compression that has negligible impact on Neuropixel sortings is also safe for some tetrode recordings. @alejoe91)
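For anyone wanting to put a number on run-to-run differences like the ones described in this PR, a rough sketch follows. It assumes each sorting is available as a plain dict mapping unit id to spike sample indices, and it counts a spike as matching only on an exact sample match. Both the data layout and the `fraction_spikes_differing` helper are assumptions made for illustration, not the method used to get the figures quoted here.

```python
def fraction_spikes_differing(run_a, run_b):
    """Fraction of spikes present in one run but not the other.

    run_a, run_b: dict mapping unit id -> iterable of spike sample indices
    (hypothetical layout). Spikes are compared per unit by exact sample index.
    """
    differing = 0
    total = 0
    for unit in set(run_a) | set(run_b):
        a = set(run_a.get(unit, ()))
        b = set(run_b.get(unit, ()))
        differing += len(a ^ b)  # spikes found in only one of the two runs
        total += len(a | b)      # all distinct spikes seen for this unit
    return differing / total if total else 0.0


# Made-up example: unit 1 has one spike shifted by 5 samples between runs.
run_1 = {0: [100, 250, 900], 1: [40, 700]}
run_2 = {0: [100, 250, 900], 1: [40, 705]}
frac = fraction_spikes_differing(run_1, run_2)
```

An exact-match comparison is the strictest choice; a tolerance of a few samples, as offered by the sorting-comparison tools in `spikeinterface.comparison`, may be more appropriate when comparing a compressed recording against the original.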