Use random-shuffle list shuffling instead of random >= 1.3 by jisantuc · Pull Request #169 · DataHaskell/dataframe

jisantuc · 2026-02-27T01:49:44Z

Overview

This PR downgrades random from >= 1.3 to between 1.2 and 1.3. random 1.3+ plays poorly with some packages in the ecosystem.
Instead of uniformShuffleList, it uses shuffle' from random-shuffle, which was already a test dependency.

jisantuc · 2026-02-27T01:55:40Z

Per the contributing guidelines, I tried to add a label (I assumed that was what "A tag (usually feat, documentation, refactor etc)" meant?) but it seems like I'm not allowed to do that 🤔

jisantuc · 2026-02-27T01:56:40Z

src/DataFrame/Operations/Permutation.hs


 shuffledIndices :: (RandomGen g) => g -> Int -> VU.Vector Int
-shuffledIndices pureGen k = VU.fromList (fst (uniformShuffleList [0 .. (k - 1)] pureGen))
+shuffledIndices pureGen k = VU.fromList (shuffle' [0 .. (k - 1)] k pureGen)


Pretty unclear to me how to test this. I thought about a shuffle/un-shuffle identity test, but unshuffle didn't get me very far in search results 😅

Any thoughts?

There are a couple of things I would test, then.

Test that shuffling does nothing other than shuffle: sort both the shuffled and the unshuffled indices and check that the results are equal. This ensures that shuffling only permutes the indices.

Check that shuffling with the same seed produces the same shuffle.

That's about all I can think of.

Yeah. No need to round trip. Checking that shuffling preserves length (even when there are duplicates) is probably important. Also check that different seeds give different shuffle orders.
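The checks suggested in this thread can be sketched as plain predicates. This is a minimal sketch, not the tests that were added in the PR: it assumes the random, random-shuffle, and vector packages, the names isPermutation and isDeterministic are hypothetical, and the guard for k <= 0 is not in the PR's definition (it is a defensive assumption, since shuffle' may not be defined for empty input).

```haskell
module ShuffleProps where

import Data.List (sort)
import qualified Data.Vector.Unboxed as VU
import System.Random (RandomGen, mkStdGen)
import System.Random.Shuffle (shuffle')

-- Same body as the PR's new definition, plus a guard for k <= 0.
-- The guard is an assumption of this sketch, not part of the PR.
shuffledIndices :: (RandomGen g) => g -> Int -> VU.Vector Int
shuffledIndices pureGen k
  | k <= 0 = VU.empty
  | otherwise = VU.fromList (shuffle' [0 .. (k - 1)] k pureGen)

-- Sorting the shuffled indices must give back [0 .. k - 1]:
-- nothing is lost, added, or duplicated, so length is preserved too.
isPermutation :: Int -> Int -> Bool
isPermutation seed k =
  sort (VU.toList (shuffledIndices (mkStdGen seed) k)) == [0 .. k - 1]

-- Equal seeds must give equal shuffles.
isDeterministic :: Int -> Int -> Bool
isDeterministic seed k =
  shuffledIndices (mkStdGen seed) k == shuffledIndices (mkStdGen seed) k
```

These predicates could be handed to a property-testing library or called directly with a few fixed seeds. The "different seeds give different orders" check is only probabilistic: two seeds can legitimately produce the same order, especially for small k, so it is safer to assert it for one fixed pair of seeds and a reasonably large k than as a universal property.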

Also, on second thought, the intermediate list allocation is wasteful. I'll add it as a GSoC task to implement Fisher-Yates here.

That task shouldn't be GSoC-only; it should be open to anyone! Also, mwc-random does that, I think.

Why not? It seems simple and self-contained enough, since it's a matter of reading the algorithm and implementing it.

There could be a task pipeline for people who aren't interested only in GSoC but in contributing more generally!
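For reference, the Fisher-Yates idea raised in this thread can be sketched without an intermediate list by shuffling an unboxed vector in place. This is only an illustration under stated assumptions, not code from the PR or the eventual task: fisherYatesIndices and example are hypothetical names, and the sketch assumes the vector package plus uniformR from random (available since random 1.2).

```haskell
module FisherYates where

import Control.Monad.ST (ST)
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import System.Random (RandomGen, mkStdGen, uniformR)

-- Hypothetical alternative to shuffledIndices: build [0 .. k - 1] as an
-- unboxed vector and shuffle it in place, so no list is allocated.
fisherYatesIndices :: (RandomGen g) => g -> Int -> VU.Vector Int
fisherYatesIndices gen0 k
  | k <= 0 = VU.empty
  | otherwise = VU.modify (go gen0 (k - 1)) (VU.enumFromN 0 k)
  where
    -- Walk i from the last index down to 1, swapping slot i with a
    -- uniformly chosen slot j in [0, i] (j == i leaves the slot alone).
    go :: (RandomGen g) => g -> Int -> VUM.MVector s Int -> ST s ()
    go gen i mv
      | i < 1 = pure ()
      | otherwise = do
          let (j, gen') = uniformR (0, i) gen
          VUM.swap mv i j
          go gen' (i - 1) mv

-- Example call with an arbitrary seed.
example :: VU.Vector Int
example = fisherYatesIndices (mkStdGen 2026) 10
```

The pure generator is threaded by hand through the loop, which keeps the function's signature the same shape as shuffledIndices. Whether this is actually faster than the list-based version would need benchmarking.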

jisantuc · 2026-02-27T02:27:51Z

I used this branch's package in the plot survey branch I started, with no problems other than needing to roll my own list generator with replicateM and state: jisantuc/goofing-off@7c9b310#diff-206b9ce276ab5971a2489d75eb1b12999d4bf3843b7988cbe8d687cfde61dea0R4
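A list generator of the kind described here might look like the following. This is a guess at the general shape, not the code from the linked commit: randomInts is a hypothetical name, and the sketch assumes State from the transformers package together with uniformR from random.

```haskell
module RandomList where

import Control.Monad (replicateM)
import Control.Monad.Trans.State.Strict (evalState, state)
import System.Random (mkStdGen, uniformR)

-- Hypothetical helper: n draws from the inclusive range (lo, hi).
-- `state` lifts the pure step `g -> (a, g)` into State, so each draw
-- receives the generator left behind by the previous draw.
randomInts :: Int -> (Int, Int) -> Int -> [Int]
randomInts seed range n =
  evalState (replicateM n (state (uniformR range))) (mkStdGen seed)
```

For example, randomInts 1 (0, 9) 20 yields twenty values between 0 and 9, and the same seed always yields the same list.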

mchav · 2026-02-27T03:55:01Z

@jisantuc I think there's an issue with the test module name. Once that's fixed this will be good to go.

jisantuc · 2026-02-27T04:18:27Z

@mchav Yeah, I forgot to rename the module after I understood the test naming convention a little better. Fixed now, though.

jisantuc · 2026-02-27T05:30:15Z

> Per the contributing guidelines, I tried to add a label (I assumed that was what "A tag (usually feat, documentation, refactor etc)" meant?) but it seems like I'm not allowed to do that 🤔

🤦🏻 @mchav I just noticed other commits on main -- you meant a tag in the commit header itself, like in the conventional commits style?

Ai-Ya-Ya · 2026-02-27T13:05:45Z

@mchav What's your timeline on the next Hackage release? nixpkgs CI should pull from there, the hope being it'll be unbroken.

mchav · 2026-02-27T22:17:27Z

@Ai-Ya-Ya released: https://hackage.haskell.org/package/dataframe-0.5.0.0

Use random-shuffle list shuffling instead of random >= 1.3

b127411

jisantuc added 2 commits February 26, 2026 18:59

Now with tests

7f57b1f

module rename, woops

184ae00

jisantuc added 3 commits February 26, 2026 20:46

Add new test preserving column names

7f53743

Fix sort order test

f8fdae4

Rename shuffleOnlyShuffles

b1def3d

mchav merged commit 03e34ab into DataHaskell:main Feb 27, 2026
7 checks passed

jisantuc deleted the maint/js/downgrade-random branch February 27, 2026 05:30

4 participants: jisantuc, daikonradish, mchav, Ai-Ya-Ya