Conversation
36a6296 to
4ecee01
Compare
|
So I considered adding this API when I first added the arena, but honestly I could not get it to make a big enough difference to justify it. With the approach you have in the PR I don't think it will help because the last block is not going to be big enough. If you wanted to the take approach of count up the used memory, then if the used memory fits into the last block then free everything except the last block and return, otherwise free everything and allocate a block with the sum of the previous blocks, I found that approach yields about a 3% speed win on parsing everything in ruby/ruby while retaining more memory. But I'm not sure if it's worth it. |
That's helpful context, thank you.
Ah, I see, you're not doubling the size of the blocks on every allocation, but every N allocations or so. OK, yeah, you'd have to do a block reallocation strategy here. I'd be curious whether a straight doubling strategy (or even multiplying by 3/2) would be better overall, with or without this PR. In any event, I'm happy to close this. |
This PR is half commit-as-is, half starting point for discussion.
We have some Rust analysis programs that parse a bunch of Ruby files in one go. The way things are currently structured, the arena used for one file and parse needs to be completely freed, and the next file needs to start reallocating blocks out of a completely fresh arena. These new blocks may or may not be the same blocks, and if the files are sufficiently large, we need to repeatedly allocate multiple blocks for each file. I think the same thing is true for something like Sorbet parsing a bunch of files in one go. But
pm_arena_tdoesn't offer any API functions that would let you reset the state of the allocator.It seems like it'd be useful to try and re-use arenas between parsing calls: we could immediately start reusing the biggest chunk of memory allocated thus far, and we don't need to waste time repeatedly allocating blocks of memory and freeing them. Hence this PR to add
pm_arena_resetto almost, but not quite, completely reset the arena so it can be reused.I know of at least one precedent for this kind of API: the
bumpaloRust crate offersBump#resetwhich does exactly the same thing offered here. GNU's obstacks are not quite the same thing as arena allocation, but they do allow you to reuse an obstack in roughly the same way. I imagine there are other instances, but those are the two I know about.Full disclosure: I have no stats backing it up that this would actually be faster or be so much faster that it's worthwhile having the extra code in Prism.
This PR would either require followon PRs to figure out what the Rust-side API would look like (e.g. do we need/want to
PRISM_EXPORTED_FUNCTIONpm_arena_reset? how does theruby_prism::parsemethod need to change, or what does an additional API look like? etc.), or would require some iteration, assuming that it's accepted in the first place.