Add ResizeBuffer: automatically growing buffer #54
MasonProtter merged 6 commits into MasonProtter:main
    """
        ResizeBuffer{StorageType}

    This is a simple bump allocator that could be used to store a fixed amount of memory of type
    `StorageType`, so long as `::StorageType` supports `pointer`, and `sizeof`.

    Do not manually manipulate the fields of a `ResizeBuffer` that is in use.
    """
This isn't a very accurate description, right? This thing allows overflow and will adaptively resize.
Sorry, I still have to actually fix the docs; this is mostly just copied from the `AllocBuffer`. I will address this, though!
Updated this now. I hope this is a more accurate description.
        buf::Ptr{Cvoid}
        buf_len::UInt

        offset::UInt
        max_offset::UInt

        overflow::Vector{Ptr{Cvoid}}

        function ResizeBuffer(max_size::Int = default_max_size; finalize::Bool = true)
            buf = malloc(max_size)
            buf_len = max_size
            overflow = Ptr{Cvoid}[]
            resizebuf = new(buf, buf_len, UInt(0), UInt(0), overflow)
            finalize && finalizer(free, resizebuf)
            return resizebuf
Is there a reason to use `malloc`/`free` here rather than just storing a `Vector{Memory{UInt8}}` and taking the pointers from those `Memory`s?
I'm okay with using `malloc`/`free` here; I'm just curious how conscious a decision it was.
I have to admit that it was not a very conscious decision. I was mainly following the style of the `SlabBuffer`, partly because I'm not entirely sure what is and isn't allowed in the scope of this package. (I vaguely recall that avoiding the Julia GC was at some point important for static compilation etc., but I might be wrong here.)
Thinking about it more carefully, though, I would argue that it might be beneficial to keep `malloc` and `free`. Since we already have the infrastructure for escape analysis etc. anyway, it might be nice to avoid possibly triggering the Julia GC, for example in multithreaded environments. This is just an idea, however: I have no data to back it up or to show that it is relevant at all.
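For reference, the two options discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not code from the PR: the function names `sum_with_malloc` and `sum_with_gc_storage` are hypothetical, and a `Vector{UInt8}` is used as a stand-in for `Memory{UInt8}` (which requires Julia 1.11 or later) so that the sketch also runs on older versions.

```julia
# Option A: manual management through libc. The Julia GC never sees this
# block, so it must be released explicitly with `Libc.free`.
function sum_with_malloc(n::Int)
    p = Ptr{Int}(Libc.malloc(n * sizeof(Int)))
    try
        for i in 1:n
            unsafe_store!(p, i, i)      # write i into slot i
        end
        s = 0
        for i in 1:n
            s += unsafe_load(p, i)      # read the values back
        end
        return s
    finally
        Libc.free(p)
    end
end

# Option B: GC-managed byte storage. The raw pointer is only valid while
# `storage` is kept alive, hence the `GC.@preserve`.
# ASSUMPTION: Vector{UInt8} stands in for Memory{UInt8} here.
function sum_with_gc_storage(n::Int)
    storage = Vector{UInt8}(undef, n * sizeof(Int))
    GC.@preserve storage begin
        p = Ptr{Int}(pointer(storage))
        for i in 1:n
            unsafe_store!(p, i, i)
        end
        s = 0
        for i in 1:n
            s += unsafe_load(p, i)
        end
        return s
    end
end
```

Both functions compute the same result; the difference is only in who owns the memory. Option B allocates through the Julia runtime, which is the point raised above about possibly triggering the GC.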
I updated the implementation a bit and did some preliminary testing.

Benchmark setup

    using Bumper
    using BenchmarkTools

    function f(x, buf, n_inner = 10)
        ctr = 0
        @no_escape buf begin
            for i_inner in 1:n_inner
                y = @alloc(Int, length(x))
                y .= x .+ 1
                ctr += sum(y)
            end
        end
        return ctr
    end

    function run_benchmark(n_inner = 100, sz = 10)
        return @benchmark f(x, buf, $n_inner) setup = begin
            x = rand(1:10, $sz)
            buf = ResizeBuffer(0)
        end seconds = 20 samples = 100_000 evals = 50
    end
Friendly reminder on this PR :) (No rush from my side, by the way; I'm just trying to cross things off my to-do list. I have a tendency to forget these kinds of side projects myself and typically appreciate the reminders, but I'm happy to stop bothering you if this is annoying rather than helpful.)
Friendly ping here again 😇
MasonProtter left a comment
Hey, really sorry for the delay. I meant to respond way earlier and then got distracted. This looks great! I think we need to add something to the README docs, but I can do that myself if you like.
This is a port of some work I did for TensorOperations.jl.
The idea is aimed at workflows that repeatedly perform the same task, where that task might allocate large objects but it is difficult to foresee a priori how much memory is needed. For such workflows, this approach attempts to strike a balance between user-friendliness, performance, and number of allocations.
The approach is simple: we keep a buffer alive like `AllocBuffer`, but instead of erroring whenever we oversubscribe the buffer, we manually allocate the extra objects (more like the `SlabBuffer`). Additionally, we keep a counter that tracks the maximal size that would have been reached had the buffer been sufficiently large; after the buffer is reset, the next allocation triggers a resize to account for that.
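As a rough illustration of the strategy described above, here is a toy model. It is a sketch under stated assumptions, not the implementation in this PR: the names `ToyResizeBuffer`, `toy_alloc!`, `toy_reset!` and `toy_free!` are hypothetical, alignment is ignored, and for brevity the resize happens eagerly inside the reset rather than lazily on the next allocation.

```julia
# Hypothetical toy model; field names mirror the diff, everything else is assumed.
mutable struct ToyResizeBuffer
    buf::Ptr{Cvoid}                # main malloc'ed block
    buf_len::UInt                  # capacity of the main block in bytes
    offset::UInt                   # bytes requested since the last reset
    max_offset::UInt               # high-water mark of `offset`
    overflow::Vector{Ptr{Cvoid}}   # individually malloc'ed overflow blocks
end

ToyResizeBuffer(n::Integer = 0) =
    ToyResizeBuffer(Libc.malloc(n), UInt(n), UInt(0), UInt(0), Ptr{Cvoid}[])

function toy_alloc!(b::ToyResizeBuffer, nbytes::Integer)
    start = b.offset
    b.offset += UInt(nbytes)
    # Track the size the buffer would have needed, even when we overflow.
    b.max_offset = max(b.max_offset, b.offset)
    if b.offset <= b.buf_len
        return b.buf + start       # plain bump allocation in the main block
    end
    p = Libc.malloc(nbytes)        # oversubscribed: allocate separately
    push!(b.overflow, p)
    return p
end

function toy_reset!(b::ToyResizeBuffer)
    foreach(Libc.free, b.overflow) # release all overflow blocks
    empty!(b.overflow)
    b.offset = UInt(0)
    if b.max_offset > b.buf_len    # grow so the same workload fits next time
        Libc.free(b.buf)
        b.buf = Libc.malloc(b.max_offset)
        b.buf_len = b.max_offset
    end
    return b
end

function toy_free!(b::ToyResizeBuffer)
    toy_reset!(b)
    Libc.free(b.buf)
    b.buf = C_NULL
    b.buf_len = UInt(0)
    return nothing
end

# Usage: the first pass overflows, the second pass fits in the main block.
b = ToyResizeBuffer(16)
toy_alloc!(b, 8)
toy_alloc!(b, 16)                  # 24 bytes requested in total: overflows
overflowed = length(b.overflow)
toy_reset!(b)                      # main block grows to the high-water mark
grown_len = b.buf_len
p = toy_alloc!(b, 24)
fits_now = (p == b.buf) && isempty(b.overflow)
toy_free!(b)
```

The point of the high-water mark is that a workload repeated after a reset settles into a single main block and no longer needs any overflow allocations.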
Note that this still needs some work, but it could already do with a review.
Warning: the tests have been AI-generated, and I still have to find the time to review them more closely myself!
Additionally, I still have to look at the docs and make sure they are updated as well.