reintroduce vectorcall optimization with new PyCallArgs trait#4768
CodSpeed Performance Report
Merging #4768 will improve performance by 57.92%.
Looks like it worked! If we're roughly happy with this, I'll work on the docs.

It looks like we can even enable some of it on the limited API after 3.12: https://docs.python.org/3/c-api/call.html#c.PyObject_Vectorcall, but I don't think we have the FFI definitions yet.
davidhewitt
left a comment
Awesome! Always so satisfying to see such big perf wins. 😂
I actually have the FFI bindings in #4379 but I haven't completed splitting that PR up.
/// TODO
pub trait PyCallArgs<'py>: Sized + private::Sealed {
    #[doc(hidden)]
I think it's a good idea to have all the methods hidden and the type sealed; we can explain the situation with docs 👍.
Especially if in the long term we might have something like #4414 to solve keyword args (although much design experimentation still needed to make sure we're all happy with whatever we come up with).
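For readers unfamiliar with the sealing approach mentioned above, here is a minimal std-only sketch of the pattern. It is not PyO3's actual code: `CallArgs`, `arg_count` and `call` are hypothetical stand-ins chosen for illustration, and only the `private::Sealed` supertrait and the `#[doc(hidden)]` method mirror what the diff shows.

```rust
// Hypothetical illustration of the sealed-trait pattern; not PyO3's implementation.
mod private {
    // The trait is `pub`, but the module is private, so code outside this
    // crate cannot name `Sealed` and therefore cannot implement `CallArgs`.
    pub trait Sealed {}

    impl Sealed for () {}
    impl<A> Sealed for (A,) {}
    impl<A, B> Sealed for (A, B) {}
}

/// Types accepted as positional call arguments (illustrative only).
pub trait CallArgs: Sized + private::Sealed {
    // Hidden from rustdoc, so it is not presented as public API.
    #[doc(hidden)]
    fn arg_count(&self) -> usize;
}

impl CallArgs for () {
    fn arg_count(&self) -> usize {
        0
    }
}

impl<A> CallArgs for (A,) {
    fn arg_count(&self) -> usize {
        1
    }
}

impl<A, B> CallArgs for (A, B) {
    fn arg_count(&self) -> usize {
        2
    }
}

/// Stand-in for a `call`-style method: only the sealed argument types compile.
pub fn call<A: CallArgs>(args: A) -> usize {
    args.arg_count()
}
```

With this shape, `call(())` and `call((1, "a"))` compile, while a downstream type cannot opt in by implementing the trait itself. That keeps the hidden methods free to change later.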
I took a first pass on the docs, let me know what you think.
Force-pushed: 86bbe39 to 0b4a277
davidhewitt
left a comment
I think this looks super, thanks. Sorry for the long delays, I am beginning to pick off the long backlog now 🫣
No worries 😄, and happy holidays🎄

Thanks, happy holidays to you too! 🎉 🎄
This reintroduces the `vectorcall` optimization we removed temporarily in #4653, using a new `PyCallArgs` trait as was discussed during the initial implementation in #4456.

This slightly reduces the number of types that can be passed to `PyAnyMethods::call` and friends. With this, only Rust tuples (including unit), `Bound<PyTuple>` and `Py<PyTuple>` are allowed (instead of any type that can be converted into a `PyTuple` via `IntoPyObject`, as before). Inside `pyo3` there is no such type, but there could be user types (for example `#[derive(IntoPyObject)]` tuple structs) that won't work anymore. The workaround is to simply convert at the call site, so I don't think that's a huge blocker.

There are still possibilities left open with regard to `kwargs`. One idea could be a `PyCallArgs` type with a builder-like API to set both positional and keyword args, which can then be transformed as needed depending on the calling convention (maybe with something like `smallvec` or even `heapless::Vec` to avoid lots of small allocations). I might explore that as a followup.

Closes #4656
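To make the builder idea from the description a bit more concrete, here is a rough std-only sketch. Everything in it is hypothetical: `CallArgsBuilder`, `arg`, `kwarg`, `into_vectorcall` and `into_tuple_and_dict` are invented names, `String` stands in for Python objects, and plain `Vec` is used where the description suggests `smallvec` or `heapless::Vec`. The vectorcall layout assumed here is the CPython one: a single argument array holding positional values followed by keyword values, plus a separate list of keyword names.

```rust
// Hypothetical builder sketch; none of these names exist in PyO3.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CallArgsBuilder {
    // `String` is a placeholder for a Python object reference.
    positional: Vec<String>,
    keyword: Vec<(String, String)>,
}

impl CallArgsBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append one positional argument.
    pub fn arg(mut self, value: impl Into<String>) -> Self {
        self.positional.push(value.into());
        self
    }

    /// Append one keyword argument.
    pub fn kwarg(mut self, name: impl Into<String>, value: impl Into<String>) -> Self {
        self.keyword.push((name.into(), value.into()));
        self
    }

    /// Vectorcall-style layout: positional values, then keyword values,
    /// in one array, with the keyword names returned separately.
    pub fn into_vectorcall(self) -> (Vec<String>, Vec<String>) {
        let mut values = self.positional;
        let mut names = Vec::with_capacity(self.keyword.len());
        for (name, value) in self.keyword {
            names.push(name);
            values.push(value);
        }
        (values, names)
    }

    /// Tuple-and-dict-style layout: positional and keyword args kept apart.
    pub fn into_tuple_and_dict(self) -> (Vec<String>, Vec<(String, String)>) {
        (self.positional, self.keyword)
    }
}
```

The point of the sketch is only that one collected set of arguments can be lowered to either calling convention at the last moment; how a real design would handle lifetimes, conversions and allocation is left open.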