Describe the enhancement requested
In my personal use, I found several improvements that would ease working with archery benchmark. With archery benchmark [run|diff] --preserve, the build and source folders are preserved into a folder like the following:
<TMP>/arrow-archery-xlzqaz4l/<GIT STR>/
- arrow/
- build/
A - Set the preserve directory
On macOS, the directory ends up being something like this:
/var/folders/9c/thrhbgqx2xb2xqvfk_2m6pgh0000gn/T/
which is impossible to find again without parsing the archery log.
On top of this, I sometimes want finer control over where the cache is stored, either to make it easier to inspect and use, or because the path structure does not fit some use cases (e.g. benchmarking the same commit with xsimd 14.1 and 14.2, or with different compiler options).
I propose adding an optional CLI argument --preserve-dir <PATH> to explicitly control where the preserve directories are stored (<TMP> in the above).
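To make the proposal concrete, here is a minimal sketch of how the directory could be chosen. It is not archery's actual code: the function name resolve_preserve_root is hypothetical, and the fallback only assumes that the current behaviour amounts to a temporary directory prefixed with arrow-archery-.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_preserve_root(preserve_dir: Optional[str] = None) -> Path:
    """Return the directory under which sources and builds are preserved.

    Hypothetical helper: `preserve_dir` stands for the value of the
    proposed --preserve-dir option (None when the option is not given).
    """
    if preserve_dir is None:
        # Fallback, assumed to mirror today's behaviour: a fresh temp dir.
        return Path(tempfile.mkdtemp(prefix="arrow-archery-"))

    # An explicit path: expand "~", make it absolute, create it if needed.
    root = Path(preserve_dir).expanduser().resolve()
    root.mkdir(parents=True, exist_ok=True)
    return root
```

With something like this, one directory per configuration (for example one for each xsimd version) can be chosen up front instead of being recovered from the log.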
B - Always preserve benchmark output with preserve option
When --preserve is set, I recommend we always store the benchmark timings JSON file in the preserve directory:
- It is relatively small compared to the size of the build directory, so we should be eager to save it in case it is needed later.
- It helps keep track of result files. Currently we have to think of an explicit name for --output. With a copy stored automatically next to the build, the file is associated with the commit name and the path of the preserve-dir, and we can recover the compilation context from the build directory (compiler flags used...).
This greatly reduces the cognitive load of having to choose a name and track which file corresponds to which settings, and it shortens the archery benchmark commands we type.
This would be independent from --output, which would still work as before.
There is also more information we should store, such as the invocation command.
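As an illustration, the extra artifacts could be written by a small helper like the one below. This is a sketch under assumptions: the function name and the file names benchmark.json and invocation.txt are made up for the example, and `results` stands for whatever JSON-serialisable object the benchmark run produces.

```python
import json
import shlex
import sys
from pathlib import Path
from typing import Any, Optional, Sequence


def save_run_artifacts(
    rev_dir: Path, results: Any, argv: Optional[Sequence[str]] = None
) -> Path:
    """Store the timings and the invocation command next to the build.

    Hypothetical helper: `rev_dir` is the preserved folder of one revision
    (the one containing arrow/ and build/).
    """
    rev_dir.mkdir(parents=True, exist_ok=True)
    argv = list(sys.argv if argv is None else argv)

    # Benchmark timings, written regardless of whether --output was given.
    timings = rev_dir / "benchmark.json"
    timings.write_text(json.dumps(results, indent=2))

    # The command line, shell-quoted so it can be pasted back and re-run.
    (rev_dir / "invocation.txt").write_text(shlex.join(argv) + "\n")
    return timings
```

Keeping the command as a quoted one-liner is a design choice for the sketch; a JSON file with the parsed options would work just as well.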
C - Resolve git string (breaking)
Right now, with archery benchmark run main the path created is:
<TMP>/arrow-archery-xlzqaz4l/main/
I suggest replacing it automatically with
<TMP>/arrow-archery-xlzqaz4l/<GIT SHA>/
At first glance, it will be slightly harder to tell from the folder name that main was intended. In practice, though, I believe keeping the name is more misleading than helpful. main is a moving target: even within a day of work its meaning can change, and we forget "which main" was benchmarked. This is even more error-prone with a feature branch, and with remote copies used for benchmarking on different platforms.
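Resolving the revision could look roughly like the sketch below, which asks git for the commit behind a branch name or tag using git rev-parse --verify. The function names and the `run` parameter are hypothetical; `run` exists only so the command runner can be swapped out, and it defaults to subprocess.run.

```python
import subprocess
from pathlib import Path


def resolve_revision(rev: str, repo: Path, run=subprocess.run) -> str:
    """Return the full commit SHA that `rev` (e.g. "main") points to."""
    completed = run(
        # "<rev>^{commit}" makes git peel tags down to the commit itself.
        ["git", "rev-parse", "--verify", f"{rev}^{{commit}}"],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout.strip()


def preserved_path(root: Path, rev: str, repo: Path, run=subprocess.run) -> Path:
    """Build <root>/<GIT SHA>/ instead of <root>/<GIT STR>/."""
    return root / resolve_revision(rev, repo, run=run)
```

The original string (main, a branch name, a tag) could still be recorded inside the folder, for instance alongside the invocation command, so that the intent is not lost.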
Component(s)
Archery