Commit 14e2ed7
Derive other indexes directly for binary fuse
We manipulate the math and use bit tricks to derive the other two
indexes more efficiently during peeling.
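A minimal sketch of the idea, not the code from this commit: it assumes the common binary fuse layout in which one multiply-high picks the first slot and the other two slots are XOR-perturbed within the next two segments. The `layout` type and the `allIndexes` and `otherIndexes` names are hypothetical. Because the segment length is a power of two and each XOR mask is smaller than it, the XOR can be undone on a known slot, so the other two slots follow from adds and XORs without repeating the multiply.

```go
package main

import (
	"fmt"
	"math/bits"
)

// layout holds hypothetical filter parameters for this sketch.
type layout struct {
	segmentLength      uint32 // must be a power of two
	segmentLengthMask  uint32 // segmentLength - 1
	segmentCountLength uint32 // segmentCount * segmentLength
}

// allIndexes computes the three slots of a hash the straightforward way
// (assumed layout, see the lead-in).
func (l layout) allIndexes(hash uint64) [3]uint32 {
	hi, _ := bits.Mul64(hash, uint64(l.segmentCountLength))
	h0 := uint32(hi)
	h1 := (h0 + l.segmentLength) ^ (uint32(hash>>18) & l.segmentLengthMask)
	h2 := (h0 + 2*l.segmentLength) ^ (uint32(hash) & l.segmentLengthMask)
	return [3]uint32{h0, h1, h2}
}

// otherIndexes derives the two remaining slots from one known slot.
// pos is the position (0, 1 or 2) of the known slot; the results are the
// slots at positions (pos+1)%3 and (pos+2)%3, in that order.
func (l layout) otherIndexes(hash uint64, known uint32, pos int) (uint32, uint32) {
	m1 := uint32(hash>>18) & l.segmentLengthMask
	m2 := uint32(hash) & l.segmentLengthMask
	switch pos {
	case 0:
		return (known + l.segmentLength) ^ m1, (known + 2*l.segmentLength) ^ m2
	case 1:
		base := known ^ m1 // undo the XOR: equals h0 + segmentLength
		return (base + l.segmentLength) ^ m2, base - l.segmentLength
	default:
		base := known ^ m2 // undo the XOR: equals h0 + 2*segmentLength
		return base - 2*l.segmentLength, (base - l.segmentLength) ^ m1
	}
}

func main() {
	l := layout{segmentLength: 1 << 10, segmentLengthMask: 1<<10 - 1, segmentCountLength: 8 << 10}
	hash := uint64(0x9E3779B97F4A7C15)
	all := l.allIndexes(hash)
	a, b := l.otherIndexes(hash, all[1], 1)
	fmt.Println(all, a, b) // a and b should equal all[2] and all[0]
}
```

During peeling the slot that became a singleton is already known, so only the `otherIndexes` path would be needed per key in this sketch.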
Apple M1:
```
name                                old MKeys/s  new MKeys/s  delta
BinaryFusePopulate/8/n=10000-10     43.8 ± 2%    50.3 ± 3%    +14.88%  (p=0.000 n=8+9)
BinaryFusePopulate/8/n=100000-10    38.6 ± 3%    41.3 ± 1%    +7.09%   (p=0.000 n=9+8)
BinaryFusePopulate/8/n=1000000-10   35.0 ± 4%    36.5 ± 7%    +4.12%   (p=0.013 n=9+10)
BinaryFusePopulate/16/n=10000-10    48.6 ± 4%    48.5 ± 6%    ~        (p=1.000 n=10+10)
BinaryFusePopulate/16/n=100000-10   38.0 ± 3%    41.1 ± 1%    +8.35%   (p=0.000 n=10+10)
BinaryFusePopulate/16/n=1000000-10  33.8 ± 5%    36.6 ± 2%    +8.14%   (p=0.000 n=10+10)
```
GCE N4D (AMD Turin):
```
name                                old MKeys/s  new MKeys/s  delta
BinaryFusePopulate/8/n=10000-8      53.2 ± 3%    57.1 ± 1%    +7.46%   (p=0.000 n=10+10)
BinaryFusePopulate/8/n=100000-8     33.0 ± 0%    37.5 ± 1%    +13.38%  (p=0.000 n=10+10)
BinaryFusePopulate/8/n=1000000-8    28.5 ± 2%    31.8 ± 2%    +11.59%  (p=0.000 n=10+10)
BinaryFusePopulate/16/n=10000-8     53.1 ± 1%    56.2 ± 1%    +5.93%   (p=0.000 n=10+10)
BinaryFusePopulate/16/n=100000-8    31.8 ± 1%    37.3 ± 1%    +17.35%  (p=0.000 n=10+10)
BinaryFusePopulate/16/n=1000000-8   27.5 ± 1%    30.9 ± 1%    +12.34%  (p=0.000 n=10+10)
```
GCE C4 (Intel Emerald Rapids, turbo boost capped at "all core" max):
```
name                                old MKeys/s  new MKeys/s  delta
BinaryFusePopulate/8/n=10000-8      29.2 ± 1%    32.2 ± 1%    +10.00%  (p=0.000 n=10+10)
BinaryFusePopulate/8/n=100000-8     27.0 ± 3%    29.8 ± 5%    +10.22%  (p=0.000 n=10+10)
BinaryFusePopulate/8/n=1000000-8    25.6 ± 3%    28.2 ± 5%    +10.27%  (p=0.000 n=10+10)
BinaryFusePopulate/16/n=10000-8     28.9 ± 1%    32.0 ± 1%    +10.84%  (p=0.000 n=10+10)
BinaryFusePopulate/16/n=100000-8    26.2 ± 1%    28.8 ± 3%    +10.05%  (p=0.000 n=10+10)
BinaryFusePopulate/16/n=1000000-8   24.8 ± 2%    26.9 ± 2%    +8.37%   (p=0.000 n=10+10)
```
GCE C4A (Google's Axion ARM64):
```
name                                old MKeys/s  new MKeys/s  delta
BinaryFusePopulate/8/n=10000-8      45.1 ± 1%    45.1 ± 1%    ~        (p=0.511 n=9+10)
BinaryFusePopulate/8/n=100000-8     39.8 ± 1%    39.4 ± 1%    -0.79%   (p=0.018 n=9+10)
BinaryFusePopulate/8/n=1000000-8    33.9 ± 3%    34.2 ± 3%    ~        (p=0.363 n=10+10)
BinaryFusePopulate/16/n=10000-8     44.0 ± 1%    44.7 ± 1%    +1.54%   (p=0.000 n=9+10)
BinaryFusePopulate/16/n=100000-8    37.4 ± 1%    38.4 ± 1%    +2.75%   (p=0.000 n=10+10)
BinaryFusePopulate/16/n=1000000-8   30.9 ± 5%    32.4 ± 1%    +4.84%   (p=0.000 n=10+10)
```
1 parent e8256d3 commit 14e2ed7
1 file changed
Lines changed: 57 additions & 22 deletions
| Original lines | New lines | Lines added | Lines removed |
|---|---|---|---|
| 60-65 | 60-74 | 9 | 0 |
| 79-88 | 88-93 | 0 | 4 |
| 194-199 | 199-207 | 3 | 0 |
| 204-232 | 212-274 | 44 | 10 |
| 255-260 | 297-303 | 1 | 0 |
| 297-310 | 340-345 | 0 | 8 |