Skip to content

savoirtech/java-24-vector-api

Repository files navigation

Performance testing Java 24’s Vector API.

Test Harness:

JavaVectorAPI-1: https://github.com/tomerr90/JavaVectorAPI-1.git

Test Design:

Using JDK 24 we will run all five test suites from JavaVectorAPI-1 on a different CPU vendor:

  • Intel Xeon E-2378

  • AMD Ryzen 9 9950x

  • Apple M4 Base

  • Qualcomm SnapDragon X Elite X1E80100

Test Results:

Intel Xeon E-2378

SimpleSum:

SimpleSum

SimpleSumNoSuperWord:

SimpleSumNoSuperWord

ComplexExpression:

ComplexExpression

ComplexExpressionNoSuperWord:

ComplexExpressionNoSuperWord

ArrayStats:

ArrayStats

AMD Ryzen 9 9950x

SimpleSum:

SimpleSum

SimpleSumNoSuperWord:

SimpleSumNoSuperWord

ComplexExpression:

ComplexExpression

ComplexExpressionNoSuperWord:

ComplexExpressionNoSuperWord

ArrayStats:

ArrayStats

Apple M4 Base

SimpleSum:

SimpleSum

SimpleSumNoSuperWord:

SimpleSumNoSuperWord

ComplexExpression:

ComplexExpression

ComplexExpressionNoSuperWord:

ComplexExpressionNoSuperWord

ArrayStats:

ArrayStats

Qualcomm SnapDragon X Elite X1E80100

SimpleSum:

SimpleSum

SimpleSumNoSuperWord:

SimpleSumNoSuperWord

ComplexExpression:

ComplexExpression

ComplexExpressionNoSuperWord:

ComplexExpressionNoSuperWord

ArrayStats:

ArrayStats

Test Analysis:

SimpleSum/SimpleSunNoSuperWord & ComplexExpression/ComplexExpressionNoSuperWord:

JIT is in most cases providing automatic vectorization matching find tuned implementation.

ArrayStats - Branchless code:

There are a few stories that fall out of this test suite.

x64:

There is a band of data set size that will elicit optimal boost from the Vector API.

Intel’s band is wider than AMD, however when AMD is processing optimal data set size, its gaining a very large boost.

ARM:

Both Qualcomm and Apple tend to have more boost as datasets increase in size. In Qualcomm’s case the larger boost starts at a smaller data size, however as data set size grows the Apple system achieves more boost.

Conclusion:

For these specific processors we see that the Ryzen CPU can generate the most performance boost using Vector API IFF the data set size is in its goldilocks zone. Both ARM CPUs appear to gain more boost from Vector APIs as data sets get larger.

About

Performance testing Java 24's Vector API

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors