Commit 87c70b8

Refactor CPU feature detection: split x86 and ARM into separate units with shared SIMD level types (#61)
## Summary

Splits the monolithic `TCpuFeatures` class into architecture-specific units (`HlpX86SimdFeatures`, `HlpArmSimdFeatures`) with shared SIMD level enums (`HlpSimdLevels`), and adds foundational ARM SIMD and target OS defines to the include files.

## Motivation

The `HlpCpuFeatures` unit exposed a single class that mixed x86-specific CPUID detection logic with the public dispatch API. This made it impossible to extend to ARM without further bloating the unit. The `TCpuSimdLevel` enum was also x86-specific (`SSE2`, `SSSE3`, `AVX2`) but lived in the shared namespace, leaving no room for ARM SIMD levels. All dispatch units (`HlpBlake2BDispatch`, `HlpSHA2_256Dispatch`, etc.) called `TCpuFeatures.GetActiveLevel()` and matched against `TCpuSimdLevel.*`, which are semantically x86 concepts that were presented as architecture-neutral.

## Changes

### New units

- **`HlpSimdLevels`**: defines `TX86SimdLevel` (Scalar, SSE2, SSSE3, AVX2) and `TArmSimdLevel` (Scalar, NEON, SVE, SVE2) as separate enums.
- **`HlpX86SimdFeatures`**: `TX86SimdFeatures` class containing all CPUID/XGETBV inline assembly, hardware probing, build-time override logic, and cached feature flags. Moved from `HlpCpuFeatures` with the addition of `HasAESNI()` detection (CPUID leaf 1, ECX bit 25).
- **`HlpArmSimdFeatures`**: `TArmSimdFeatures` class with stub detection methods for NEON, SVE, SVE2, and crypto extensions (AES, SHA1, SHA256, SHA512, SHA3, PMULL). All detection methods currently return `False` (marked `TODO`), providing the scaffolding for future ARM SIMD dispatch.

### Refactored `HlpCpuFeatures`

`TCpuFeatures` is now a thin facade with two class properties:

- `TCpuFeatures.X86` → returns `TX86SimdFeatures`
- `TCpuFeatures.Arm` → returns `TArmSimdFeatures`

All CPUID logic, class vars, and the `initialization` section have been removed from this unit. Detection now runs in each architecture-specific unit's own `initialization` block.
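
The facade described above can be sketched as a small standalone program. This is a minimal illustration under stated assumptions, not the code from this commit: it assumes the class properties return class references (`TX86SimdFeaturesClass` and `TArmSimdFeaturesClass` are hypothetical names), that `TArmSimdFeatures` has a `GetSimdLevel()` method, and it replaces real CPUID probing with a hard-coded sample value (`SampleLeaf1Ecx`) so the bit-25 test for AES-NI can be shown without inline assembly.

```pascal
program CpuFeaturesFacadeSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}
{$SCOPEDENUMS ON} // both enums contain "Scalar", so members must be qualified

type
  // Level names taken from the commit message.
  TX86SimdLevel = (Scalar, SSE2, SSSE3, AVX2);
  TArmSimdLevel = (Scalar, NEON, SVE, SVE2);

  TX86SimdFeatures = class
  public
    class function GetSimdLevel(): TX86SimdLevel;
    class function HasAESNI(): Boolean;
  end;

  TArmSimdFeatures = class
  public
    // Assumed method name; the commit only says detection is stubbed.
    class function GetSimdLevel(): TArmSimdLevel;
  end;

  // Hypothetical metaclass types used as the property types.
  TX86SimdFeaturesClass = class of TX86SimdFeatures;
  TArmSimdFeaturesClass = class of TArmSimdFeatures;

  // Thin facade: no detection logic, only forwarding properties.
  TCpuFeatures = class
  strict private
    class function GetX86(): TX86SimdFeaturesClass; static;
    class function GetArm(): TArmSimdFeaturesClass; static;
  public
    class property X86: TX86SimdFeaturesClass read GetX86;
    class property Arm: TArmSimdFeaturesClass read GetArm;
  end;

const
  // Hypothetical stand-in for the ECX value returned by CPUID leaf 1.
  SampleLeaf1Ecx = UInt32($02000000); // only bit 25 set

class function TX86SimdFeatures.GetSimdLevel(): TX86SimdLevel;
begin
  Result := TX86SimdLevel.Scalar; // placeholder; real code probes the CPU
end;

class function TX86SimdFeatures.HasAESNI(): Boolean;
begin
  // AES-NI is reported in CPUID leaf 1, ECX bit 25.
  Result := (SampleLeaf1Ecx and (UInt32(1) shl 25)) <> 0;
end;

class function TArmSimdFeatures.GetSimdLevel(): TArmSimdLevel;
begin
  Result := TArmSimdLevel.Scalar; // stub, mirroring the TODO state
end;

class function TCpuFeatures.GetX86(): TX86SimdFeaturesClass;
begin
  Result := TX86SimdFeatures;
end;

class function TCpuFeatures.GetArm(): TArmSimdFeaturesClass;
begin
  Result := TArmSimdFeatures;
end;

begin
  if TCpuFeatures.X86.GetSimdLevel() = TX86SimdLevel.Scalar then
    WriteLn('x86 level: Scalar');
  if TCpuFeatures.Arm.GetSimdLevel() = TArmSimdLevel.Scalar then
    WriteLn('ARM level: Scalar');
  if TCpuFeatures.X86.HasAESNI() then
    WriteLn('AES-NI bit set in sample ECX');
end.
```

Call sites keep the `TCpuFeatures.` prefix and only gain an architecture qualifier, which is the shape of the dispatch-unit changes in this commit.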
### Dispatch unit updates

All 12 dispatch units updated to use the new API:

- `TCpuFeatures.GetActiveLevel()` → `TCpuFeatures.X86.GetSimdLevel()`
- `TCpuSimdLevel.*` → `TX86SimdLevel.*`
- `TCpuFeatures.HasSHANI()` → `TCpuFeatures.X86.HasSHANI()`
- `TCpuFeatures.HasPCLMULQDQ()` → `TCpuFeatures.X86.HasPCLMULQDQ()`
- `TCpuFeatures.HasVPCLMULQDQ()` → `TCpuFeatures.X86.HasVPCLMULQDQ()`
- Added `HlpSimdLevels` to each dispatch unit's `uses` clause.

Affected dispatch units: Adler32, CRC, Blake2B, Blake2S, Blake3, SHA1, SHA2-256, SHA2-512, SHA3, XXHash3, Argon2, Scrypt.

### Include file additions

**`HashLib.inc` (Delphi)**:

- Added `HASHLIB_ARM` and `HASHLIB_AARCH64` CPU architecture defines.
- Added target OS defines: `HASHLIB_MSWINDOWS`, `HASHLIB_IOS`, `HASHLIB_MACOS`, `HASHLIB_ANDROID`, `HASHLIB_LINUX`.
- Added `HASHLIB_ARM_SIMD` composite define (mirrors existing `HASHLIB_X86_SIMD`).
- Added ARM force-dispatch options (`HASHLIB_FORCE_NEON`, `HASHLIB_FORCE_SVE`) with mutual exclusion compile-time check.

**`HashLibFPC.inc` (FPC)**:

- Added `HASHLIB_ARM` / `HASHLIB_ARM_ASM` and `HASHLIB_AARCH64` / `HASHLIB_AARCH64_ASM` defines.
- Added target OS defines: `HASHLIB_MSWINDOWS`, `HASHLIB_ANDROID`, `HASHLIB_IOS`, `HASHLIB_MACOS`, `HASHLIB_BSD`, `HASHLIB_LINUX`, `HASHLIB_SOLARIS`.
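
A mutual-exclusion check for the two ARM force-dispatch options might look roughly like the following. Only the define names `HASHLIB_FORCE_NEON` and `HASHLIB_FORCE_SVE` come from the commit message; the message text, the placement, and the surrounding program are assumptions made so the fragment compiles on its own.

```pascal
program ForceOptionCheckSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// Simulate a build that forces NEON only. Defining HASHLIB_FORCE_SVE
// as well should stop compilation at the check below.
{$DEFINE HASHLIB_FORCE_NEON}

// Hypothetical form of the compile-time check in the include file.
{$IF DEFINED(HASHLIB_FORCE_NEON) AND DEFINED(HASHLIB_FORCE_SVE)}
  {$MESSAGE FATAL 'HASHLIB_FORCE_NEON and HASHLIB_FORCE_SVE are mutually exclusive'}
{$IFEND}

begin
  {$IFDEF HASHLIB_FORCE_NEON}
  WriteLn('Forced level: NEON');
  {$ENDIF}
  {$IFDEF HASHLIB_FORCE_SVE}
  WriteLn('Forced level: SVE');
  {$ENDIF}
end.
```

Failing at compile time keeps a contradictory configuration from silently resolving to whichever option happens to be tested first.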
1 parent 55e10d5 commit 87c70b8

24 files changed

Lines changed: 731 additions & 302 deletions

HashLib.Benchmark/Delphi/PerformanceBenchmarkConsole.dpr

Lines changed: 3 additions & 0 deletions
@@ -125,6 +125,9 @@ uses
  HlpBitConverter in '..\..\HashLib\src\Utils\HlpBitConverter.pas',
  HlpBits in '..\..\HashLib\src\Utils\HlpBits.pas',
  HlpCpuFeatures in '..\..\HashLib\src\Utils\HlpCpuFeatures.pas',
+ HlpX86SimdFeatures in '..\..\HashLib\src\Utils\HlpX86SimdFeatures.pas',
+ HlpArmSimdFeatures in '..\..\HashLib\src\Utils\HlpArmSimdFeatures.pas',
+ HlpSimdLevels in '..\..\HashLib\src\Utils\HlpSimdLevels.pas',
  HlpHashLibTypes in '..\..\HashLib\src\Utils\HlpHashLibTypes.pas',
  HlpArrayUtils in '..\..\HashLib\src\Utils\HlpArrayUtils.pas';

HashLib.Benchmark/Delphi/PerformanceBenchmarkFMX.dpr

Lines changed: 3 additions & 0 deletions
@@ -124,6 +124,9 @@ uses
  HlpBitConverter in '..\..\HashLib\src\Utils\HlpBitConverter.pas',
  HlpBits in '..\..\HashLib\src\Utils\HlpBits.pas',
  HlpCpuFeatures in '..\..\HashLib\src\Utils\HlpCpuFeatures.pas',
+ HlpX86SimdFeatures in '..\..\HashLib\src\Utils\HlpX86SimdFeatures.pas',
+ HlpArmSimdFeatures in '..\..\HashLib\src\Utils\HlpArmSimdFeatures.pas',
+ HlpSimdLevels in '..\..\HashLib\src\Utils\HlpSimdLevels.pas',
  HlpHashLibTypes in '..\..\HashLib\src\Utils\HlpHashLibTypes.pas',
  HlpArrayUtils in '..\..\HashLib\src\Utils\HlpArrayUtils.pas';

HashLib.Tests/Delphi.Tests/HashLib.Tests.dpr

Lines changed: 3 additions & 0 deletions
@@ -143,6 +143,9 @@ uses
  HlpBitConverter in '..\..\HashLib\src\Utils\HlpBitConverter.pas',
  HlpBits in '..\..\HashLib\src\Utils\HlpBits.pas',
  HlpCpuFeatures in '..\..\HashLib\src\Utils\HlpCpuFeatures.pas',
+ HlpX86SimdFeatures in '..\..\HashLib\src\Utils\HlpX86SimdFeatures.pas',
+ HlpArmSimdFeatures in '..\..\HashLib\src\Utils\HlpArmSimdFeatures.pas',
+ HlpSimdLevels in '..\..\HashLib\src\Utils\HlpSimdLevels.pas',
  HlpHashLibTypes in '..\..\HashLib\src\Utils\HlpHashLibTypes.pas',
  HlpArrayUtils in '..\..\HashLib\src\Utils\HlpArrayUtils.pas',
  HashLibTestBase in '..\src\HashLibTestBase.pas',

HashLib/src/Checksum/HlpAdler32Dispatch.pas

Lines changed: 9 additions & 8 deletions
@@ -13,7 +13,8 @@ interface
  implementation

  uses
- HlpCpuFeatures;
+ HlpCpuFeatures,
+ HlpSimdLevels;

  const
  ModAdler = UInt32(65521);
@@ -188,28 +189,28 @@ procedure InitDispatch();
  begin
  Adler32_Update := @Adler32_Update_Scalar;
  {$IFDEF HASHLIB_I386_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.SSSE3:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.SSSE3:
  begin
  Adler32_Update := @Adler32_Update_Ssse3;
  end;
- TCpuSimdLevel.SSE2:
+ TX86SimdLevel.SSE2:
  begin
  Adler32_Update := @Adler32_Update_Sse2;
  end;
  end;
  {$ENDIF}
  {$IFDEF HASHLIB_X86_64_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.AVX2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.AVX2:
  begin
  Adler32_Update := @Adler32_Update_Avx2;
  end;
- TCpuSimdLevel.SSSE3:
+ TX86SimdLevel.SSSE3:
  begin
  Adler32_Update := @Adler32_Update_Ssse3;
  end;
- TCpuSimdLevel.SSE2:
+ TX86SimdLevel.SSE2:
  begin
  Adler32_Update := @Adler32_Update_Sse2;
  end;

HashLib/src/Checksum/HlpCRCDispatch.pas

Lines changed: 8 additions & 7 deletions
@@ -78,7 +78,8 @@ implementation

  uses
  HlpConverters,
- HlpCpuFeatures;
+ HlpCpuFeatures,
+ HlpSimdLevels;

  // =============================================================================
  // Scalar fallback implementation
@@ -494,15 +495,15 @@ procedure InitDispatch();
  CRC_Fold_UsesPclmul := False;

  {$IFDEF HASHLIB_X86_64_ASM}
- if TCpuFeatures.HasVPCLMULQDQ() then
+ if TCpuFeatures.X86.HasVPCLMULQDQ() then
  begin
  CRC_Fold_Lsb := @CRC_Fold_Vpclmul;
  CRC_Fold_Msb := @CRC_Fold_Vpclmul_Msb;
  CRC_Fold_Lsb32 := @CRC_Fold_Vpclmul;
  CRC_Fold_UsesPclmul := True;
  Exit;
  end;
- if TCpuFeatures.HasPCLMULQDQ() then
+ if TCpuFeatures.X86.HasPCLMULQDQ() then
  begin
  CRC_Fold_Lsb := @CRC_Fold_Pclmul;
  CRC_Fold_Msb := @CRC_Fold_Pclmul_Msb;
@@ -514,14 +515,14 @@ procedure InitDispatch();

  {$IFDEF HASHLIB_X86_SIMD}
  {$IFDEF HASHLIB_I386_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.SSSE3, TCpuSimdLevel.SSE2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.SSSE3, TX86SimdLevel.SSE2:
  BindSse2CrcFold;
  end;
  {$ENDIF HASHLIB_I386_ASM}
  {$IFDEF HASHLIB_X86_64_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.AVX2, TCpuSimdLevel.SSSE3, TCpuSimdLevel.SSE2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.AVX2, TX86SimdLevel.SSSE3, TX86SimdLevel.SSE2:
  BindSse2CrcFold;
  end;
  {$ENDIF HASHLIB_X86_64_ASM}

HashLib/src/Crypto/HlpBlake2BDispatch.pas

Lines changed: 7 additions & 6 deletions
@@ -22,7 +22,8 @@ implementation

  uses
  HlpBits,
- HlpCpuFeatures;
+ HlpCpuFeatures,
+ HlpSimdLevels;

  const
  Blake2BSigma: array [0 .. 11, 0 .. 15] of Int32 = (
@@ -131,20 +132,20 @@ procedure InitDispatch();
  begin
  Blake2B_Compress := @Blake2B_Compress_Scalar;
  {$IFDEF HASHLIB_I386_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.SSE2, TCpuSimdLevel.SSSE3:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.SSE2, TX86SimdLevel.SSSE3:
  begin
  Blake2B_Compress := @Blake2B_Compress_Sse2;
  end;
  end;
  {$ENDIF}
  {$IFDEF HASHLIB_X86_64_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.AVX2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.AVX2:
  begin
  Blake2B_Compress := @Blake2B_Compress_Avx2;
  end;
- TCpuSimdLevel.SSE2, TCpuSimdLevel.SSSE3:
+ TX86SimdLevel.SSE2, TX86SimdLevel.SSSE3:
  begin
  Blake2B_Compress := @Blake2B_Compress_Sse2;
  end;

HashLib/src/Crypto/HlpBlake2SDispatch.pas

Lines changed: 7 additions & 6 deletions
@@ -22,7 +22,8 @@ implementation

  uses
  HlpBits,
- HlpCpuFeatures;
+ HlpCpuFeatures,
+ HlpSimdLevels;

  const
  Blake2SSigma: array [0 .. 9, 0 .. 15] of Int32 = (
@@ -129,20 +130,20 @@ procedure InitDispatch();
  begin
  Blake2S_Compress := @Blake2S_Compress_Scalar;
  {$IFDEF HASHLIB_I386_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.SSE2, TCpuSimdLevel.SSSE3:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.SSE2, TX86SimdLevel.SSSE3:
  begin
  Blake2S_Compress := @Blake2S_Compress_Sse2;
  end;
  end;
  {$ENDIF}
  {$IFDEF HASHLIB_X86_64_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.AVX2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.AVX2:
  begin
  Blake2S_Compress := @Blake2S_Compress_Avx2;
  end;
- TCpuSimdLevel.SSE2, TCpuSimdLevel.SSSE3:
+ TX86SimdLevel.SSE2, TX86SimdLevel.SSSE3:
  begin
  Blake2S_Compress := @Blake2S_Compress_Sse2;
  end;

HashLib/src/Crypto/HlpBlake3Dispatch.pas

Lines changed: 7 additions & 6 deletions
@@ -26,7 +26,8 @@ implementation

  uses
  HlpBits,
- HlpCpuFeatures;
+ HlpCpuFeatures,
+ HlpSimdLevels;

  const
  Blake3IV: array [0 .. 3] of UInt32 = (
@@ -712,8 +713,8 @@ procedure InitDispatch();
  Blake3_HashMany := @Blake3_HashMany_Scalar;
  Blake3_ParallelDegree := 1;
  {$IFDEF HASHLIB_I386_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.SSE2, TCpuSimdLevel.SSSE3:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.SSE2, TX86SimdLevel.SSSE3:
  begin
  Blake3_Compress := @Blake3_Compress_Sse2;
  Blake3_HashMany := @Blake3_HashMany_Sse2;
@@ -722,14 +723,14 @@ procedure InitDispatch();
  end;
  {$ENDIF}
  {$IFDEF HASHLIB_X86_64_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.AVX2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.AVX2:
  begin
  Blake3_Compress := @Blake3_Compress_Avx2;
  Blake3_HashMany := @Blake3_HashMany_Avx2;
  Blake3_ParallelDegree := 8;
  end;
- TCpuSimdLevel.SSE2, TCpuSimdLevel.SSSE3:
+ TX86SimdLevel.SSE2, TX86SimdLevel.SSSE3:
  begin
  Blake3_Compress := @Blake3_Compress_Sse2;
  Blake3_HashMany := @Blake3_HashMany_Sse2;

HashLib/src/Crypto/HlpSHA1Dispatch.pas

Lines changed: 10 additions & 9 deletions
@@ -27,7 +27,8 @@ implementation
  uses
  HlpBits,
  HlpConverters,
- HlpCpuFeatures;
+ HlpCpuFeatures,
+ HlpSimdLevels;

  // =============================================================================
  // Scalar fallback implementation
@@ -175,33 +176,33 @@ procedure InitDispatch();
  begin
  SHA1_Compress := @SHA1_Compress_Scalar;
  {$IFDEF HASHLIB_I386_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.SSSE3:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.SSSE3:
  begin
  SHA1_Compress := @SHA1_Compress_Ssse3_Wrap;
  end;
- TCpuSimdLevel.SSE2:
+ TX86SimdLevel.SSE2:
  begin
  SHA1_Compress := @SHA1_Compress_Sse2;
  end;
  end;
  {$ENDIF}
  {$IFDEF HASHLIB_X86_64_ASM}
- if TCpuFeatures.HasSHANI() then
+ if TCpuFeatures.X86.HasSHANI() then
  begin
  SHA1_Compress := @SHA1_Compress_ShaNi_Wrap;
  Exit;
  end;
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.AVX2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.AVX2:
  begin
  SHA1_Compress := @SHA1_Compress_Avx2_Wrap;
  end;
- TCpuSimdLevel.SSSE3:
+ TX86SimdLevel.SSSE3:
  begin
  SHA1_Compress := @SHA1_Compress_Ssse3_Wrap;
  end;
- TCpuSimdLevel.SSE2:
+ TX86SimdLevel.SSE2:
  begin
  SHA1_Compress := @SHA1_Compress_Sse2;
  end;

HashLib/src/Crypto/HlpSHA2_256Dispatch.pas

Lines changed: 10 additions & 9 deletions
@@ -39,7 +39,8 @@ implementation
  uses
  HlpBits,
  HlpConverters,
- HlpCpuFeatures;
+ HlpCpuFeatures,
+ HlpSimdLevels;

  // =============================================================================
  // Scalar fallback implementation
@@ -185,33 +186,33 @@ procedure InitDispatch();
  begin
  SHA256_Compress := @SHA256_Compress_Scalar;
  {$IFDEF HASHLIB_I386_ASM}
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.SSSE3:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.SSSE3:
  begin
  SHA256_Compress := @SHA256_Compress_Ssse3_Wrap;
  end;
- TCpuSimdLevel.SSE2:
+ TX86SimdLevel.SSE2:
  begin
  SHA256_Compress := @SHA256_Compress_Sse2_Wrap;
  end;
  end;
  {$ENDIF}
  {$IFDEF HASHLIB_X86_64_ASM}
- if TCpuFeatures.HasSHANI() then
+ if TCpuFeatures.X86.HasSHANI() then
  begin
  SHA256_Compress := @SHA256_Compress_ShaNi_Wrap;
  Exit;
  end;
- case TCpuFeatures.GetActiveLevel() of
- TCpuSimdLevel.AVX2:
+ case TCpuFeatures.X86.GetSimdLevel() of
+ TX86SimdLevel.AVX2:
  begin
  SHA256_Compress := @SHA256_Compress_Avx2_Wrap;
  end;
- TCpuSimdLevel.SSSE3:
+ TX86SimdLevel.SSSE3:
  begin
  SHA256_Compress := @SHA256_Compress_Ssse3_Wrap;
  end;
- TCpuSimdLevel.SSE2:
+ TX86SimdLevel.SSE2:
  begin
  SHA256_Compress := @SHA256_Compress_Sse2_Wrap;
  end;
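
The `InitDispatch` hunks in this commit all follow one pattern: bind the scalar routine first, then let a `case` on the detected level override it. A minimal standalone sketch of that pattern follows. It is illustrative only: the kernel functions and the `TKernelName` type are hypothetical, and the level is passed in as a parameter instead of being read from `TCpuFeatures.X86.GetSimdLevel()` so that the program runs on its own.

```pascal
program DispatchSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}
{$SCOPEDENUMS ON}

type
  TX86SimdLevel = (Scalar, SSE2, SSSE3, AVX2);
  // Hypothetical kernel signature; the real units bind assembly routines.
  TKernelName = function: string;

function Kernel_Scalar: string;
begin
  Result := 'scalar';
end;

function Kernel_Sse2: string;
begin
  Result := 'sse2';
end;

function Kernel_Avx2: string;
begin
  Result := 'avx2';
end;

var
  Kernel: TKernelName;

// Assumption: the level is a parameter here; in the commit it comes from
// TCpuFeatures.X86.GetSimdLevel().
procedure InitDispatch(ALevel: TX86SimdLevel);
begin
  // Always start from the portable fallback.
  Kernel := @Kernel_Scalar;
  case ALevel of
    TX86SimdLevel.AVX2:
      Kernel := @Kernel_Avx2;
    // No dedicated SSSE3 kernel in this sketch, so SSSE3 reuses SSE2.
    TX86SimdLevel.SSE2, TX86SimdLevel.SSSE3:
      Kernel := @Kernel_Sse2;
  end;
end;

begin
  InitDispatch(TX86SimdLevel.SSSE3);
  WriteLn(Kernel()); // prints "sse2"
  InitDispatch(TX86SimdLevel.Scalar);
  WriteLn(Kernel()); // prints "scalar"
end.
```

Because the scalar binding happens before the `case`, a level with no matching branch leaves the fallback in place, so the pointer is never left unassigned.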
