Skip to content

[BugFix][LLVM] Fix insertDeclare API mismatch for ROCm-bundled LLVM 20#19390

Draft
swjng wants to merge 1 commit intoapache:mainfrom
swjng:fix/llvm20-insertdeclare-api
Draft

[BugFix][LLVM] Fix insertDeclare API mismatch for ROCm-bundled LLVM 20#19390
swjng wants to merge 1 commit intoapache:mainfrom
swjng:fix/llvm20-insertdeclare-api

Conversation

@swjng
Copy link
Copy Markdown
Contributor

@swjng swjng commented Apr 11, 2026

Problem

Closes #18709.

ROCm ships its own LLVM 20 build that still uses the pre-LLVM-20 Instruction* overload of DIBuilder::insertDeclare, while upstream LLVM 20 switched to BasicBlock::iterator. The previous guard #if TVM_LLVM_VERSION >= 200 in codegen_llvm.cc broke ROCm users because their LLVM reports version 20 but lacks the new overload, causing a compile error like:

```
error: no matching function for call to
'llvm::DIBuilder::insertDeclare(..., llvm::BasicBlock::iterator)'
note: candidate: insertDeclare(..., llvm::Instruction*)
```

Fix

Replace the version-number guard with a CMake check_cxx_source_compiles feature probe in cmake/utils/FindLLVM.cmake. The probe compiles a small test program that calls insertDeclare with BasicBlock::iterator. The result is exposed as the preprocessor macro TVM_LLVM_INSERTDECLARE_USES_ITERATOR.

cmake_push_check_state/cmake_pop_check_state is used to avoid leaking CMAKE_REQUIRED_* variables into the rest of the build.

Three call sites in src/target/llvm/codegen_llvm.cc are updated to use #if TVM_LLVM_INSERTDECLARE_USES_ITERATOR instead of #if TVM_LLVM_VERSION >= 200.

Behavior by configuration:

LLVM build Probe result Macro set Code path
Upstream LLVM >= 20 with new overload Compiles Yes BasicBlock::iterator
ROCm LLVM 20 (old overload) Fails No Instruction*
Any LLVM < 20 Probe skipped No Instruction*

Testing

Tested locally with LLVM 20.1.8 (Ubuntu package). The probe correctly detects Failed (uses Instruction* path), libtvm.so builds cleanly, and all 39 LLVM codegen tests pass:

```
tests/python/codegen/test_target_codegen_llvm.py 39 passed
```

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several fixes and improvements to the TVM build and compilation pipelines. It adds CMake feature detection for LLVM 20's DIBuilder::insertDeclare to handle differences between upstream and ROCm-bundled versions. In the Relax VM build, it now prefers target-specific pipelines when using the "default" configuration. The compile_relax integration for MetaSchedule was refactored to ensure a correct pass ordering, including DLight fallback scheduling. Additionally, safety checks were added to TIRX passes to handle functions without defined targets. Feedback suggests improving the target-specific pass detection in the MetaSchedule integration to support backends beyond CUDA, avoiding manual duplication of pass lists, and ensuring CMake variable state is preserved during feature detection.

Comment on lines +430 to +438
try:
from tvm.relax.backend.cuda.pipeline import ( # pylint: disable=import-outside-toplevel
library_dispatch_passes,
dataflow_lower_passes,
finalize_passes,
)
backend_specific = True
except ImportError:
backend_specific = False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The logic for detecting backend_specific passes is currently tied to the ability to import CUDA-specific pipeline modules. This will fail to detect backend-specific passes for other targets like ROCm, which is the primary focus of this PR. Instead of manual imports and try-except blocks, you should leverage the target-aware helper functions already available in tvm.relax.pipeline (e.g., library_dispatch_passes, dataflow_lower_passes, and finalize_passes). These helpers internally handle the dispatching to CUDA, ROCm, and other backends based on the provided target.

Comment on lines +246 to +258
set(CMAKE_REQUIRED_INCLUDES ${LLVM_INCLUDE_DIRS})
set(CMAKE_REQUIRED_FLAGS "-std=c++17 -fno-rtti")
check_cxx_source_compiles("
#include <llvm/IR/DIBuilder.h>
#include <llvm/IR/Instructions.h>
void f(llvm::DIBuilder* b, llvm::Value* v, llvm::DILocalVariable* var,
llvm::DIExpression* e, const llvm::DILocation* dl,
llvm::Instruction* i) {
b->insertDeclare(v, var, e, dl, llvm::BasicBlock::iterator(i));
}
" TVM_LLVM_INSERTDECLARE_USES_ITERATOR)
unset(CMAKE_REQUIRED_INCLUDES)
unset(CMAKE_REQUIRED_FLAGS)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Setting and then unsetting CMAKE_REQUIRED_INCLUDES and CMAKE_REQUIRED_FLAGS can lead to side effects if these variables had values before the find_llvm macro was called. It is safer to save the original values and restore them afterwards to avoid affecting other parts of the build system.

    set(OLD_CMAKE_REQUIRED_INCLUDES ${CMAKE_REQUIRED_INCLUDES})
    set(OLD_CMAKE_REQUIRED_FLAGS ${CMAKE_REQUIRED_FLAGS})
    set(CMAKE_REQUIRED_INCLUDES ${LLVM_INCLUDE_DIRS})
    set(CMAKE_REQUIRED_FLAGS "-std=c++17 -fno-rtti")
    check_cxx_source_compiles("
      #include <llvm/IR/DIBuilder.h>
      #include <llvm/IR/Instructions.h>
      void f(llvm::DIBuilder* b, llvm::Value* v, llvm::DILocalVariable* var,
             llvm::DIExpression* e, const llvm::DILocation* dl,
             llvm::Instruction* i) {
        b->insertDeclare(v, var, e, dl, llvm::BasicBlock::iterator(i));
      }
    " TVM_LLVM_INSERTDECLARE_USES_ITERATOR)
    set(CMAKE_REQUIRED_INCLUDES ${OLD_CMAKE_REQUIRED_INCLUDES})
    set(CMAKE_REQUIRED_FLAGS ${OLD_CMAKE_REQUIRED_FLAGS})

Comment on lines +461 to +475
lower = (dataflow_lower_passes(target) if backend_specific else [
relax.transform.RewriteDataflowReshape(),
relax.transform.ToNonDataflow(),
relax.transform.RemovePurityChecking(),
relax.transform.CallTIRRewrite(),
])
finalize = (finalize_passes(target) if backend_specific else [
relax.transform.StaticPlanBlockMemory(),
relax.transform.LowerAllocTensor(),
relax.transform.KillAfterLastUse(),
relax.transform.LowerRuntimeBuiltin(),
relax.transform.ComputePrimValue(),
relax.transform.VMShapeLower(),
relax.transform.AttachGlobalSymbol(),
])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The fallback pass lists for lower and finalize are manually duplicated here. It would be cleaner to retrieve these from a central location (like tvm.relax.pipeline.default_build_pipeline) or define them as shared constants to ensure consistency across the codebase and reduce maintenance overhead.

@swjng swjng marked this pull request as draft April 11, 2026 14:58
ROCm ships its own LLVM 20 build that still uses the pre-LLVM-20
`Instruction*` overload of `DIBuilder::insertDeclare`, while upstream
LLVM 20 switched to `BasicBlock::iterator`. The previous version-number
guard `#if TVM_LLVM_VERSION >= 200` broke ROCm users on LLVM 20.

Replace the version guard with a CMake `check_cxx_source_compiles`
feature probe (`TVM_LLVM_INSERTDECLARE_USES_ITERATOR`) that compiles a
small test program to detect which overload is actually available.
Use `cmake_push_check_state`/`cmake_pop_check_state` to avoid leaking
CMake check variables. Three call sites in `codegen_llvm.cc` are
updated to use the new macro.

Fixes apache#18709
@swjng swjng force-pushed the fix/llvm20-insertdeclare-api branch from a7ceb05 to 70e8887 Compare April 11, 2026 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Compile the ROCM backend,compile failed

1 participant