joematrix77/UbuntuROCmSetup

Ubuntu 24.04 AMD AI Development Stack

Optimized for RDNA 4 (RX 9060 XT) & Ryzen 9000 Series

This repository contains a comprehensive deployment script to set up a high-performance AI training environment on Ubuntu. It specifically addresses the 2026 "Greedy Allocation" bugs and library mismatches found in the ROCm 7.2.0 stack.

🚀 Key Features (2026 Edition)

Architecture Support: Native targeting for gfx1200 (Navi 44 / RX 9060 XT).

Kernel Optimization: Automated pathing for Linux Kernel 6.18.7+ (required for RDNA 4 VRAM mapping).

Memory Fixes: Pre-configured PYTORCH_ALLOC_CONF to prevent 15GB pre-allocation crashes on 16GB cards.

Library Alignment: Hard-links system ROCm 7.2.0 libraries to PyTorch Nightly binaries to resolve versioning conflicts.

📋 Prerequisites

Hardware: AMD Radeon RX 9000 Series GPU (16GB+ VRAM recommended).

OS: Ubuntu 24.04 LTS or 26.04.

BIOS: Secure Boot MUST be disabled (to allow the unsigned 6.18 mainline kernel to boot).
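The kernel requirement can be checked before running the script. The following is a minimal sketch, not part of the repository: the helper names `parse_kernel_release` and `kernel_is_new_enough` are hypothetical, and the 6.18.7 threshold is taken from the Key Features list above.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor, patch) from a string such as '6.18.7-061807-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    # A release like '6.19-rc1' has no patch component; treat it as 0.
    return int(major), int(minor), int(patch or 0)


def kernel_is_new_enough(release: str, minimum=(6, 18, 7)) -> bool:
    """Return True if the release is at or above the minimum version tuple."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # e.g. the output of `uname -r` on Linux
    status = "OK" if kernel_is_new_enough(release) else "older than 6.18.7"
    print(f"Running kernel {release}: {status}")
```

Comparing version tuples rather than raw strings avoids the trap where "6.8" sorts after "6.18" alphabetically.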

🛠️ Installation

1. Clone the repository

git clone https://github.com
cd UbuntuROCmSetup

2. Make the script executable

chmod +x UbuntuROCm9060XTSetup.sh

3. Run the deployment

./UbuntuROCm9060XTSetup.sh

🧪 Post-Installation Benchmark

Once the script finishes and you have rebooted into the new kernel, verify your speed using the included matrix multiplication test:

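source ~/ai_env/bin/activate
python3 -c "
import time, torch
size = 10000
a = torch.randn(size, size, device='cuda', dtype=torch.float16)
b = torch.randn(size, size, device='cuda', dtype=torch.float16)
torch.cuda.synchronize()  # make sure setup work is finished before timing
start = time.time()
for _ in range(100):
    torch.matmul(a, b)  # result is discarded so 100 outputs never sit in VRAM at once
torch.cuda.synchronize()  # wait for all queued kernels before stopping the clock
print(f'Done! Took: {time.time()-start:.2f}s')
"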

Expected Result on RX 9060 XT: < 1.5 seconds.
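To compare runs made with different sizes or iteration counts, the elapsed time can be converted into throughput. This is a small sketch using the standard count of 2·n³ floating-point operations for an n×n matrix multiplication; the function name `matmul_tflops` is hypothetical and not part of the repository.

```python
def matmul_tflops(size: int, iterations: int, seconds: float) -> float:
    """Achieved throughput in TFLOPS for square matmuls of the given size."""
    flops_per_matmul = 2 * size ** 3  # one multiply and one add per inner-product term
    return flops_per_matmul * iterations / seconds / 1e12


# The benchmark above: 100 multiplications of 10000x10000 matrices in 1.5 s.
print(f"{matmul_tflops(10000, 100, 1.5):.1f} TFLOPS")  # prints "133.3 TFLOPS"
```

Plugging in your own measured time gives a figure that can be compared across cards and driver versions.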

🔧 Critical Environment Variables

The script automatically injects these into your ai_env/bin/activate file:

| Variable | Value | Reason |
| --- | --- | --- |
| HSA_OVERRIDE_GFX_VERSION | 12.0.0 | Enables RDNA 4 instruction set |
| PYTORCH_ALLOC_CONF | max_split_size_mb:32 | Bypasses 15GB VRAM reservation bug |
| LD_LIBRARY_PATH | /opt/rocm-7.2.0/lib | Links PyTorch to System ROCm 7.2 |
| TORCH_ROCM_AOTRITON | 1 | Enables experimental Flash Attention |
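After activating the environment, it can be useful to confirm that the variables actually took effect. The sketch below is an illustration rather than something the script ships: `check_environment` is a hypothetical helper, and the expected values are copied from the list above. It assumes LD_LIBRARY_PATH may contain other directories alongside the ROCm one.

```python
import os

# Expected values as listed in this README.
EXPECTED = {
    "HSA_OVERRIDE_GFX_VERSION": "12.0.0",
    "PYTORCH_ALLOC_CONF": "max_split_size_mb:32",
    "TORCH_ROCM_AOTRITON": "1",
}
ROCM_LIB_DIR = "/opt/rocm-7.2.0/lib"


def check_environment(env) -> list:
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    for name, expected in EXPECTED.items():
        actual = env.get(name)
        if actual != expected:
            problems.append(f"{name}: expected {expected!r}, found {actual!r}")
    # LD_LIBRARY_PATH is a colon-separated list, so test for membership.
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(":")
    if ROCM_LIB_DIR not in lib_dirs:
        problems.append(f"LD_LIBRARY_PATH does not contain {ROCM_LIB_DIR}")
    return problems


if __name__ == "__main__":
    issues = check_environment(os.environ)
    for line in issues or ["All four variables are set as expected."]:
        print(line)
```

Run it inside the activated ai_env; any line it prints other than the success message points at a variable that was not injected or was overridden later in the shell session.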

⚠️ Troubleshooting 2026 Issues

HIP Error: Out of Memory: This is usually a "zombie" process from a previous crash. Run echo 1 | sudo tee /sys/kernel/debug/dri/0/amdgpu_gpu_recover to reset the VRAM map.

SCLK 0 MHz: The GPU is in deep sleep. The script includes a sudo amd-smi set --perf-level high command to wake the card before compute tasks.
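To see whether the card is reporting a 0 MHz shader clock, the current level can be read from sysfs. The sketch below rests on assumptions not stated in this README: that the amdgpu driver exposes a pp_dpm_sclk file under /sys/class/drm/card*/device/, and that each line looks like `1: 2400Mhz *` with an asterisk marking the active level. The helper name `active_sclk_mhz` is hypothetical.

```python
import glob
import re
from typing import Optional

# Assumed line format: "<index>: <clock>Mhz" with a trailing "*" on the active level.
LEVEL_PATTERN = re.compile(r"\s*\d+:\s*(\d+)\s*MHz\s*(\*)?\s*$", re.IGNORECASE)


def active_sclk_mhz(text: str) -> Optional[int]:
    """Return the clock (MHz) of the level marked active, or None if none is marked."""
    for line in text.splitlines():
        match = LEVEL_PATTERN.match(line)
        if match and match.group(2):
            return int(match.group(1))
    return None


if __name__ == "__main__":
    # Assumed sysfs location; adjust if your system exposes it elsewhere.
    for path in sorted(glob.glob("/sys/class/drm/card*/device/pp_dpm_sclk")):
        with open(path) as handle:
            clock = active_sclk_mhz(handle.read())
        print(f"{path}: active SCLK = {clock} MHz")
```

A reported value of 0 corresponds to the deep-sleep case described above.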

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
