As of RHEL 9 (and with growing support in RHEL 8.10+), Red Hat provides official capabilities for AI/ML workloads, including support for GPU acceleration via ROCm (for AMD GPUs) and NVIDIA CUDA (for NVIDIA GPUs). However, the level of support, installation methods, and available tooling differ significantly between the two ecosystems.
Below is a structured overview of AI-related installation options on RHEL, including ROCm, CUDA, and complementary AI/ML frameworks.
1. NVIDIA CUDA on RHEL
Official Support
- Fully supported by Red Hat in partnership with NVIDIA.
- Available via RHEL NVIDIA GPU drivers and CUDA repositories.
- Certified for use with OpenShift AI, Red Hat Enterprise Linux AI (RHEL AI), and NVIDIA AI Enterprise.
Installation Options
Option A: Use RHEL NVIDIA Driver + CUDA from NVIDIA
- Enable the required RHEL repos.
- Install EPEL and kernel headers.
- Download and install the NVIDIA driver + CUDA Toolkit from NVIDIA’s site (select RHEL 9).
- Or use NVIDIA’s RPM repo.
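As a concrete sketch of Option A on RHEL 9 (repo IDs and package names vary by minor release; verify the repo URL and module stream against NVIDIA's current install guide):

```bash
# Enable CodeReady Builder (needed by EPEL) and install EPEL + kernel headers
sudo subscription-manager repos --enable codeready-builder-for-rhel-9-x86_64-rpms
sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
sudo dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc make

# Add NVIDIA's CUDA repo for RHEL 9, then install the DKMS driver flavor + toolkit
sudo dnf config-manager --add-repo \
    https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
sudo dnf module install -y nvidia-driver:latest-dkms
sudo dnf install -y cuda-toolkit

# Verify after a reboot (nvcc may require adding /usr/local/cuda/bin to PATH)
nvidia-smi
nvcc --version
```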
Option B: Use NVIDIA AI Enterprise (NVAIE) on RHEL
- Fully supported stack (drivers, CUDA, cuDNN, TensorRT, etc.).
- Requires NVIDIA enterprise license.
- Distributed via NGC Catalog and integrated with OpenShift AI.
Note: RHEL 9.2+ includes DKMS support, making NVIDIA driver updates more robust.
2. AMD ROCm on RHEL
Limited / Community-Level Support
- Not officially certified by Red Hat for RHEL (as of early 2026).
- AMD provides community-supported ROCm packages for RHEL 9, but not all RHEL versions or kernels are compatible.
- Best support on RHEL 9.2–9.4 with specific AMD GPUs (e.g., MI210, MI250, RX 7900 XT).
Installation Steps (Community Approach)
- Check GPU compatibility: only select AMD Instinct™ and Radeon Pro GPUs are supported; consumer cards (e.g., RX 6000/7000 series) have partial or no support.
- Enable the required repos.
- Add the ROCm repo (from AMD).
- Install ROCm.
- Reboot and verify.
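Sketching those steps as commands, assuming AMD's RHEL 9 repo layout (the `latest` path and the `rocm` meta-package name vary by release, so check AMD's ROCm install docs first):

```bash
# Add AMD's ROCm repository
sudo tee /etc/yum.repos.d/rocm.repo <<'EOF'
[ROCm]
name=ROCm
baseurl=https://repo.radeon.com/rocm/rhel9/latest/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF

# Install ROCm and grant the current user GPU device access
sudo dnf install -y rocm
sudo usermod -aG render,video "$USER"
sudo reboot

# After reboot, verify the runtime can see the GPU
rocminfo
rocm-smi
```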
Known Issues:
- Kernel updates may break ROCm (requires reinstall or DKMS setup).
- SELinux may block GPU access; a temporary workaround is `setenforce 0` (not recommended in production).
- No official Red Hat support; use at your own risk in enterprise environments.
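When a kernel update does break the driver, a DKMS-based recovery (and an SELinux check that avoids disabling enforcement outright) might look like this sketch; `ausearch` comes from the `audit` package:

```bash
# Check whether the amdgpu DKMS module built against the new kernel,
# and rebuild any registered modules that did not
dkms status
sudo dkms autoinstall

# Inspect SELinux denials before reaching for setenforce 0
getenforce
sudo ausearch -m avc -ts recent
```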
3. RHEL AI (Red Hat Enterprise Linux AI)
Introduced in 2024, RHEL AI is a subscription-based offering that includes:
- InstructLab – Open-source framework for local LLM training and tuning (based on IBM’s Granite models).
- Optimized AI toolchain (PyTorch, vLLM, llama.cpp, etc.).
- Integration with OpenShift AI for scalable MLOps.
- Support for both NVIDIA and (future) AMD accelerators.
How to Install RHEL AI:
- Subscribe to RHEL AI via Red Hat Customer Portal.
- Enable the AI repository: `sudo subscription-manager repos --enable=rhel-ai-for-x86_64-rpms`
- Install InstructLab: `sudo dnf install ilab`
- Initialize the config and download a model:
  - `ilab config init`
  - `ilab model download`
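Once a model is downloaded, it can be served and queried locally; the subcommands below follow upstream InstructLab's CLI and should be verified against the installed `ilab` version:

```bash
ilab model serve   # start a local inference server for the downloaded model
ilab model chat    # interactive chat session against the served model
```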
This is the recommended path for enterprise AI on RHEL, especially for NVIDIA-based systems.
4. Other AI/ML Tools on RHEL
| Tool/Framework | Availability on RHEL |
|---|---|
| PyTorch | Via PyPI, Conda, or RHEL AI repos |
| TensorFlow | Community builds; not in base RHEL repos |
| JupyterLab | Available in AppStream (dnf install jupyterlab) |
| OpenVINO | Intel’s toolkit; supported on RHEL |
| ONNX Runtime | Available via PyPI or Microsoft RPMs |
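As an example of the PyPI route in the table above, a CUDA-enabled PyTorch install can be sketched as follows (the `cu121` wheel index is one of PyTorch's standard CUDA indexes; pick the one matching your CUDA version):

```bash
python3 -m venv ~/ai-env
source ~/ai-env/bin/activate
pip install torch --index-url https://download.pytorch.org/whl/cu121
python -c "import torch; print(torch.cuda.is_available())"
```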
Recommendations by Use Case
| Use Case | Recommended Stack |
|---|---|
| Enterprise AI (production) | RHEL AI + NVIDIA GPU + OpenShift AI |
| Research / Dev (NVIDIA) | CUDA + PyTorch/TensorFlow from NVIDIA |
| AMD GPU experimentation | ROCm (community) + PyTorch (HIP) |
| CPU-only LLMs | RHEL AI + InstructLab (no GPU needed) |