r/MacPro2019LocalAI

Mac Pro 2019 Local AI Guide: Ubuntu 24.04, ROCm 7.2.3, PyTorch 2.10, and Infinity Fabric Link

I am very excited about the future of local AI. With the spread of AI agents, the amount of VRAM now achievable locally, the quality of small and medium LLMs, and the community growing around all of this, the future is looking very good.

I am writing this to document my successes with the following:

  • Mac Pro 2019
  • Ubuntu 24.04.4 LTS (Ubuntu Server specifically, in my case)
  • Dual AMD Radeon PRO W6900X with Infinity Fabric Link Bridge
  • Dual AMD Radeon PRO W6800X Duo with Infinity Fabric Link Bridge
  • ROCm 7.2.3
  • PyTorch 2.10
  • Triton 3.6
  • vLLM (Write up pending)
  • Hermes Agent (Research Pending)

I wrote a couple of old guides. Check them out for reference, as needed:

I'm going to focus on setting up Ubuntu and all the packages needed for the infrastructure of local AI.

Important: This is an experimental community guide. Some parts involve patched kernels, unsupported GPU configurations, and boot-level PCIe changes. This worked for my Mac Pro 2019 systems, but you should expect troubleshooting, and you should be comfortable recovering from a failed boot. I am not responsible for any outcome of using this guide, whether it be positive, negative, or anything in between.

1. Choices & Decisions

  • Mac Pro 2019: It's what I had available to me.
  • W6900X: It's what I had available to me.
  • W6800X Duo: It's what I had available to me.
  • Ubuntu LTS: The ROCm-supported OS family I am most comfortable with. Alternative: RHEL
  • Ubuntu 24.04 LTS: The latest Ubuntu LTS version supported by ROCm at the time of writing. Alternative: Ubuntu 22.04 LTS
  • Ubuntu Server: To avoid desktop overhead and keep the system headless. Alternative: Ubuntu Desktop LTS
  • Data Room: I placed the Macs in a Data Room, so I don't hear the loud fans. Alternative: Place it at your desk, or anywhere else.
  • DRM/AMDGPU: I opted to use the in-kernel GPU driver so that I can patch it to support the Infinity Fabric Link Bridge. Alternative: Install AMD's amdgpu driver through DKMS.
  • Kernel: Patched Ubuntu 6.17 HWE kernel, based on Ubuntu’s linux-hwe-6.17 source package, to support the Infinity Fabric Link Bridge. Alternative: Standard Ubuntu kernel.
  • ROCm: AMD’s alternative to CUDA. Alternative: Vulkan
  • ROCm 7.2.3: Latest ROCm that supports my GPUs at the time of writing. Alternative: Outdated ROCm.
  • vLLM: Concurrent utilization of loaded LLMs. Alternative: Ollama & Llama.cpp
  • Hermes Agent: More tool-savvy and self-learning. Alternative: OpenClaw
  • GitHub: All my files and commands have been uploaded to GitHub, to make this guide shorter than 40,000 characters. Alternative: Multiple Guides...

Please let me know if the GitHub links do not work.

These are the choices I made, and I am still refining them. They work for me. Keep in mind that this is all held together with the digital equivalent of duct tape. If you change anything, it may or may not work. If you do, I would genuinely appreciate hearing what you tried, what worked, what failed, and why you changed it.

2. Setting up Ubuntu after Installation

Step 00: Infinity Fabric Link (Jumper & Bridge)

Please remove the Infinity Fabric Link Jumper(s) or Bridge from the GPUs before you begin. Stock Ubuntu 24.04 kernels, up to and including 6.17, do not support it.

Specifically, with the link installed, none of the GPUs work on kernel 6.8, and only one GPU works after upgrading to 6.17.

If you have an Infinity Fabric Link Jumper or Bridge, follow the patch section later in the guide to make it work with your GPUs.

Step 01: Update, Upgrade, and Tweak the System

What we will do:

  • Change ubuntu.sources from http to https
  • Attach to Ubuntu Pro (This is optional, and requires interaction)
  • Update & Full-Upgrade
  • Upgrade to the latest HWE kernel
  • Remove cloud-init
  • Make all Ethernet ports accept DHCPv4 automatically
  • Modify GRUB to include the "loglevel=7 log_buf_len=16M iommu=pt" kernel flags
  • Reboot

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2001%3A%20Update%2C%20Upgrade%2C%20and%20Tweak%20the%20System" | bash
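
If you prefer to read a step's script before running it, here is a minimal sketch of a download-then-review flow. The helper name fetch_step, the STEP_URL variable, and the file name step01.sh are placeholders of mine and are not part of the repository. STEP_URL stands for the full URL shown in the step you are on.

```shell
#!/usr/bin/env bash
# Sketch: save a step script to disk instead of piping it straight into bash.
# fetch_step is a hypothetical helper name, not something the guide's repository provides.
fetch_step() {
    local url="$1" out="$2"
    # -f: fail on HTTP errors, -sS: quiet but still print errors, -L: follow redirects
    curl -fsSL "$url" -o "$out" || return 1
    # Print a checksum so the same file can be compared across machines or runs.
    sha256sum "$out"
}

# Usage (STEP_URL is a placeholder for the URL shown in the step):
#   fetch_step "$STEP_URL" step01.sh && less step01.sh
#   bash step01.sh
```

The same pattern works for every step in this guide, since each one is a single script fetched from GitHub.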

Step 02: Install T2 Linux Repository

Since we are using a Mac Pro 2019, which is a Mac with a T2 chip, some additional packages are required to communicate properly with the hardware.

What we will do:

  • Set up the T2 Ubuntu 24 (Noble) Repository
  • Install 3 Packages: applesmc-t2 apple-bce t2fanrd
  • Reboot

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2002%3A%20Install%20T2%20Linux%20Repository" | bash
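
After the reboot, a quick way to confirm the three packages landed is to ask dpkg. This is a minimal sketch, not taken from the repository; the helper names pkg_installed and report_packages are my own.

```shell
#!/usr/bin/env bash
# Sketch: report whether each T2 package is installed, using dpkg-query.
pkg_installed() {
    # dpkg-query prints "install ok installed" for a fully installed package.
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

report_packages() {
    local pkg
    for pkg in "$@"; do
        if pkg_installed "$pkg"; then
            echo "$pkg: installed"
        else
            echo "$pkg: MISSING"
        fi
    done
}

# Package names as listed in this step.
report_packages applesmc-t2 apple-bce t2fanrd
```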

Step 03: Enable T2 Fan Daemon

With the T2 packages installed, the command below activates the fan service.

What we will do:

  • Enable the t2fanrd systemd service

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2003%3A%20Enable%20T2%20Fan%20Daemon" | bash
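
To see whether the service actually came up, you can query systemd directly. A minimal sketch, assuming the unit is named t2fanrd as in the step above; the helper name unit_query is mine.

```shell
#!/usr/bin/env bash
# Sketch: ask systemd about a unit and fall back to "unknown" if systemctl gives no answer.
unit_query() {  # unit_query <is-active|is-enabled> <unit>
    local answer
    answer=$(systemctl "$1" "$2" 2>/dev/null) || true
    echo "${answer:-unknown}"
}

echo "t2fanrd active:  $(unit_query is-active t2fanrd)"
echo "t2fanrd enabled: $(unit_query is-enabled t2fanrd)"
```

On a working system I would expect "active" and "enabled". Anything else is a hint to look at journalctl -u t2fanrd.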

Step 03-Optional: Set Fans to Maximum

I do not trust Apple Cooling. I would rather wear out the fans and replace them for a few dollars than have the GPUs (especially the Duo models) damaged by overheating.

What we will do:

  • Set all 4 fans to maximum speed

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2003-Optional%3A%20Set%20Fans%20to%20Maximum" | bash
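
Whichever fan setting you choose, it helps to watch the GPU temperatures under load. The sketch below reads them from the standard hwmon sysfs interface, where the amdgpu driver reports temp1_input in millidegrees Celsius. The helper name gpu_temps and the optional root argument are my own additions, the latter mainly to make the function easy to try against a test directory.

```shell
#!/usr/bin/env bash
# Sketch: print the temp1_input reading of every hwmon device named "amdgpu".
gpu_temps() {  # gpu_temps [hwmon-root]
    local root="${1:-/sys/class/hwmon}" dir milli
    for dir in "$root"/hwmon*; do
        [ -r "$dir/name" ] || continue
        [ "$(cat "$dir/name")" = "amdgpu" ] || continue
        [ -r "$dir/temp1_input" ] || continue
        milli=$(cat "$dir/temp1_input")
        # hwmon temperatures are reported in millidegrees Celsius.
        echo "${dir##*/}: $((milli / 1000)) C"
    done
}

gpu_temps
```

Running it under watch, for example watch -n 2 with the function saved in a script, gives a rough live view while a model is loaded.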

Step 04: Download and Install ROCm 7.2.3

This section will install ROCm 7.2.3, but it will NOT install dkms or amdgpu drivers. I opted to use the kernel driver, drm/amdgpu, so I can later patch it to support the Infinity Fabric Link Bridge.

What we will do:

  • Make a new directory to save all downloaded files
  • Download ROCm installer
  • Install ROCm Dependencies
  • Install ROCm
  • Give all users access to ROCm
  • Add ROCm to path
  • Print output showing which GPUs are detected by ROCm and by the driver
  • Reboot

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2004%3A%20Download%20and%20Install%20ROCm%207.2.3" | bash
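
If ROCm tools later report no devices, a common cause is that your user cannot open the GPU device nodes. The sketch below checks read/write access to /dev/kfd and the render nodes for the current user. It does not assume how the script grants access (groups or udev rules), only that these are the nodes ROCm needs. The helper name node_access is mine.

```shell
#!/usr/bin/env bash
# Sketch: report whether the current user can read and write a device node.
node_access() {
    local node="$1"
    if [ ! -e "$node" ]; then
        echo "$node: missing"
    elif [ -r "$node" ] && [ -w "$node" ]; then
        echo "$node: read/write OK"
    else
        echo "$node: no access for $(id -un)"
    fi
}

# /dev/kfd is the compute node exposed by amdgpu; renderD* are the per-GPU render nodes.
for node in /dev/kfd /dev/dri/renderD*; do
    node_access "$node"
done
```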

Step 05: Install Python Tools

We will be using Python and pip to install several packages for local AI. The following commands set up the correct versions, along with a few quality-of-life choices.

What we will do:

  • Install these packages: 2to3 python-is-python3 python3-pip python3-venv python3-dev python3-setuptools
  • Install or upgrade these packages, system wide: pip wheel setuptools
  • Install numpy 1.26.4 specifically, system wide

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2005%3A%20Install%20Python%20Tools" | bash

Step 06: Install PyTorch & Other ROCm Related Wheels

Not everything here is needed for everyone. I included what I could, what worked, and what had some value to some local AI use case.

What we will do:

  • Install PyTorch Wheels
  • Add AMD ROCm APT Repository
  • Set AMD ROCm Apt Repository at priority 700 (Higher than Ubuntu)
  • Fix some ROCm Symlinks conflicting with MIGraphX
  • Install MIGraphX & Half packages
  • Install ONNX Runtime package
  • Install TensorFlow ROCm package
  • Install Apex Wheel
  • Clean up packages
  • Reboot

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2006%3A%20Install%20PyTorch%20%26%20Other%20ROCm%20Related%20Wheels" | bash
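
Once the system is back up, a short check from the shell shows whether PyTorch was built for ROCm and can see the GPUs. This is a minimal sketch with a helper name of my own (torch_report). It relies on ROCm builds of PyTorch exposing AMD GPUs through the torch.cuda namespace.

```shell
#!/usr/bin/env bash
# Sketch: print the PyTorch version, its HIP version, and the GPUs it can see.
torch_report() {
    python3 - <<'PY'
try:
    import torch
except ImportError:
    print("torch: not installed")
else:
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    print("torch:", torch.__version__, "| hip:", torch.version.hip)
    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda namespace.
    print("gpu available:", torch.cuda.is_available())
    for i in range(torch.cuda.device_count()):
        print(f"  device {i}: {torch.cuda.get_device_name(i)}")
PY
}

torch_report
```

With dual W6800X Duo modules I would expect four devices to be listed, and two with dual W6900X, but treat that as an expectation rather than a guarantee.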

Step 07: Verifying Everything

Everything is now installed in the standard way. All that remains is to verify that it is set up correctly.

What we will do:

  • Give you several boxes showing the status of everything we just set up

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/2.%20Setting%20up%20Ubuntu%20after%20Installation/Step%2007%3A%20Verifying%20Everything" | bash

3. Infinity Fabric Link Jumper / Bridge

AMD released several GPUs specifically for the Mac Pro 2019 that support their Infinity Fabric.

These GPUs and the Infinity Fabric Links are discussed in these posts:

The first set of GPUs that support it were the AMD Radeon PRO Vega II & Vega II Duo. The PC equivalent is an AMD Radeon PRO VII, which also supports an Infinity Fabric Link.

The second set of GPUs are the AMD Radeon PRO W6800X, W6800X Duo, and W6900X. These GPUs belong to the Sienna Cichlid family (Navi 21), which is part of the RDNA2 generation.

At the announcement of the Sienna Cichlid family, these GPUs were marketed as supporting xGMI. The Infinity Fabric Link is the physical bridge / jumper. xGMI is the software path that allows the GPUs to communicate over that link. However, on release, only the Apple MPX GPUs actually supported the Infinity Fabric Links, while the standard versions did not.

This might explain why support for xGMI on Sienna Cichlid was added between 2019 and 2020 to the Linux kernel drm/amdgpu, but later removed in 2022.

Many of us here in the subreddit tried to figure out what was wrong with the Infinity Fabric Link and to find a solution. One redditor actually cracked it, creating a patch for the current kernel drm/amdgpu driver which, in my testing, seems to have completely solved the Infinity Fabric Link regression that happened in 2022.

You'll need to keep in mind that this is just the first step. While we are moving forward, there is still the question of ROCm support, HIP support, and everything else.

Step 01: Download, Build, & Install the Patched Kernel Files

Let's start. We will do the following:

  • Make a directory to download kernel source
  • Install packages required to patch the kernel
  • Activate the source to download kernel source
  • Patch drm/amdgpu
  • Build a full patched kernel
  • Install the patched kernel

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/3.%20Infinity%20Fabric%20Link%20Jumper-Bridge/Step%2001%3A%20Download%2C%20Build%2C%20%26%20Install%20the%20Patched%20Kernel%20Files" | bash
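
If you ever apply the patch by hand, for example against a newer kernel source, it is worth checking that it still applies before starting a long build. A minimal sketch using GNU patch in dry-run mode follows. The helper name patch_applies and both paths in the usage comment are placeholders, and the -p1 strip level is an assumption that the patch uses git-style a/ and b/ prefixes.

```shell
#!/usr/bin/env bash
# Sketch: test whether a patch applies cleanly without touching any file.
patch_applies() {  # patch_applies <source-dir> <absolute-path-to-patch>
    # --dry-run changes nothing on disk; --force avoids interactive questions.
    patch -p1 --dry-run --force -d "$1" -i "$2" >/dev/null 2>&1
}

# Usage (both paths are placeholders):
#   patch_applies /path/to/kernel-source /path/to/xgmi.patch && echo "applies cleanly"
```

A non-zero result usually means the driver source has moved on and the patch needs rebasing.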

With this, you are now the proud user of a patched kernel that supports the Infinity Fabric Links on the Sienna Cichlid MPX GPUs.

At this point, shut the system down, reinstall the Infinity Fabric Link Jumper or Bridge, then boot back into the patched kernel.

Step 02: Verify Patched Kernel & GPU Initialization

We should probably run a verification one last time. Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/3.%20Infinity%20Fabric%20Link%20Jumper-Bridge/Step%2002%3A%20Verify%20Patched%20Kernel%20%26%20GPU%20Initialization" | bash
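
For a quick manual spot check, you can confirm which kernel is running and look for XGMI in the ROCm topology output. This sketch assumes rocm-smi is still shipped with your ROCm release and that its --showtopo output names the link type; the helper name count_xgmi is mine, and it simply counts lines that mention XGMI.

```shell
#!/usr/bin/env bash
# Sketch: count input lines that mention XGMI, case-insensitively.
count_xgmi() {
    # grep -c prints 0 and returns non-zero when nothing matches, hence "|| true".
    grep -ci 'xgmi' || true
}

echo "running kernel: $(uname -r)"
if command -v rocm-smi >/dev/null 2>&1; then
    echo "topology lines mentioning XGMI: $(rocm-smi --showtopo 2>/dev/null | count_xgmi)"
fi
```

A count of zero with the bridge installed suggests you booted the stock kernel rather than the patched one.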

While more testing is still needed, this is quite the achievement for the community. Thank you again, anonymous redditor.

4. AMD Duo MPX GPUs and Setting BAR Correctly

I have been using my Mac Pro 2019 with Dual AMD Radeon PRO W6800X Duo for local AI inference for some time now, and I had not had any BAR-related problems. However, since I moved from Proxmox to running Ubuntu 24.04 on bare metal, I have started noticing some BAR warnings and errors.

It seems that this problem may come from the way the Mac Pro firmware allocates PCIe resources before Linux takes over, specifically when using Duo MPX GPUs.

One redditor, whose account is now deleted, shared a GitHub link to what I can only describe as someone's documentation of how he fixed the BAR issue on Vega II Duo GPUs. I have dubbed this nbritton's method.

Our goal now is to use nbritton's method, adapted for the W6800X Duo. I also tried to make it work as a copy-and-paste solution for the Vega II Duo, but I have not tested that.

Warning: This changes GPU driver load order and PCIe BAR allocation behavior. If something goes wrong, you may need to boot from a recovery kernel, remove the service, or undo the GRUB changes. Also, note that SGLang's AMD GPU documentation recommends pci=realloc=off iommu=pt, which conflicts with nbritton's method because nbritton's method depends on PCIe BAR reallocation behavior. In other words, pci=realloc must not be disabled for this method.

Let's start.

We will do the following:

  • Blacklist amdgpu
  • Add pci=realloc to GRUB
  • Configure resize-gpu-bars.service
  • Set up nbritton's method files

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/4.%20AMD%20Duo%20MPX%20GPUs%20and%20Setting%20BAR%20Correctly" | bash
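
Because this method depends on pci=realloc being active, it is worth confirming after the reboot that the running kernel actually received it. A minimal sketch that inspects /proc/cmdline follows. The helper name has_flag is mine, and it matches whole flags exactly, so a combined form such as pci=realloc,something would not be detected.

```shell
#!/usr/bin/env bash
# Sketch: check whether an exact flag is present on the kernel command line.
has_flag() {  # has_flag <flag> [cmdline-file]
    local flag="$1" file="${2:-/proc/cmdline}"
    [ -r "$file" ] || return 2
    # Split the command line into one flag per line, then match the whole line.
    tr ' ' '\n' < "$file" | grep -qxF -- "$flag"
}

for flag in pci=realloc iommu=pt; do
    if has_flag "$flag"; then
        echo "$flag: set"
    else
        echo "$flag: not set"
    fi
done

if has_flag pci=realloc=off; then
    echo "warning: pci=realloc=off is set, which conflicts with this method"
fi
```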

5. Finalize the Infrastructure

After completing the linked sections above, we should have:

  • Installed Ubuntu (you did this on your own or using a previous guide)
  • Prepared Ubuntu's environment
  • Set up the T2-related environment
  • Installed ROCm
  • Installed PyTorch and several other packages for local AI
  • Patched the kernel (linux-hwe-6.17, source 6.17.0-29.29~24.04.1) to support xGMI and the Infinity Fabric Link Bridge and Jumper
  • Set up nbritton's method for BAR correction on Duo MPX GPUs

Once you're done, please reboot to make sure everything sticks. Then repeat Step 07: Verifying Everything, above, to confirm that everything is as it should be.

6. Local AI

Now that the infrastructure is ready, it's time to move to our frameworks of choice.

While I definitely plan to expand, I have focused mainly on text generation so far. When I first started, the options I considered were Ollama, Llama.cpp, and vLLM. I now see newer options as well, such as SGLang.

I am excited to share that vLLM supports this setup and works well. I hope to release a separate guide for it soon.

For the purpose of this guide, I will continue with Ollama, for its simplicity, in a Hello World type scenario.

Step 01: Install and Configure Ollama

We will do the following:

  • Set up Ollama
  • Fix ollama.service and ollama serve using separate model libraries

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/6.%20Local%20AI/Step%2001%3A%20Install%20and%20Configure%20Ollama" | bash

Step 02: Verify Ollama Setup

We will do the following:

  • Verify the Ollama service and the permissions on its data folders

Copy the following command into your command line interface of choice:

curl -fsSL "https://raw.githubusercontent.com/FaisalBiyari/MacPro2019LocalAI/refs/heads/main/Reddit/Mac%20Pro%202019%20Local%20AI%20Guide%3A%20Ubuntu%2024.04%2C%20ROCm%207.2.3%2C%20PyTorch%202.10%2C%20and%20Infinity%20Fabric%20Link/6.%20Local%20AI/Step%2002%3A%20Verify%20Ollama%20Setup" | bash

Step 03: Download and Run Models

We will do the following:

  • Download and run our first model

Copy the following command into your command line interface of choice:

ollama run qwen3.5:0.8b --verbose

You can find more models on Ollama's website. Below are some other models I am considering:

ollama pull qwen3.6:27b
ollama pull gemma4:31b-it-q4_K_M
ollama pull granite4.1:30b
ollama pull medgemma:27b
ollama pull mistral-medium-3.5:128b
ollama pull gpt-oss:120b
ollama pull qwen3.5:122b
ollama pull nemotron-3-super:120b
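
Beyond the interactive prompt, Ollama also listens on a local HTTP API, which is handy for scripting. Here is a minimal sketch, assuming the default address 127.0.0.1:11434 and the /api/generate endpoint with streaming turned off. The helper names ollama_payload and ollama_generate are mine, and the JSON is built by hand, so the model and prompt must not contain quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: send a single non-streaming prompt to a local Ollama server.
ollama_payload() {  # ollama_payload <model> <prompt>
    # Hand-built JSON: model and prompt must not contain quotes or backslashes.
    printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

ollama_generate() {  # ollama_generate <model> <prompt>
    curl -fsS http://127.0.0.1:11434/api/generate -d "$(ollama_payload "$1" "$2")"
}

# Usage, once the model has been pulled:
#   ollama_generate qwen3.5:0.8b "Say hello in one sentence."
#   ollama ps   # lists loaded models and whether they sit on GPU or CPU
```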

7. Done

With this, we are done with this guide.

It has been a long journey setting up this infrastructure, and preparing for the actual goal.

My testing was done on Mac Pro 2019 systems with dual W6900X MPX modules and dual W6800X Duo MPX modules. I have not tested this with Vega II or Vega II Duo MPX GPU modules.

Next, I plan to focus on vLLM for a while: optimization, quantization, and automation of operations.

After that, I hope to dive into Hermes Agent by Nous, with the hope of building multiple agents around a few local models run on vLLM, communicating and working together.

Expanding to images or vision, as well as to voice, is also in the pipeline.

The possibilities are endless. I hope to hear what everyone else experiences with this guide and with local AI in general: what worked, what failed, what workloads you are running, what use cases you care about, what problems you hit, and what solutions you found.

Looking forward to seeing how everyone takes advantage of this guide, and local AI.

8. Credit

Credit where credit is due. A lot of the information here was gathered from the community in bits and pieces.

I do want to take the opportunity to thank the anonymous redditor for his/her contribution (creating the whole kernel patch). THANK YOU!

  • Nikolas Britton for the nbritton method, fixing the BAR issue on the AMD Duo MPX GPUs.

  • u/AdityaGarg8 for always being supportive, no questions asked.

  • My AI of choice, for the support through all of this.

  • r/MacPro2019LocalAI redditors, for keeping in touch, and motivating me to continue going. You guys are the real MVPs.


Disclaimer: I wrote this post myself. I also used AI as a tool to help clean up the wording and formatting.

u/Faisal_Biyari — 1 day ago