[Bug report] Persistent amdgpu flip_done timeouts and system freezes across kernels (Fedora 44)
Hi everyone,
I’m seeking advice on a persistent system freeze on Fedora 44 that affects both the 6.19.10 and 7.0.9 kernels. I’ve ruled out the NVIDIA driver (its modules aren't even probed when the freeze happens), and the logs point to a failure in the AMD display engine.
Hardware context:
- Laptop with AMD Integrated Graphics + NVIDIA RTX 4060.
- OS: Fedora 44 (Stable).
- Kernels tested: 6.19.10-300.fc44.x86_64 and 7.0.9-202.fc44.x86_64.
The Issue: The system freezes randomly. Upon rebooting, journalctl -b -1 -p err reveals recurring amdgpu errors:
amdgpu 0000:65:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
amdgpu 0000:65:00.0: [drm] *ERROR* [CRTC:363:crtc-0] flip_done timed out
These freezes are often preceded by Vulkan swapchain errors from apps like Ptyxis and Brave Browser (vkAcquireNextImageKHR(): A swapchain no longer matches the surface properties exactly). Those swapchain errors seem to be what triggers the display driver hang.
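In case it helps anyone compare notes, here is a rough Python sketch for counting how often each of the two amdgpu error signatures quoted above appears in saved journal output. The function name count_signatures and the regular expressions are my own assumptions, written to match only the exact log lines shown in this post; other kernels or hardware may word these messages differently.

```python
import re

# Patterns written against the two error lines quoted in this post.
# They are assumptions and may not match other kernel versions.
SIGNATURES = {
    "dmcub_error": re.compile(r"DMCUB error"),
    "flip_done_timeout": re.compile(r"\[CRTC:\d+:[^\]]+\] flip_done timed out"),
}


def count_signatures(lines):
    """Count amdgpu log lines that match each known error signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        # Skip anything that was not logged by the amdgpu driver.
        if "amdgpu" not in line:
            continue
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Sample input: the two lines from this report plus one unrelated line.
sample = [
    "amdgpu 0000:65:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: "
    "DMCUB error - collecting diagnostic data",
    "amdgpu 0000:65:00.0: [drm] *ERROR* [CRTC:363:crtc-0] flip_done timed out",
    "some unrelated error line",
]
print(count_signatures(sample))
```

To run it against a real boot, the output of journalctl -b -1 -p err --no-pager could be saved to a text file and its lines passed to count_signatures in place of the sample list.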
What I have tried:
- Confirmed this is not an NVIDIA driver/Secure Boot issue (the proprietary modules were blacklisted for testing, yet the crashes persist).
- Verified the issue occurs on both tested kernels: the current stable 6.19.10 and the newer 7.0.9.
- Suspect it is a power-management state conflict between the kernel's AMD display controller and the laptop's firmware.
Questions:
- Is there a known regression or specific kernel parameter to stabilize the amdgpu DMUB on newer kernels for this hardware architecture?
- Has anyone else seen these flip_done timeouts triggered by Vulkan-based apps?
Any insight would be appreciated before I open a formal bug on Bugzilla.