u/JWCurtis94

Hi all, I’ll be honest upfront: I used AI to help write this because I’ve been troubleshooting this issue for months and couldn’t face typing the whole thing out from scratch. I’ve tried to include as much useful detail as possible.

I’m looking for help diagnosing a long-running PC stability issue. I feel like I’ve gone in circles with it, and I’m hoping someone with more experience might spot something I’m missing.

System specs

CPU: Intel Core i7-14700K

GPU: Nvidia RTX 3080 Ti Founders Edition

Motherboard: ASUS TUF Gaming Z790-Plus D4

RAM: 32GB DDR4, 4 x 8GB (two Corsair CMW16GX4M2C3200C16 2 x 8GB kits)

OS: Windows 11 Pro

Storage: NVMe SSDs, around 5TB total across 2 drives

PSU: Corsair 1200W SFX, around 4 years old

BIOS: Updated to latest version

GPU driver: Nvidia 596.36

GPU power: RTX 3080 Ti FE using Nvidia 12-pin adapter, fed by two separate PCIe cables from the PSU, not daisy-chained

The issue

The PC randomly hard crashes during gaming.

It does not behave like a normal game crash or a normal BSOD. The usual behaviour is:

Screen goes black

System fully freezes or reboots

Headphones sometimes make a continuous “brrrrrrrr” noise until the system is reset

No proper crash dump is created

No minidump

No MEMORY.DMP

Reliability Monitor shows nothing useful

Event Viewer only shows aftermath-type events

The common Event Viewer entries are things like:

Kernel-Power 41

EventLog 6008 (unexpected shutdown)

volmgr 161 (dump file creation failed), logged after at least one crash

Windows does not seem to capture the actual cause of the failure. That makes me think the system is dying too abruptly for Windows to write a dump, or something hardware-level is happening before Windows can log it properly.
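For anyone checking the same thing: the Kernel-Power 41 event carries a few data fields, including BugcheckCode and PowerButtonTimestamp, visible on the Details tab in Event Viewer. Below is a minimal sketch of how those two fields are usually read, following Microsoft's Event ID 41 troubleshooting guidance. The function name and the wording of the results are illustrative assumptions, not part of any Windows tool, and the values have to be copied in by hand.

```python
def classify_kernel_power_41(bugcheck_code: int, power_button_timestamp: int) -> str:
    """Rough interpretation of two fields from a Kernel-Power 41 event.

    Both values are copied by hand from Event Viewer (Details tab).
    The function name and return strings are illustrative only.
    """
    if bugcheck_code != 0:
        # Windows did hit a stop error, even if no dump file was written.
        return f"stop error recorded (bugcheck 0x{bugcheck_code:X})"
    if power_button_timestamp != 0:
        # The power button was held down to force the machine off.
        return "power button held down"
    # Nothing recorded: hard hang, reset, or sudden loss of power.
    return "no bugcheck recorded: hang, reset, or power loss"


print(classify_kernel_power_41(0, 0))
```

If both fields are zero on every crash, that would be consistent with the system dying before Windows can record anything, though it does not say which component is responsible.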

Crash dump settings already checked

I know people often recommend enabling minidumps, so I have already checked this.

Current dump-related setup:

AutoReboot: disabled

CrashDumpEnabled: 0x7 / Automatic memory dump

Pagefile: automatically managed by Windows

C: drive free space: around 698GB

C:\Windows\Minidump exists

Even with those settings, after the hard crashes I still get:

No MEMORY.DMP

No minidumps

No LastCrashdump registry entry

So I don’t think this is just a basic “Windows isn’t configured to create dumps” problem.
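As a cross-check on the settings above, here is a small sketch that decodes CrashDumpEnabled and AutoReboot. The value table is the documented set for the CrashControl registry key. The function names are made up for illustration, and `read_crash_control` only works on Windows, since it relies on the standard-library `winreg` module.

```python
# Documented values of CrashDumpEnabled under
# HKLM\SYSTEM\CurrentControlSet\Control\CrashControl
CRASH_DUMP_MODES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump (minidump)",
    7: "Automatic memory dump",
}


def describe_crash_control(crash_dump_enabled: int, auto_reboot: int) -> str:
    """Turn the two raw registry values into a readable summary."""
    mode = CRASH_DUMP_MODES.get(crash_dump_enabled, "unknown value")
    reboot = "enabled" if auto_reboot else "disabled"
    return f"CrashDumpEnabled=0x{crash_dump_enabled:X} ({mode}), AutoReboot {reboot}"


def read_crash_control():
    """Read the live values from the registry (Windows only)."""
    import winreg  # imported here so the rest of the file runs on any OS

    path = r"SYSTEM\CurrentControlSet\Control\CrashControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        dump, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        reboot, _ = winreg.QueryValueEx(key, "AutoReboot")
    return dump, reboot


# Values reported in this post:
print(describe_crash_control(0x7, 0))
# CrashDumpEnabled=0x7 (Automatic memory dump), AutoReboot disabled
```

On the affected machine, `describe_crash_control(*read_crash_control())` should print the same summary if the settings are as listed.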

When it happens

The crashes happen mainly during gaming, not normal desktop use.

Recent testing has been in Warzone, but I have had similar issues in other demanding games before. The annoying part is that synthetic stress tests do not always reproduce it.

The PC can pass stress tests, but still crash in real games.

That makes me wonder if this is related to transient GPU spikes, mixed CPU/GPU load, shader compilation, VRAM load, PCIe behaviour, PSU response, or something else that games trigger better than synthetic tests.

Things I have already tried

GPU drivers

I have already done clean Nvidia driver installs, including with DDU.

It did not properly fix the issue.

BIOS and motherboard drivers

BIOS is updated.

Chipset, Realtek, and other motherboard-related drivers have been updated.

RAM troubleshooting

I have spent a lot of time trying to rule out RAM.

I have tried:

XMP on and off

BIOS defaults

Slower/default RAM speeds

Different RAM stick combinations

Two sticks instead of four

Reseating the RAM

Making sure every stick is fully clicked in

Memory testing

At the moment, XMP is off and the RAM is running at default JEDEC speed, around 2133MHz / 1.2V.

RAM instability still feels possible, especially because I’m using 4 sticks, but I have tested RAM so much that I’m struggling to believe it is the main issue.

CPU/GPU stress testing

I have stress-tested both CPU and GPU.

The system can pass tests, which makes this harder to pin down. The crashes seem more likely in real gaming workloads than simple synthetic stress tests.

GPU power cabling

I checked the GPU power cabling.

The RTX 3080 Ti FE is using the Nvidia 12-pin adapter and is fed by two separate PCIe cables from the PSU. It is not running from one daisy-chained PCIe cable.

So I don’t think this is a basic daisy-chain GPU power cable issue.

Important clue: GPU power limit / underclock seems to help

The most useful clue so far is that lowering GPU load seems to improve stability.

In Warzone, I tested with:

Power Limit: 70%

Core Clock: -100

Memory Clock: -500

With those settings, the game seemed stable and did not crash during the test session.

I then tested a milder profile:

Power Limit: 90%

Core Clock: -50

Memory Clock: -200

That also seemed stable in Warzone during testing.
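To put the power-limit percentages into rough numbers, here is a quick calculation. It assumes the default limit is 350 W, which is NVIDIA's published board power for the RTX 3080 Ti. The actual default on a given card may differ and is worth confirming in a monitoring tool. The constant and function names are illustrative.

```python
# Assumption: 350 W is NVIDIA's published board power for the RTX 3080 Ti.
# The real default limit on a given card can differ.
ASSUMED_DEFAULT_POWER_W = 350


def power_cap_watts(limit_percent: float, default_w: float = ASSUMED_DEFAULT_POWER_W) -> float:
    """Convert an Afterburner-style power limit percentage into watts."""
    return default_w * limit_percent / 100


for pct in (70, 90, 100):
    print(f"{pct}% -> about {power_cap_watts(pct):.0f} W")
# 70% -> about 245 W
# 90% -> about 315 W
# 100% -> about 350 W
```

Under that assumption, the 90% profile trims only about 35 W of sustained draw. A power limit caps averaged power rather than very short spikes, so these figures are a rough guide at best.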

GPU temperature did not look bad. In MSI Afterburner, the GPU maximum temperature was about 73°C.

So this does not look like a simple overheating issue.

The fact that the system seems stable when the 3080 Ti is power-limited or slightly underclocked makes me wonder whether the issue is:

GPU transient power spikes

PSU stability under real gaming load

GPU boost instability at stock

GPU VRAM instability at stock

12-pin adapter / cable seating

Motherboard / PCIe power delivery

Intel platform BIOS behaviour

Why I suspect PSU, GPU power, or motherboard

The main reason is that Windows is not catching the actual crash.

Also, the problem seemed to start creeping in after I switched platforms from AMD to Intel and installed the current ASUS Z790 motherboard. I don’t remember having this kind of issue with my previous AMD board.

I’m not saying the motherboard is definitely faulty, but the timing makes me suspicious of:

Motherboard behaviour

BIOS settings

PCIe behaviour

CPU power behaviour

Intel 13th/14th gen default power settings

PSU/GPU transient response

I’m also aware that “default” BIOS settings on Intel 13th/14th gen are not always truly conservative. I’m trying to confirm whether the board is actually using Intel Default/Baseline limits, or whether ASUS is still applying enhanced turbo / MultiCore Enhancement / unlimited power behaviour.

Older GPU/display-related clue

Older LiveKernelReports / WATCHDOG dumps appear to reference graphics/display components like:

dxgkrnl.sys

nvlddmkm.sys

watchdog.sys

The hard crashes themselves still do not create normal BSOD dumps, but this does make me wonder whether the GPU/display-driver path is involved.
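For reference, a minimal sketch for listing whatever dump files exist, newest first, so their timestamps can be compared against the crash times. The three paths are the standard Windows locations and assume the system drive is C:. The function name is illustrative, and reading LiveKernelReports may need an elevated prompt.

```python
from datetime import datetime
from pathlib import Path

# Standard Windows locations; adjust if the system drive is not C:
DEFAULT_LOCATIONS = [
    Path(r"C:\Windows\Minidump"),
    Path(r"C:\Windows\LiveKernelReports"),
    Path(r"C:\Windows\MEMORY.DMP"),
]


def find_dumps(locations=DEFAULT_LOCATIONS):
    """Return (modified time, path) for each dump file found, newest first."""
    found = []
    for loc in locations:
        if loc.is_file():
            candidates = [loc]  # a single file such as MEMORY.DMP
        elif loc.is_dir():
            candidates = loc.rglob("*.dmp")  # includes subfolders such as WATCHDOG
        else:
            continue  # location does not exist on this machine
        for path in candidates:
            found.append((datetime.fromtimestamp(path.stat().st_mtime), path))
    return sorted(found, reverse=True)


for when, path in find_dumps():
    print(when.isoformat(timespec="seconds"), path)
```

A WATCHDOG dump dated shortly before a hard crash would be a stronger hint at the GPU/display-driver path than one from weeks earlier.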

What I think I have mostly ruled out

At this point, I feel like I have mostly ruled out the obvious stuff:

Not obvious overheating

Not obvious XMP instability

Not obvious RAM seating issue

Not obvious daisy-chained GPU power cable issue

Not fixed by clean Nvidia driver install

Not easily reproduced by simple CPU/GPU stress tests

Not a normal Windows BSOD with useful dumps

The strongest clue is still that lowering GPU power/clocks seems to make Warzone stable.

Questions

1. If Windows creates no dump files and only logs Kernel-Power 41 / EventLog 6008, does that usually point toward a hardware-level shutdown/reset rather than a normal software crash?
2. Could a PSU pass normal use and stress tests but still fail during real gaming due to GPU transient spikes?
3. Could an RTX 3080 Ti cause hard freezes/reboots if it is unstable at stock boost clocks, even if temperatures look fine?
4. With a 1200W Corsair PSU and two separate PCIe cables, would you still suspect PSU/cable/adapter issues, or would you start suspecting the GPU itself?
5. Could the motherboard cause this kind of no-dump hard crash, especially given the issue started after switching from AMD to Intel/Z790?
6. Are there specific ASUS Z790 / Intel 13th/14th gen BIOS settings I should check, such as MultiCore Enhancement, Intel Baseline Profile, PL1/PL2 limits, SVID behaviour, etc.?
7. Is there any reliable way to test PSU stability without simply swapping in another known-good PSU?
8. Would you test a different PSU first, test a different GPU first, or suspect the motherboard first?
9. Is there anything else I should check that would not show up properly in Windows logs?

Current plan

My current plan is to test Warzone with one change at a time:

Known stable:

Power Limit 90%

Core Clock -50

Memory Clock -200

Next test:

Power Limit 100%

Core Clock -50

Memory Clock -200

Then:

Power Limit 100%

Core Clock 0

Memory Clock -200

Then:

Power Limit 100%

Core Clock 0

Memory Clock 0

My thinking is:

If it crashes when power limit goes back to 100%, that points more toward PSU / transient spikes / power delivery.

If it crashes when core clock goes back to 0, that points more toward GPU core boost instability.

If it crashes when memory clock goes back to 0, that points more toward VRAM/memory instability.
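The plan above can be written down as a small table plus a lookup, which also checks that each step really changes only one setting. This is just a sketch of the reasoning in this post. The names `STEPS`, `changed_setting` and `suspect_for_first_crash` are made up for illustration.

```python
# The four profiles from the plan, in test order:
# (power limit %, core clock offset, memory clock offset)
STEPS = [
    (90, -50, -200),   # known stable baseline
    (100, -50, -200),  # power limit restored
    (100, 0, -200),    # core clock restored
    (100, 0, 0),       # memory clock restored (full stock)
]
LABELS = ("power limit", "core clock", "memory clock")
SUSPECTS = {
    "power limit": "PSU / transient spikes / power delivery",
    "core clock": "GPU core boost instability",
    "memory clock": "VRAM/memory instability",
}


def changed_setting(step_index: int) -> str:
    """Name the single setting that differs from the previous step."""
    prev, cur = STEPS[step_index - 1], STEPS[step_index]
    diffs = [LABELS[i] for i in range(len(LABELS)) if prev[i] != cur[i]]
    assert len(diffs) == 1, "each step should change exactly one setting"
    return diffs[0]


def suspect_for_first_crash(step_index: int) -> str:
    """Map the first crashing step to the suspect listed in the plan."""
    return SUSPECTS[changed_setting(step_index)]


for i in range(1, len(STEPS)):
    print(f"first crash at step {i}: {changed_setting(i)} -> {suspect_for_first_crash(i)}")
```

One caveat on the mapping: restoring the core clock also raises power draw somewhat, so a crash at that step does not fully separate boost instability from power delivery.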

Any advice would be appreciated, especially from people who have dealt with hard crashes where Windows creates no dump files at all.

PC hard freezing/rebooting during games with no dump files or useful logs — possible PSU/motherboard issue? Looking for advice