Bug 91518

Summary: [TAHITI] Crash caused by GPU faults while launching Unigine Heaven 4.0
Product: Mesa
Reporter: Gustaw Smolarczyk <wielkiegie>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact: Default DRI bug account <dri-devel>
Severity: normal
Priority: medium
CC: wielkiegie
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments: dmesg with GPU lock-up

Description Gustaw Smolarczyk 2015-08-01 09:07:38 UTC
Hello,

While trying to test new tessellation code in radeonsi using Unigine Heaven, I happened to stumble on this bug.

After the first few frames of rendering once the loading screen ends, the system locks up. The rendering is almost always broken too. Sometimes the GPU does try to restart, but most of the time the lock-up is hard (the kernel doesn't respond to any keyboard events) and I need to reset.

After the reset, kern.log contains many messages like the following:

[  142.631214] radeon 0000:02:00.0: GPU fault detected: 147 0x0a0d4402
[  142.631217] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00130CD0
[  142.631218] radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D044002
[  142.631220] VM fault (0x02, vmid 6) at page 1248464, write from TC (68)
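
For reference, the decoded "VM fault" line can be reproduced from the two raw register values above. The following is a minimal Python sketch, assuming the bit layout used by the radeon kernel driver for SI parts (protections in bits 0-3, memory client ID in bits 12-19, read/write flag in bit 24, VMID in bits 25-28). The function and field names are illustrative only and are not part of any driver API.

```python
def decode_vm_fault_status(status: int) -> dict:
    """Split a VM_CONTEXT1_PROTECTION_FAULT_STATUS value into its fields.

    The bit positions are an assumption based on the radeon SI register
    layout; they are not stated anywhere in this report.
    """
    return {
        "protections": status & 0xF,           # bits 0-3
        "client_id": (status >> 12) & 0xFF,    # bits 12-19, memory client
        "access": "write" if (status >> 24) & 0x1 else "read",  # bit 24
        "vmid": (status >> 25) & 0xF,          # bits 25-28
    }


# Register values copied from the kern.log excerpt above.
fault = decode_vm_fault_status(0x0D044002)
page = 0x00130CD0  # VM_CONTEXT1_PROTECTION_FAULT_ADDR is the faulting page

print(f"VM fault (0x{fault['protections']:02x}, vmid {fault['vmid']}) "
      f"at page {page}, {fault['access']} from client {fault['client_id']}")
# prints: VM fault (0x02, vmid 6) at page 1248464, write from client 68
```

Under these assumptions the result matches the kernel's own decoded line: a write from memory client 68 (reported as TC) in VMID 6.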

This could be related to #90266, but here the faults don't happen in the background; they take the system down almost immediately.

I am using kernel 4.1.3, mesa and libdrm from git, xorg 1.17.2, ddx 7.5.0.
The problem also occurred just before the tessellation work was done (with tessellation disabled in the benchmark), using mesa from ~2 weeks ago. Marek's recent fixes didn't help.

Relevant Heaven settings:
Quality: Ultra
Tessellation: Extreme/Disabled
Anti-Aliasing: 4x/8x
Full Screen: yes
Comment 1 Gustaw Smolarczyk 2015-08-01 09:26:45 UTC
Also, I am using LLVM from git.
Comment 2 Marek Olšák 2015-08-01 10:00:17 UTC
Did you install Mesa's drirc in /etc/drirc?
Comment 3 Gustaw Smolarczyk 2015-08-01 10:06:56 UTC
I think it happens automatically when I am using the Gentoo ebuild, since the contents of /etc/drirc match those in the mesa git repository.

But the ~/.drirc contains the old values... Let me check with the local file deleted (or at least the heaven sections deleted).
Comment 4 Gustaw Smolarczyk 2015-08-01 10:20:08 UTC
Created attachment 117479 [details]
dmesg with GPU lock-up

Commenting out the old Heaven profiles in ~/.drirc didn't help. Or maybe it did change the behavior somewhat.

Attached is the relevant fragment of kern.log. The GPU did restart a few times, but eventually the system locked up.

What's more, the loading screen was still displayed when the freeze happened, so the problem occurred even earlier than before. It might have happened that way by chance, though.
Comment 5 Vladimir Usikov 2016-02-26 04:26:21 UTC
Radeon 7950 (TAHITI), llvm-git, mesa-git, kernel 4.4.1

The test passes OK with all three levels of tessellation (Moderate, Normal, Extreme).

Does this bug still exist?
Comment 6 Marek Olšák 2016-03-08 12:34:33 UTC
This must be fixed, because we've tested Heaven a lot and fixed a bunch of bugs for VI and non-VI. Please reopen if you can reproduce this.
