Bug 98162 - gpu hangs with unigine heaven on drm-next-4.9-wip
Summary: gpu hangs with unigine heaven on drm-next-4.9-wip
Status: RESOLVED DUPLICATE of bug 98905
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2016-10-08 09:24 UTC by Christoph Haag
Modified: 2016-12-25 00:13 UTC (History)


Attachments
dmesg with gpu hang (79.53 KB, text/plain)
2016-10-08 09:24 UTC, Christoph Haag
gpu hang dmesg with counter strike: global offensive (140.59 KB, text/plain)
2016-10-16 18:53 UTC, Christoph Haag
csgo gpu fault and hang on amd-staging-4.7 (186.42 KB, text/plain)
2016-10-18 23:18 UTC, Christoph Haag
wine+nine csgo gpu fault and hang on stock 4.8 (79.19 KB, text/plain)
2016-10-18 23:20 UTC, Christoph Haag
wine+nine csgo gpu fault and hang amd-staging-4.7 (78.30 KB, text/plain)
2016-10-23 13:14 UTC, Christoph Haag

Description Christoph Haag 2016-10-08 09:24:37 UTC
Created attachment 127138
dmesg with gpu hang

XFX Radeon RX 480 XXX OC, latest mesa git and llvm svn.

Running unigine heaven for a while hangs the gpu like this:

Okt 08 10:31:14 c-l kernel: [drm:amdgpu_job_timedout] *ERROR* ring gfx timeout, last signaled seq=42581, last emitted seq=42583
Okt 08 10:31:14 c-l kernel: [drm] IP block:1 is hang!
Okt 08 10:31:14 c-l kernel: [drm] IP block:5 is hang!
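
For anyone scanning the attached dmesg logs for the same signature, here is a minimal sketch that extracts the ring name and sequence numbers from timeout lines. The regular expression is derived only from the line quoted above, so other kernel versions may need a different pattern; `parse_ring_timeouts` and the `pending` field are hypothetical names used for illustration.

```python
import re

# Pattern based solely on the amdgpu_job_timedout line quoted in this report.
# Assumption: other kernels may word this message differently.
TIMEOUT_RE = re.compile(
    r"\[drm:amdgpu_job_timedout\] \*ERROR\* ring (?P<ring>\S+) timeout, "
    r"last signaled seq=(?P<signaled>\d+), last emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return one dict per ring-timeout line found in log_text."""
    results = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append({
            "ring": match.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Sequence numbers that were emitted but never signaled.
            "pending": emitted - signaled,
        })
    return results


sample = (
    "Okt 08 10:31:14 c-l kernel: [drm:amdgpu_job_timedout] *ERROR* ring gfx "
    "timeout, last signaled seq=42581, last emitted seq=42583\n"
)
print(parse_ring_timeouts(sample))
```

Run against a saved log, this gives a quick way to compare which ring timed out across the different kernels tested here.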

I tried bisecting and landed on 4be051aeb3964146d3922238fff0ed1e4a9656d1 "drm/amd/powerplay: use smu7 hwmgr to manager polaris10/11". I'm not 100% confident in this result, though: unigine heaven sometimes needs to run for several minutes before the hang happens, so I may have marked a bad commit as good, and this commit is not trivial to revert.
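
The uncertainty in bisecting an intermittent hang can be put in rough numbers. The sketch below is a simplification: it assumes each test run on a bad commit hangs independently with the same probability, which was not measured in this report. Under that assumption, a bad commit survives k clean runs with probability (1 - p)^k, and the hypothetical helper `clean_runs_needed` returns the smallest k that pushes this below a chosen limit.

```python
import math


def clean_runs_needed(p_hang_per_run, max_false_good):
    """Smallest k such that (1 - p_hang_per_run) ** k <= max_false_good.

    Assumption: runs are independent and a bad commit hangs with the same
    probability p_hang_per_run in every run. Both inputs are illustrative.
    """
    if not 0 < p_hang_per_run < 1:
        raise ValueError("p_hang_per_run must be strictly between 0 and 1")
    if not 0 < max_false_good < 1:
        raise ValueError("max_false_good must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1 - p_hang_per_run))


# Example with made-up numbers: if a bad kernel hangs in half of the test
# runs, how many clean runs are needed before marking a commit "good" with
# at most a 5% chance of being wrong?
print(clean_runs_needed(0.5, 0.05))
```

The lower the per-run hang probability, the more clean runs each bisect step needs before "good" can be trusted.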
Comment 1 Christoph Haag 2016-10-16 17:14:35 UTC
The hang does NOT happen with 4.9-rc1, even though this commit is in it.

Either my bisect result is just wrong or it only happens in combination with another commit not yet in 4.9-rc1.
Comment 2 Christoph Haag 2016-10-16 18:53:34 UTC
Created attachment 127334
gpu hang dmesg with counter strike: global offensive

To document this:
csgo ran fine, although very slowly, with stock 4.9-rc1. I did not test for very long, so the hang might still have happened after a while.

The performance issue was traced to commit 87744ab3832b83ba71b931f86f9cfdb000d07da5.

After reverting this commit on 4.9-rc1, I still do not get GPU hangs with unigine heaven, but I start seeing GPU hangs with csgo.

This may not be the same issue, but it doesn't happen on 4.8.
Comment 3 Christoph Haag 2016-10-18 23:18:30 UTC
Created attachment 127394
csgo gpu fault and hang on amd-staging-4.7

I just don't know anymore. There seems to be a lot of brokenness on Polaris (or maybe just on my XFX Radeon RX 480 XXX OC model) right now.

This is a GPU fault + gpu hang that is caused by the native version of csgo on amd-staging-4.7.
Comment 4 Christoph Haag 2016-10-18 23:20:10 UTC
Created attachment 127395
wine+nine csgo gpu fault and hang on stock 4.8

This is a GPU fault + hang I got on stock 4.8 when I tried the windows version of csgo with nine. The native version of csgo runs normally on stock 4.8.
Comment 5 Christoph Haag 2016-10-22 11:18:42 UTC
unigine heaven and csgo work fine with amd-staging-4.7 at f9c58ccc03147e652284f06053b089eca957e1e1.

With drm-next-4.10 (with 87744ab3832b83ba71b931f86f9cfdb000d07da5 reverted for performance), unigine heaven works (I think), but csgo causes gpu hangs.
Comment 6 Christoph Haag 2016-10-23 13:14:32 UTC
Created attachment 127496
wine+nine csgo gpu fault and hang amd-staging-4.7

Wait, this amd-staging-4.7 revision doesn't work well either. I tried csgo with nine and it fails too.
Comment 7 Christoph Haag 2016-10-30 22:17:24 UTC
Back on stock 4.8, native csgo and csgo with nine both run fine on auto (high) graphics settings. It appears that something about the lowest settings breaks the driver with both the native version and the nine version.

I think the original bug here is solved, because unigine heaven has been running stable on both amd-staging-4.7 and drm-next-4.9/4.10-wip, and the csgo and soma bug is something else.
Comment 8 Christoph Haag 2016-12-25 00:13:04 UTC
I'm relatively sure that my bad GPU caused all of this (except maybe the first issue, which was soon solved).

*** This bug has been marked as a duplicate of bug 98905 ***