Bug 103973

Summary: GPU Hang and crash in CS GO and Half Life (OpenGL games)
Product: Mesa
Reporter: nicolaspok
Component: Drivers/DRI/i965
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: RESOLVED DUPLICATE
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: 17.2
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform: SKL
i915 features: GPU hang
Attachments: GPU crash dump, xrandr --verbose

Description nicolaspok 2017-11-29 17:30:37 UTC
Created attachment 135803 [details]
GPU crash dump

Hello,

I am consistently (100% of the time) experiencing a freeze after a few seconds (always the same amount of time) when playing some Steam games. The game freezes and ends up crashing a few seconds later.

dmesg returns:
[ 1557.000432] [drm] GPU HANG: ecode 9:0:0x85dffffb, in csgo_linux64 [5572], reason: Hang on rcs0, action: reset
[ 1557.000433] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1557.000434] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1557.000434] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1557.000435] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1557.000436] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1557.000473] i915 0000:00:02.0: Resetting chip after gpu hang
[ 1557.000716] [drm] RC6 off
[ 1557.004497] [drm] GuC submission enabled (firmware i915/skl_guc_ver6_1.bin [version 6.1])
[ 1564.852419] i915 0000:00:02.0: Resetting chip after gpu hang
[ 1564.852590] [drm] RC6 off
[ 1564.857194] [drm] GuC submission enabled (firmware i915/skl_guc_ver6_1.bin [version 6.1])
[ 1572.961630] i915 0000:00:02.0: Resetting chip after gpu hang
[ 1572.961782] [drm] RC6 off
[ 1572.966833] [drm] GuC submission enabled (firmware i915/skl_guc_ver6_1.bin [version 6.1])
[ 1580.854178] i915 0000:00:02.0: Resetting chip after gpu hang
[ 1580.854348] [drm] RC6 off
[ 1580.858229] [drm] GuC submission enabled (firmware i915/skl_guc_ver6_1.bin [version 6.1])
[ 1583.840878] asynchronous wait on fence i915:gnome-shell[677]/1:44ae timed out
[ 1588.961251] i915 0000:00:02.0: Resetting chip after gpu hang
[ 1588.961407] [drm] RC6 off
[ 1588.964537] [drm] GuC submission enabled (firmware i915/skl_guc_ver6_1.bin [version 6.1])
[ 1589.175268] csgo_linux64[5606]: segfault at 0 ip 00007fe68304eb2e sp 00007fe626889000 error 6 in libtier0_client.so[7fe68303c000+29000]
[ 1589.213491] csgo_linux64[5656]: segfault at 38 ip 00007fe66975dd59 sp 00007fe606ec1670 error 6 in client_client.so[7fe668afa000+17b1000]
[ 1589.251668] csgo_linux64[5606]: segfault at 0 ip 00007fe68304eb2e sp 00007fe626889000 error 6 in libtier0_client.so[7fe68303c000+29000]
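
For anyone triaging a log like the one above, here is a minimal Python sketch that pulls the hang line and the reset timestamps out of dmesg text. The regular expressions are assumptions fitted to the exact line shapes shown in this report; other kernel versions may word these messages differently. The function name `summarize_hangs` is hypothetical.

```python
import re

# Assumed line shapes, taken from the dmesg output in this report.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] \[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)
RESET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] i915 \S+ Resetting chip after gpu hang"
)


def summarize_hangs(dmesg_text):
    """Return the GPU HANG records and reset timestamps found in dmesg text."""
    hangs = []
    reset_times = []
    for line in dmesg_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hangs.append({
                "time": float(match.group("ts")),
                "ecode": match.group("ecode"),
                "proc": match.group("proc"),
                "pid": int(match.group("pid")),
                "reason": match.group("reason"),
                "action": match.group("action"),
            })
            continue
        match = RESET_RE.search(line)
        if match:
            reset_times.append(float(match.group("ts")))
    return {"hangs": hangs, "reset_times": reset_times}
```

Feeding it the saved output of `dmesg` gives one record per reported hang and a list of reset times, which makes the roughly 8-second reset cadence in the log above easy to see.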


I am currently using a custom 4.14.2.1-based Arch Linux kernel for the Surface Pro 4, but the problem also occurred with other (stock) kernel versions, such as 4.9 and 4.12.
I am running mesa-17.2.5-1 and xf86-video-intel-1:2.99.917+800+g37a682aa-1.
I am using the Surface Pro's native display.

What I tried: updating the kernel version, and switching RC6 on and off.
I will try the modesetting driver next (whatever this means).

The same games run fine under Windows on the same hardware (dual boot).
Comment 1 nicolaspok 2017-11-29 17:32:08 UTC
Created attachment 135804 [details]
xrandr --verbose
Comment 2 Elizabeth 2017-11-30 18:13:26 UTC
This seems to be a Mesa bug. Do you have any specific steps to reproduce?
Comment 3 nicolaspok 2017-11-30 20:35:06 UTC
Well, for me it is just a matter of launching csgo_linux64 from Steam under the conditions described, and it occurs every time, but I understand it would be easier with another program.

I will try to find an easy procedure with another program.
Comment 4 nicolaspok 2017-12-01 12:25:15 UTC
OK, so I couldn't find an easy way to reproduce it with another program.
I tested a couple more GPU-intensive tasks, such as more games and gputest, but could not reproduce the hang. It persists in csgo_linux and half_life.

Can you please clarify what you need to make progress? Some additional info/logs? Or a way to reproduce the bug yourself?
Thanks
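
In case it helps others collect the same information: a small Python sketch for copying the GPU crash dump out of sysfs into a regular file that can be attached to a report. The default source path is the one printed in the dmesg output in the description; the function name `save_crash_dump` and the destination file name are hypothetical, and reading the sysfs file may require root privileges.

```python
from pathlib import Path

# Path reported by the kernel in the dmesg output of this bug.
DEFAULT_DUMP = "/sys/class/drm/card0/error"


def save_crash_dump(dest, src=DEFAULT_DUMP):
    """Copy the crash dump at `src` to `dest` and return the number of bytes copied."""
    # Read the whole file into memory, then write it out as a normal file.
    data = Path(src).read_bytes()
    Path(dest).write_bytes(data)
    return len(data)
```

For example, `save_crash_dump("gpu_crash_dump.txt")` run shortly after a hang should leave a copy in the current directory.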
Comment 5 Mark Janes 2017-12-01 12:43:06 UTC
Possible duplicate:

https://bugs.freedesktop.org/show_bug.cgi?id=102435
Comment 6 nicolaspok 2017-12-01 12:53:15 UTC
Indeed, I tried setting the anti-aliasing mode to 2xMSAA, as proposed in the related bug, and that fixed the hang/crash right away.
So it seems to be related to the anti-aliasing mode.

Should I close this as a dupe?
Comment 7 Mark Janes 2017-12-01 16:30:58 UTC

*** This bug has been marked as a duplicate of bug 102435 ***
