Bug 104044 - [snb] GPU hang in gnome-shell
Summary: [snb] GPU hang in gnome-shell
Status: NEEDINFO
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-03 05:54 UTC by Pavlo
Modified: 2017-12-13 22:40 UTC

See Also:
i915 platform:
i915 features:


Attachments
error file mentioned in dmesg output (48.66 KB, text/plain)
2017-12-03 05:54 UTC, Pavlo
same bug different error log. (29.17 KB, text/plain)
2017-12-03 14:40 UTC, Pavlo
same bug different error log. (29.18 KB, text/plain)
2017-12-03 22:30 UTC, Pavlo

Description Pavlo 2017-12-03 05:54:15 UTC
Created attachment 135888 [details]
error file mentioned in dmesg output

Symptom: everything freezes and the CPU fan starts running at full speed.
A hardware reset is required to restart the computer.

.....
[ 2245.973299] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 2245.973301] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2245.973302] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 2245.973304] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2245.973307] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2245.973312] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2245.976298] CPU2: Core temperature/speed normal
[ 2245.976299] CPU0: Package temperature/speed normal
[ 2245.976300] CPU1: Package temperature/speed normal
[ 2245.976301] CPU3: Core temperature/speed normal
[ 2245.976302] CPU3: Package temperature/speed normal
[ 2245.976303] CPU2: Package temperature/speed normal
[ 2802.661511] [drm] GPU HANG: ecode 6:0:0x85fffffc, in gnome-shell [1681], reason: Hang on rcs0, action: reset
[ 2802.661512] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 2802.661513] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 2802.661514] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 2802.661514] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 2802.661515] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 2802.661581] drm/i915: Resetting chip after gpu hang
[ 2805.664505] asynchronous wait on fence i915:[global]:e21d2 timed out
[ 2810.656476] drm/i915: Resetting chip after gpu hang
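
For triage, the hang lines in dmesg output like the above can be pulled out mechanically. The following is only a minimal sketch in Python, assuming the exact "[drm] GPU HANG" message format shown in this report; the regular expression and the `parse_gpu_hangs` helper are hypothetical names made up for illustration and are not part of any i915 or Mesa tooling.

```python
import re
from typing import List, NamedTuple


class GpuHang(NamedTuple):
    timestamp: float  # seconds since boot, as printed by dmesg
    ecode: str        # e.g. "6:0:0x85fffffc"
    process: str      # process named in the hang message
    pid: int
    reason: str
    action: str


# Assumed format, copied from the log lines in this report:
# [ 2802.661511] [drm] GPU HANG: ecode 6:0:0x85fffffc, in gnome-shell [1681], reason: Hang on rcs0, action: reset
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>[^,\s]+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hangs(dmesg_text: str) -> List[GpuHang]:
    """Return one GpuHang per 'GPU HANG' line found in the given text."""
    hangs = []
    for line in dmesg_text.splitlines():
        m = HANG_RE.search(line)
        if m:
            hangs.append(
                GpuHang(
                    timestamp=float(m.group("ts")),
                    ecode=m.group("ecode"),
                    process=m.group("proc"),
                    pid=int(m.group("pid")),
                    reason=m.group("reason"),
                    action=m.group("action"),
                )
            )
    return hangs


if __name__ == "__main__":
    sample = (
        "[ 2802.661511] [drm] GPU HANG: ecode 6:0:0x85fffffc, in gnome-shell [1681], "
        "reason: Hang on rcs0, action: reset\n"
        "[ 2802.661581] drm/i915: Resetting chip after gpu hang\n"
    )
    for hang in parse_gpu_hangs(sample):
        print(hang)
```

Feeding it the saved output of `dmesg` gives one record per hang, which makes it easier to see whether the same ecode and process recur across logs.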
Comment 1 Pavlo 2017-12-03 14:40:54 UTC
Created attachment 135904 [details]
same bug different error log.
Comment 2 Pavlo 2017-12-03 14:42:16 UTC
Happened again. dmesg output:

[ 5407.536107] [drm] GPU HANG: ecode 6:0:0x85fffffc, in gnome-shell [1667], reason: Hang on rcs0, action: reset
[ 5407.536109] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 5407.536109] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 5407.536109] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 5407.536110] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 5407.536110] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 5407.536165] drm/i915: Resetting chip after gpu hang
[ 6034.197265] do_trap: 7 callbacks suppressed
[ 6034.197271] traps: pool[9483] trap int3 ip:7f9e203467b1 sp:7f9dffffe8c0 error:0 in libglib-2.0.so.0.5400.2[7f9e202f6000+111000]
[ 6125.467977] drm/i915: Resetting chip after gpu hang
Comment 3 Pavlo 2017-12-03 22:30:40 UTC
Created attachment 135913 [details]
same bug different error log.
Comment 4 Pavlo 2017-12-03 22:31:33 UTC
One more.
....
[ 5407.536107] [drm] GPU HANG: ecode 6:0:0x85fffffc, in gnome-shell [1667], reason: Hang on rcs0, action: reset
[ 5407.536109] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 5407.536109] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 5407.536109] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 5407.536110] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 5407.536110] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 5407.536165] drm/i915: Resetting chip after gpu hang
[ 6034.197265] do_trap: 7 callbacks suppressed
[ 6034.197271] traps: pool[9483] trap int3 ip:7f9e203467b1 sp:7f9dffffe8c0 error:0 in libglib-2.0.so.0.5400.2[7f9e202f6000+111000]
[ 6125.467977] drm/i915: Resetting chip after gpu hang
Comment 5 Elizabeth 2017-12-08 18:09:36 UTC
Hello Pavlo,
Please share your distro, Mesa version, Xorg version, SNA or modesetting?, displays, steps to reproduce, and if this only happens on gnome.
Thank you.
Comment 6 Pavlo 2017-12-09 21:00:12 UTC
(In reply to Elizabeth from comment #5)
> Hello Pavlo,
> Please share your distro, Mesa version, Xorg version, SNA or modesetting?,
> displays, steps to reproduce, and if this only happens on gnome.
> Thank you.

Hello Elizabeth.

Distro   : Fedora 27
Mesa     : Mesa 17.2.4

SNA..    : Not sure. Fedora is using Wayland, so I guess it is kernel modesetting.

Displays : Two. One built-in (it is a 13-inch Sony VPCSA notebook) at 1600x900, plus
           a second screen at 1680x1050.

Cards    : Notebook has two cards ATI + Intel.
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09)
	Subsystem: Sony Corporation Device [104d:907b]
	Kernel driver in use: i915
--
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Whistler [Radeon HD 6630M/6650M/6750M/7670M/7690M] [1002:6741] (rev ff)
	Kernel driver in use: radeon
	Kernel modules: radeon

Steps to reproduce: It happens randomly, but sooner if video is playing (YouTube etc.) or a libvirt virtual machine is running (CentOS or Windows), so I think any "heavy" graphics output will trigger it. There are no specific steps.

Gnome  : Using GNOME exclusively. (I use the default settings Fedora ships with; no additional tweaking, just the default config.)

Note: When I used vfio-pci for the AMD card to pass it through to a Windows VM, the notebook would freeze completely (no recovery except a power cycle). With both graphics drivers loaded in Linux, the notebook manages to recover from the freezes.

Regards Pavlo.
Comment 7 Elizabeth 2017-12-13 22:40:20 UTC
Just to be sure it hasn't already been fixed, could you please try the new Mesa 17.3 release?

