Bug 96525 (planken) - [HSW] Display Freeze
Summary: [HSW] Display Freeze
Status: CLOSED FIXED
Alias: planken
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2016-06-14 09:18 UTC by jean.seyral
Modified: 2017-08-31 18:33 UTC
CC List: 3 users

See Also:
i915 platform: HSW
i915 features: GPU hang


Attachments
Contains crash dump and xrandr --verbose output (464.38 KB, application/x-gzip)
2016-06-14 09:18 UTC, jean.seyral
Additional GPU crash file (375.61 KB, text/plain)
2016-06-16 11:17 UTC, John Trengrove

Description jean.seyral 2016-06-14 09:18:18 UTC
Created attachment 124523
Contains crash dump and xrandr --verbose output

Display froze.
I could still move the mouse cursor but couldn't interact with open windows, the toolbar, menus...
I pressed Ctrl+Alt+F2 to get a console, and here is the dmesg output:


[drm] stuck on render ring
[drm] GPU HANG: ecode 0:0x00200000, in cubestorm [6409], reason: Ring hung, action: reset
[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[drm] GPU crash dump saved to /sys/class/drm/card0/error
[drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... render ring idle
SELinux: initialized (dev fuse, type fuse), uses genfs_contexts


The crash dump is attached to the bug.
Thanks for your help.
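For anyone else collecting the same information: the dmesg output above says the crash dump is saved to /sys/class/drm/card0/error. Below is a minimal Python sketch for copying it to a compressed file that can be attached to a report. The function name save_gpu_error_state and the default output file name are hypothetical, and reading the sysfs file will usually require root.

```python
import gzip
import shutil
from pathlib import Path


def save_gpu_error_state(source="/sys/class/drm/card0/error",
                         dest="gpu-crash-dump.txt.gz"):
    """Copy the i915 error state to a gzip-compressed file.

    `source` defaults to the path printed in the dmesg output; `dest` is an
    arbitrary (hypothetical) output name. Returns the destination Path.
    """
    src = Path(source)
    dst = Path(dest)
    # Stream the copy so that a large dump is never held in memory at once.
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Calling save_gpu_error_state() as root right after a hang should leave gpu-crash-dump.txt.gz in the current directory. The dump may be lost on reboot, so it is safer to copy it before restarting.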
Comment 1 thzplanken 2016-06-14 12:50:09 UTC
I apologize for not being a Linux expert, and for any mistakes I may make in filling this in.
I experience exactly the same phenomenon. The problems started today when upgrading Debian Stretch to the Linux 4.6 kernel.
Everything worked fine on 4.5, but today, a few seconds after logging on, the graphics freeze. I can no longer open/close/move windows, but I can still move the mouse cursor and ssh into this machine. It hangs quickly but also randomly when using different programmes (Chromium, LibreOffice Impress, ...).
Restarting the computer has no lasting effect. What does seem to have an effect is to create /etc/X11/xorg.conf.d/20-intel.conf and fill it with
Section "Device"
   Identifier "Intel Graphics"
   Driver     "intel"
   Option "AccelMethod" "uxa"
EndSection
But this leads to slow graphics on this computer.

Dmesg tells me:
...
[  188.115049] [drm] GPU HANG: ecode 8:0:0xfffffffe, in chromium [1241], reason: Ring hung, action: reset
[  188.115055] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  188.115057] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  188.115059] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  188.115061] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  188.115063] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  188.116615] drm/i915: Resetting chip after gpu hang
[  196.144514] [drm] stuck on render ring
[  196.155180] [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [730], reason: Ring hung, action: reset
[  196.157852] drm/i915: Resetting chip after gpu hang
[  204.144018] [drm] stuck on render ring
[  204.154825] [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [730], reason: Ring hung, action: reset
[  204.155240] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
...

uname -a
Linux debian 4.6.0-1-amd64 #1 SMP Debian 4.6.1-1 (2016-06-06) x86_64 GNU/Linux

lspci
00:00.0 Host bridge: Intel Corporation Braswell SoC Transaction Router (rev 21)
00:02.0 VGA compatible controller: Intel Corporation Braswell Integrated Graphics Controller (rev 21)
00:13.0 SATA controller: Intel Corporation Braswell SATA Controller (rev 21)
00:14.0 USB controller: Intel Corporation Braswell USB xHCI Host Controller (rev 21)
00:1a.0 Encryption controller: Intel Corporation Braswell Trusted Execution Engine Interface (rev 21)
00:1b.0 Audio device: Intel Corporation Braswell HD Audio Controller (rev 21)
00:1c.0 PCI bridge: Intel Corporation Braswell PCIe Port 1 (rev 21)
00:1c.1 PCI bridge: Intel Corporation Braswell PCIe Port 2 (rev 21)
00:1c.2 PCI bridge: Intel Corporation Braswell PCIe Port 3 (rev 21)
00:1f.0 ISA bridge: Intel Corporation Braswell Platform Controller Unit LPC (rev 21)
00:1f.3 SMBus: Intel Corporation Braswell Platform Controller Unit SMBus (rev 21)
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
03:00.0 Network controller: Ralink corp. RT3090 Wireless 802.11n 1T/1R PCIe
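
The "GPU HANG" lines in logs like the one above follow a regular pattern, so they can be summarised per process when comparing reports. The following is a rough Python sketch. The regular expression is derived only from the lines quoted in this bug and may not match other kernel versions, and the names HANG_RE, parse_gpu_hangs and hangs_per_process are hypothetical.

```python
import re
from collections import Counter

# Pattern based on the "[drm] GPU HANG: ..." lines quoted in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_gpu_hangs(log_text):
    """Return one dict (ecode, proc, pid, reason, action) per GPU HANG line."""
    return [m.groupdict() for m in HANG_RE.finditer(log_text)]


def hangs_per_process(log_text):
    """Count how many hangs were attributed to each process name."""
    return Counter(h["proc"] for h in parse_gpu_hangs(log_text))
```

Feeding it the output of dmesg (for example via subprocess.run(["dmesg"], capture_output=True, text=True).stdout) shows quickly whether the hangs come from one application or from several, as in this comment.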
Comment 2 John Trengrove 2016-06-16 11:17:01 UTC
Created attachment 124555
Additional GPU crash file
Comment 3 John Trengrove 2016-06-16 11:21:34 UTC
I've attached an additional GPU dump; the journalctl log is below. I'm having similar problems: a hard freeze which slowly comes back to life, with a jittery mouse cursor when it comes back.

[drm] stuck on render ring
[drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [297], reason: Ring hung, action: reset
[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[drm] GPU crash dump saved to /sys/class/drm/card0/error
drm/i915: Resetting chip after gpu hang
[drm] stuck on render ring
[drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [297], reason: Ring hung, action: reset
[drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
drm/i915: Resetting chip after gpu hang
[drm] stuck on render ring
[drm] GPU HANG: ecode 8:0:0xfffffffe, in compton [301], reason: Ring hung, action: reset
Comment 4 yann 2016-09-01 12:02:03 UTC
(In reply to John Trengrove from comment #3)
> I've attached an additional GPU dump and journalctl log below. Having
> similar problems. Hard freeze which slowly comes back to life.. Jittery
> mouse cursor when it comes back.
> 
> [drm] stuck on render ring
> [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [297], reason: Ring hung,
> action: reset
> [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack,
> including userspace.
> [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI ->
> DRM/Intel
> [drm] drm/i915 developers can then reassign to the right component if it's
> not a kernel issue.
> [drm] The gpu crash dump is required to analyze gpu hangs, so please always
> attach it.
> [drm] GPU crash dump saved to /sys/class/drm/card0/error
>  drm/i915: Resetting chip after gpu hang
> [drm] stuck on render ring
> [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [297], reason: Ring hung,
> action: reset
> [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
> drm/i915: Resetting chip after gpu hang
> [drm] stuck on render ring
> [drm] GPU HANG: ecode 8:0:0xfffffffe, in compton [301], reason: Ring hung,
> action: reset

John, even though you are also facing a display freeze, Jean's GPU crash dump shows that this is a different issue (for instance, the IPEHR (Instruction Parser Error Header Register) values are different, and your hang is not happening in a batch like Jean's). So if this issue is still occurring, I suggest filing a new bug and attaching both the GPU crash dump and dmesg.
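
As a rough illustration of the comparison described above, the Python sketch below pulls the IPEHR values out of two crash dumps and checks whether they match. It assumes the dump contains lines of the form "IPEHR: 0x..." (an assumption about the error state format, not something shown in this thread), and the names ipehr_values and same_ipehr are hypothetical. Matching values are only a hint and do not replace reading the dump.

```python
import re

# Assumed line format inside the error state: "  IPEHR: 0x7a000003"
IPEHR_RE = re.compile(r"^\s*IPEHR:\s*(0x[0-9a-fA-F]+)", re.MULTILINE)


def ipehr_values(error_state_text):
    """Return every IPEHR value found in the dump, as integers."""
    return [int(value, 16) for value in IPEHR_RE.findall(error_state_text)]


def same_ipehr(dump_a, dump_b):
    """True if both dumps report the same non-empty set of IPEHR values."""
    a, b = set(ipehr_values(dump_a)), set(ipehr_values(dump_b))
    return bool(a) and a == b
```

Two dumps for which same_ipehr returns False were stopped at different instructions, which is one reason to track them as separate bugs.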
Comment 5 yann 2016-09-01 12:10:53 UTC
(In reply to jean.seyral from comment #0)
> Created attachment 124523
> Contains crash dump and xrandr --verbose output
> 
> Display froze.
> I could still move the mouse cursor but couldn't have any interaction on
> opened windows, toolbar, menu...
> I did a Ctrl+Alt+F2 to get a console and here is dmesg output :
> 
> 
> [drm] stuck on render ring
> [drm] GPU HANG: ecode 0:0x00200000, in cubestorm [6409], reason: Ring hung,
> action: reset
> [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack,
> including userspace.
> [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI ->
> DRM/Intel
> [drm] drm/i915 developers can then reassign to the right component if it's
> not a kernel issue.
> [drm] The gpu crash dump is required to analyze gpu hangs, so please always
> attach it.
> [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... render ring
> idle
> SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
> 
> 
> The crash dump is attached to the bug.
> Thanks for your help.

Jean, it looks like the render batch is invalid. Could you update your kernel? I notice that the one you are using is 2.6.32-573.7.1.el6.x86_64, and I think there is a Red Hat update. Then, if this is still occurring, please attach the GPU crash dump and kernel log.
Comment 6 m.numberger 2017-02-20 11:37:34 UTC
Does anybody have news?

I have the same problem.
I have a Lenovo R400 with the GM45 chipset.
I use Debian Testing and have had the problem since kernel 4.6. With the current kernel 4.9 it's the same.
For the first 30 - 180 s all is OK, then the cursor still moves but no interaction is possible.

I tested the 3 drivers:
Intel with sna
Intel with uxa
Modesetting

Always the same result.
Comment 7 Ricardo 2017-05-09 17:51:03 UTC
Adding tag into "Whiteboard" field - ReadyForDev
The bug is still active
*Status is correct
*Platform is included
*Feature is included
*Priority and Severity correctly set
*Logs included
Comment 8 Elizabeth 2017-07-26 19:46:11 UTC
Hello everybody,
Sorry for the long delay. The latest kernel version mentioned in comment #6 is 4.9, and as stated in comments #1, #3, and #6 this seems reproducible. Comment #1 also mentions that it can be triggered by various applications. If possible, could you please try to reproduce this with the latest kernel (https://www.kernel.org/) or drm-tip (https://cgit.freedesktop.org/drm-tip), and share a new crash log and dmesg captured with the drm.debug=0xe kernel parameter set in grub, along with possible steps to reproduce?
Thank you.
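
To confirm that the drm.debug=0xe parameter requested above actually took effect after rebooting, the kernel command line can be checked. The Python sketch below parses a command line string and decodes the mask. The bit names reflect the low four DRM debug categories (core, driver, KMS, PRIME) as I understand them for kernels of this era, so treat the table as an assumption, and the function names are hypothetical.

```python
import re

# Assumed meaning of the low drm.debug bits (core=0x1, driver=0x2,
# kms=0x4, prime=0x8); 0xe therefore selects driver + kms + prime.
DRM_DEBUG_BITS = {0x01: "core", 0x02: "driver", 0x04: "kms", 0x08: "prime"}


def drm_debug_from_cmdline(cmdline):
    """Return the drm.debug mask from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)drm\.debug=(\S+)", cmdline)
    # int(..., 0) accepts both "0xe" and "14".
    return int(match.group(1), 0) if match else None


def decode_drm_debug(mask):
    """List the names of the known debug categories enabled in `mask`."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]
```

On a running system the input would be the contents of /proc/cmdline, for example drm_debug_from_cmdline(open("/proc/cmdline").read()). A result of None means the parameter was not picked up by the bootloader.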
Comment 9 Elizabeth 2017-08-31 18:33:22 UTC
(In reply to yann from comment #5)
> Jean, it looks like render batch is invalid. Could you update your kernel
> since I am noticing that the one you are using is 2.6.32-573.7.1.el6.x86_64
> and I think that there is Red Hat update ? Then if this is still occurring,
> please attached gpu crash dump and kernel log.
Based on Yann's comment and the lack of updates, it is assumed that the Red Hat update fixed the problem, so I am closing this case. If the problem arises again, please file a new bug with HW and SW information and relevant logs. Thank you.

