89147 – [BDW] stuck on blitter ring GPU HANG: ecode 2:0x0000000a, in X

Summary: [BDW] stuck on blitter ring GPU HANG: ecode 2:0x0000000a, in X

Status:	CLOSED WONTFIX

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	unspecified
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium normal
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2015-02-14 15:18 UTC by Harold Naparst
Modified:	2017-02-21 08:05 UTC
CC List:	3 users

See Also:
i915 platform:	BDW
i915 features:	GPU hang

Attachments
/sys/class/drm/card0/error (25 bytes, text/plain) 2015-02-14 16:08 UTC, Harold Naparst

Description Harold Naparst 2015-02-14 15:18:23 UTC

The crash dump is zero bytes.  Please let me know if other system details would be useful. This is an Intel NUC, with a Broadwell CPU and Intel HD 5500 graphics.  The version of Xorg is 1.17.1, the kernel version is 3.19, and the version of Mesa is 10.4.4.

Feb 14 15:31:48 nuc kernel: [ 4432.007731] [drm] stuck on blitter ring
Feb 14 15:31:48 nuc kernel: [ 4432.008682] [drm] GPU HANG: ecode 2:0x0000000a, in X [14298], reason: Ring hung, action: reset
Feb 14 15:31:48 nuc kernel: [ 4432.008686] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Feb 14 15:31:48 nuc kernel: [ 4432.008688] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Feb 14 15:31:48 nuc kernel: [ 4432.008690] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Feb 14 15:31:48 nuc kernel: [ 4432.008691] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Feb 14 15:31:48 nuc kernel: [ 4432.008693] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Feb 14 15:32:00 nuc kernel: [ 4444.013939] [drm] stuck on blitter ring
Feb 14 15:32:00 nuc kernel: [ 4444.014795] [drm] GPU HANG: ecode 2:0x00000008, in X [14298], reason: Ring hung, action: reset
Feb 14 15:32:00 nuc kernel: [ 4444.014864] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!
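
For triage, the fields in the GPU HANG lines above can be pulled out mechanically. The following is a minimal Python sketch and not part of the original report. The function name parse_gpu_hang and the regular expression are illustrative assumptions inferred only from the two log lines shown here; other kernel versions may format the message differently.

```python
import re

# Pattern inferred from the two "GPU HANG" lines in this report (an assumption;
# other kernel versions may print the message differently).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<prefix>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG log line as a dict, or None if absent."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["prefix"] = int(fields["prefix"])    # number before the colon in "ecode"
    fields["ecode"] = int(fields["ecode"], 16)  # hex value after the colon
    fields["pid"] = int(fields["pid"])
    return fields


if __name__ == "__main__":
    sample = (
        "Feb 14 15:31:48 nuc kernel: [ 4432.008682] [drm] GPU HANG: "
        "ecode 2:0x0000000a, in X [14298], reason: Ring hung, action: reset"
    )
    print(parse_gpu_hang(sample))
```

Lines that do not contain a GPU HANG message, such as the "stuck on blitter ring" line, yield None, so the function can be mapped over a whole log.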

Comment 1 Chris Wilson 2015-02-14 16:04:36 UTC

/sys/class/drm/card0/error is never 0 bytes long. Please 'cat /sys/class/drm/card0/error | bzip2 > error.bz2' and attach.
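
The capture step above can also be scripted. This is a minimal Python sketch of the same read-and-compress operation, offered as an illustration rather than an official tool. The function name save_error_state is hypothetical, and the placeholder string it checks for is an assumption about what the driver returns when no hang has been recorded; it is not stated in this report.

```python
import bz2
from pathlib import Path

# Placeholder text the driver is assumed to return when no hang has been
# recorded (an assumption, not taken from this report).
NO_ERROR_MARKER = b"no error state collected"


def save_error_state(src="/sys/class/drm/card0/error", dst="error.bz2"):
    """Compress the contents of src into dst; return True if a dump was saved."""
    # Read the content itself instead of trusting the reported file size.
    data = Path(src).read_bytes()
    if not data.strip() or data.lower().startswith(NO_ERROR_MARKER):
        return False  # nothing useful to attach
    Path(dst).write_bytes(bz2.compress(data))
    return True


if __name__ == "__main__":
    if save_error_state():
        print("wrote error.bz2")
    else:
        print("no error state available")
```

Reading the sysfs file may require root privileges, and the dump is only available after a hang has been reported and before the machine is rebooted.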

Comment 2 Harold Naparst 2015-02-14 16:08:25 UTC

Created attachment 113495
/sys/class/drm/card0/error

Comment 3 Chris Wilson 2015-02-14 17:14:21 UTC

The error state has to be captured after the GPU hang is reported and before rebooting!

Comment 4 Harold Naparst 2015-02-15 07:53:30 UTC

I'll see if I can recover without rebooting.  Sometimes I can, and sometimes not.  The problem may be related to screen blanking, which can be fixed by using xset s.  In any event, whenever the screen turns off or sleeps, X crashes.  Using UXA or other xorg or boot options does not seem to help. Here is the latest one.  In this case, there was no error at /sys/class/drm/card0/error either.

Feb 15 08:41:24 nuc kernel: [47882.539015] ------------[ cut here ]------------
Feb 15 08:41:24 nuc kernel: [47882.539021] WARNING: CPU: 0 PID: 23486 at drivers/gpu/drm/i915/intel_display.c:1256 assert_plane.constprop.87+0x6b/0x80()
Feb 15 08:41:24 nuc kernel: [47882.539022] plane A assertion failure (expected on, current off)
Feb 15 08:41:24 nuc kernel: [47882.539026] Modules linked in: vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_pcsp x86_pkg_temp_thermal
Feb 15 08:41:24 nuc kernel: [47882.539029] CPU: 0 PID: 23486 Comm: X Tainted: G           O   3.19.0-gentoo #4
Feb 15 08:41:24 nuc kernel: [47882.539033] Hardware name:  /NUC5i3MYBE, BIOS MYBDWi30.86A.0017.2014.1127.1854 11/27/201
Feb 15 08:41:24 nuc kernel: [47882.539035]  ffffffff81bffd20 ffff8800cecb36f8 ffffffff8186a9da 0000000000002828
Feb 15 08:41:24 nuc kernel: [47882.539036]  ffff8800cecb3748 ffff8800cecb3738 ffffffff8104b405 ffff88021ec15f88
Feb 15 08:41:24 nuc kernel: [47882.539037]  0000000000000000 ffff880214ad1000 ffff880214abf800 ffff880214abf800
Feb 15 08:41:24 nuc kernel: [47882.539038] Call Trace:
Feb 15 08:41:24 nuc kernel: [47882.539041]  [<ffffffff8186a9da>] dump_stack+0x45/0x57
Feb 15 08:41:24 nuc kernel: [47882.539044]  [<ffffffff8104b405>] warn_slowpath_common+0x85/0xc0
Feb 15 08:41:24 nuc kernel: [47882.539046]  [<ffffffff8104b481>] warn_slowpath_fmt+0x41/0x50
Feb 15 08:41:24 nuc kernel: [47882.539048]  [<ffffffff8145d4eb>] assert_plane.constprop.87+0x6b/0x80
Feb 15 08:41:24 nuc kernel: [47882.539051]  [<ffffffff814648a9>] hsw_disable_ips+0x39/0x180
Feb 15 08:41:24 nuc kernel: [47882.539053]  [<ffffffff81464c53>] intel_crtc_disable_planes+0x43/0x140
Feb 15 08:41:24 nuc kernel: [47882.539055]  [<ffffffff8146598f>] haswell_crtc_disable+0x4f/0x3a0
Feb 15 08:41:24 nuc kernel: [47882.539057]  [<ffffffff814668f9>] __intel_set_mode+0xa69/0xc90
Feb 15 08:41:24 nuc kernel: [47882.539059]  [<ffffffff8146d51b>] intel_crtc_set_config+0xbfb/0xf70
Feb 15 08:41:24 nuc kernel: [47882.539062]  [<ffffffff813efb36>] drm_mode_set_config_internal+0x66/0x100
Feb 15 08:41:24 nuc kernel: [47882.539063]  [<ffffffff813dd308>] restore_fbdev_mode+0xc8/0xf0
Feb 15 08:41:24 nuc kernel: [47882.539065]  [<ffffffff813df314>] drm_fb_helper_restore_fbdev_mode_unlocked+0x24/0x70
Feb 15 08:41:24 nuc kernel: [47882.539067]  [<ffffffff813df37d>] drm_fb_helper_set_par+0x1d/0x40
Feb 15 08:41:24 nuc kernel: [47882.539069]  [<ffffffff8147a245>] intel_fbdev_set_par+0x15/0x60
Feb 15 08:41:24 nuc kernel: [47882.539071]  [<ffffffff813476cf>] fb_set_var+0x19f/0x410
Feb 15 08:41:24 nuc kernel: [47882.539074]  [<ffffffff8106f4b9>] ? check_preempt_curr+0x89/0xa0
Feb 15 08:41:24 nuc kernel: [47882.539075]  [<ffffffff8106f4e8>] ? ttwu_do_wakeup+0x18/0xe0
Feb 15 08:41:24 nuc kernel: [47882.539077]  [<ffffffff81341736>] fbcon_blank+0x226/0x300
Feb 15 08:41:24 nuc kernel: [47882.539079]  [<ffffffff813a39a5>] do_unblank_screen+0xb5/0x1e0
Feb 15 08:41:24 nuc kernel: [47882.539081]  [<ffffffff81399bb8>] complete_change_console+0x58/0xe0
Feb 15 08:41:24 nuc kernel: [47882.539083]  [<ffffffff8139adc0>] vt_ioctl+0x1180/0x1430
Feb 15 08:41:24 nuc kernel: [47882.539087]  [<ffffffff8113657b>] ? handle_mm_fault+0x73b/0xa50
Feb 15 08:41:24 nuc kernel: [47882.539089]  [<ffffffff8138d7c1>] tty_ioctl+0x401/0xc70
Feb 15 08:41:24 nuc kernel: [47882.539092]  [<ffffffff8103f8d5>] ? __do_page_fault+0x1e5/0x570
Feb 15 08:41:24 nuc kernel: [47882.539094]  [<ffffffff81173ee0>] do_vfs_ioctl+0x2e0/0x4e0
Feb 15 08:41:24 nuc kernel: [47882.539096]  [<ffffffff812a60a7>] ? file_has_perm+0x87/0xa0
Feb 15 08:41:24 nuc kernel: [47882.539098]  [<ffffffff81160482>] ? vfs_write+0x1b2/0x1f0
Feb 15 08:41:24 nuc kernel: [47882.539100]  [<ffffffff81174161>] SyS_ioctl+0x81/0xa0
Feb 15 08:41:24 nuc kernel: [47882.539102]  [<ffffffff8103fc6c>] ? do_page_fault+0xc/0x10
Feb 15 08:41:24 nuc kernel: [47882.539103]  [<ffffffff81872e92>] system_call_fastpath+0x12/0x17
Feb 15 08:41:24 nuc kernel: [47882.539104] ---[ end trace 4ccbee195f5d9637 ]---
Feb 15 08:41:24 nuc kernel: [47882.603884] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
Feb 15 08:41:24 nuc kernel: [47882.606555] [drm:intel_dp_complete_link_train] *ERROR* failed to start channel equalization

Comment 5 Chris Wilson 2015-02-15 09:36:41 UTC

(In reply to Harold Naparst from comment #4)
> I'll see if I can recover without rebooting.  Sometimes I can, and sometimes
> not.  The problem may be related to screen blanking, which can be fixed by
> using xset s.  In any event, whenever the screen turns off or sleeps, X
> crashes. 

Not quite; it is just the kernel failing to set up the link to the display. If you try another modeset, it may recover. Anyway, it is a different bug from the GPU hang, so please do file a separate report. For that report, drm.debug=6 is essential for working out why the modesetting failed.
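
The value passed to drm.debug is a bitmask, so 6 selects two message categories at once. The Python sketch below decodes such a value. It is illustrative only: the table lists just the four lowest bits as documented for the drm.debug module parameter (newer kernels define more), and the function name decode_drm_debug is hypothetical.

```python
# Debug categories for the lowest four bits of the drm.debug bitmask.
# Newer kernels define additional bits that are not listed here.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
}


def decode_drm_debug(value):
    """Return the names of the known debug categories enabled by value."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if value & bit]


if __name__ == "__main__":
    print(decode_drm_debug(6))  # → ['DRIVER', 'KMS']
```

With drm.debug=6 the kernel therefore logs driver and modesetting (KMS) debug messages, which is the output needed to investigate a failed modeset.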

Comment 6 Ricardo 2017-02-21 01:01:48 UTC

This bug will be closed since there has been no response for several months. If the problem persists, please open a new bug with updated information.
