85689 – [HSW] stuck on render ring, 3.17

Status:	CLOSED WORKSFORME

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	XOrg git
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	high normal
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2014-10-31 08:43 UTC by Karol Herbst
Modified:	2017-07-24 22:50 UTC
CC List:	2 users

See Also:
i915 platform:
i915 features:

Attachments
/sys/class/drm/card0/error (67.03 KB, text/plain) 2014-10-31 10:32 UTC, Karol Herbst
sys/class/drm/card0/error.tar.xz (67.03 KB, application/x-xz) 2014-10-31 10:34 UTC, Karol Herbst
/sys/class/drm/card0/error (69.89 KB, application/x-xz) 2014-11-07 13:19 UTC, Karol Herbst
/sys/class/drm/card0/error (62.96 KB, application/x-xz) 2014-11-11 16:44 UTC, Karol Herbst
/sys/class/drm/card0/error (70.89 KB, application/x-xz) 2014-11-14 11:18 UTC, Karol Herbst
another one (62.39 KB, application/x-xz) 2014-11-19 14:36 UTC, Karol Herbst
another strange one today (74.00 KB, application/x-xz) 2014-11-20 05:11 UTC, Karol Herbst

Description Karol Herbst 2014-10-31 08:43:28 UTC

dmesg output:

[ 2086.402678] [drm] stuck on render ring
[ 2086.403370] [drm] GPU HANG: ecode 0:0x00200000, in kwin_x11 [2320], reason: Ring hung, action: reset
[ 2086.403371] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 2086.403372] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 2086.403372] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 2086.403372] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 2086.403373] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 2086.403425] [drm:intel_pipe_set_base] *ERROR* pin & fence failed
[ 2105.418015] [drm] stuck on render ring
[ 2105.418751] [drm] GPU HANG: ecode 0:0x00200000, in kwin_x11 [2320], reason: Ring hung, action: reset

The X server was almost completely frozen after this error, but restarting kwin_x11 from a tty helped.
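
For comparing several hangs, the "GPU HANG" lines can be pulled apart with a small script. This is only a sketch: the names HANG_RE and parse_hangs are made up for illustration, and the pattern is derived solely from the lines quoted in this report, so other kernel versions may format the message differently.

```python
import re

# Pattern based only on the "GPU HANG" lines quoted in this report (assumption:
# other kernels may print a different layout).
HANG_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+\[drm\] GPU HANG: "
    r"ecode (?P<prefix>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\S+)"
)


def parse_hangs(dmesg_text):
    """Return one dict per GPU HANG line found in dmesg_text."""
    hangs = []
    for match in HANG_RE.finditer(dmesg_text):
        hangs.append({
            "time": float(match["time"]),   # seconds since boot
            "ecode": match["ecode"],        # hex string after the colon
            "comm": match["comm"],          # process name blamed for the hang
            "pid": int(match["pid"]),
            "reason": match["reason"],
            "action": match["action"],
        })
    return hangs


sample = """\
[ 2086.403370] [drm] GPU HANG: ecode 0:0x00200000, in kwin_x11 [2320], reason: Ring hung, action: reset
[ 2105.418751] [drm] GPU HANG: ecode 0:0x00200000, in kwin_x11 [2320], reason: Ring hung, action: reset
"""

for hang in parse_hangs(sample):
    print(hang["time"], hang["ecode"], hang["comm"], hang["pid"])
```

Feeding it the full dmesg output makes it easy to see whether the same ecode and process keep recurring.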

Software I am using:

linux-3.17.1
xorg-server-1.16.1
xf86-video-intel-2.99.916
libdrm-2.4.58
mesa-git-1a17098

Comment 1 Chris Wilson 2014-10-31 08:48:38 UTC

Please attach /sys/class/drm/card0/error

Comment 2 Karol Herbst 2014-10-31 10:32:31 UTC

Created attachment 108724
/sys/class/drm/card0/error

I thought I had attached it :/ Here it is. I see now why it was missing: the file was just too big, and I guess I didn't get an error because I was creating the bug at the same time.

Comment 3 Karol Herbst 2014-10-31 10:34:14 UTC

Created attachment 108725
sys/class/drm/card0/error.tar.xz

Wrong type chosen in Bugzilla.
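
Since the raw error state was too big to attach, compressing it first avoids the problem. A minimal sketch, assuming the error state is readable at the path named in the dmesg output (this may require root); the function name save_error_state and the output file name are made up for illustration.

```python
import lzma
import shutil


def save_error_state(src="/sys/class/drm/card0/error", dst="i915-error.xz"):
    """Copy the i915 error state to an xz-compressed file and return dst."""
    # Stream the data so the whole dump never has to sit in memory at once.
    with open(src, "rb") as raw, lzma.open(dst, "wb") as packed:
        shutil.copyfileobj(raw, packed)
    return dst
```

Calling save_error_state() right after a hang produces a file that can be attached with the application/x-xz type.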

Comment 4 Mika Kuoppala 2014-11-03 15:54:53 UTC

It seems the batch is just full of 0xFFFFFFFFs.

Did this happen right after resuming from suspend?
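
A rough way to check a dump for this pattern is to count how many dumped dwords are 0xffffffff. This is only a sketch: it assumes the hex dump lines in the error state look like "<offset> :  <value>" with eight hex digits each, it counts over every dumped object rather than the batch alone, and the names DWORD_RE and ffffffff_ratio as well as the sample lines are made up for illustration.

```python
import re

# Assumed dump line layout: "<offset> :  <value>", both as 8 hex digits.
DWORD_RE = re.compile(r"^[0-9a-fA-F]{8} :\s+([0-9a-fA-F]{8})$")


def ffffffff_ratio(dump_lines):
    """Return (number of 0xffffffff dwords, total dwords) over the given lines."""
    total = 0
    ones = 0
    for line in dump_lines:
        match = DWORD_RE.match(line.strip())
        if match is None:
            continue  # not a dword line (header, register dump, ...)
        total += 1
        if match.group(1).lower() == "ffffffff":
            ones += 1
    return ones, total


# Made-up sample lines, not taken from the attached error state.
sample = [
    "some header line",
    "00000000 :  ffffffff",
    "00000004 :  ffffffff",
    "00000008 :  7a000003",
]
print(ffffffff_ratio(sample))  # prints (2, 3)
```

A ratio close to 1 over the batch section would match the observation above.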

Comment 5 Karol Herbst 2014-11-06 15:53:30 UTC

No, I think I played something via bumblebee/primus before that, but I'm not sure.

Comment 6 Karol Herbst 2014-11-07 13:19:25 UTC

Created attachment 109083
/sys/class/drm/card0/error

I got a similar error again:

[ 1492.924345] [drm] stuck on render ring
[ 1492.925226] [drm] GPU HANG: ecode 0:0x85dffff8, in steam [4115], reason: Ring hung, action: reset
[ 1492.925228] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1492.925229] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1492.925229] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1492.925230] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1492.925230] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1498.925206] [drm] stuck on render ring
[ 1498.926521] [drm] GPU HANG: ecode 0:0x85dffff8, in steam [4115], reason: Ring hung, action: reset


This time I just wanted to maximize Steam.

Comment 7 Karol Herbst 2014-11-11 16:44:12 UTC

Created attachment 109288
/sys/class/drm/card0/error

And again: this time I wanted to maximize Chromium. After the hang, Chromium no longer updated its content.

[123027.971860] [drm] stuck on render ring
[123027.972737] [drm] GPU HANG: ecode 0:0x00208366, in kwin_x11 [2376], reason: Ring hung, action: reset

Comment 8 Karol Herbst 2014-11-14 11:18:20 UTC

Created attachment 109455
/sys/class/drm/card0/error

And because this bug annoys me, here is another error state, which came up today.

[25186.039463] [drm] stuck on render ring
[25186.040310] [drm] GPU HANG: ecode 0:0x002883e6, in kwin_x11 [4096], reason: Ring hung, action: reset
[25186.040312] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[25186.040312] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[25186.040313] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[25186.040313] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[25186.040314] [drm] GPU crash dump saved to /sys/class/drm/card0/error

Could this be more of a kwin issue?

Comment 9 Karol Herbst 2014-11-19 14:36:55 UTC

Created attachment 109733
another one

Comment 10 Karol Herbst 2014-11-19 18:33:46 UTC

Some time after the hang I also get warnings like these before I have to reboot:

Nov 19 19:25:31 [kernel] ------------[ cut here ]------------
Nov 19 19:25:31 [kernel] WARNING: CPU: 0 PID: 3848 at drivers/gpu/drm/i915/intel_display.c:3361 intel_crtc_wait_for_pending_flips+0x16f/0x180()
Nov 19 19:25:31 [kernel] Modules linked in: nvidia(PO) ttm bbswitch(O) iwldvm tuxedo_wmi(O) btusb [last unloaded: nouveau]
Nov 19 19:25:31 [kernel] CPU: 0 PID: 3848 Comm: X Tainted: P           O   3.17.3-gentoo #2
Nov 19 19:25:31 [kernel] Hardware name: Notebook                         P15SM                          /P15SM                          , BIOS 1.03.04PM v2 03/12/2014
Nov 19 19:25:31 [kernel]  0000000000000009 ffffffff816dd405 0000000000000000 ffffffff810a3e5d
Nov 19 19:25:31 [kernel]  0000000000000000 ffff88041d1b9000 ffff88041cfa8120 ffff88041cf39000
Nov 19 19:25:31 [kernel]  0000000000000000 ffffffff813e7bcf ffff880000000000 ffff8800cb2557c0
Nov 19 19:25:31 [kernel] Call Trace:
Nov 19 19:25:31 [kernel]  [<ffffffff816dd405>] ? dump_stack+0x49/0x6a
Nov 19 19:25:31 [kernel]  [<ffffffff810a3e5d>] ? warn_slowpath_common+0x6d/0x90
Nov 19 19:25:31 [kernel]  [<ffffffff813e7bcf>] ? intel_crtc_wait_for_pending_flips+0x16f/0x180
Nov 19 19:25:31 [kernel]  [<ffffffff810ce5c0>] ? __wake_up_sync+0x10/0x10
Nov 19 19:25:31 [kernel]  [<ffffffff813e7c80>] ? intel_primary_plane_disable+0xa0/0xd0
Nov 19 19:25:31 [kernel]  [<ffffffff8138b60b>] ? setplane_internal+0x1ab/0x310
Nov 19 19:25:31 [kernel]  [<ffffffff8138e96d>] ? drm_mode_setplane+0x10d/0x1d0
Nov 19 19:25:31 [kernel]  [<ffffffff81381d6f>] ? drm_ioctl+0x1bf/0x590
Nov 19 19:25:31 [kernel]  [<ffffffff8115d51f>] ? new_sync_write+0x6f/0xa0
Nov 19 19:25:31 [kernel]  [<ffffffff8119283c>] ? fsnotify+0x23c/0x300
Nov 19 19:25:31 [kernel]  [<ffffffff8116df37>] ? do_vfs_ioctl+0x2d7/0x4b0
Nov 19 19:25:31 [kernel]  [<ffffffff8115fc28>] ? __sb_end_write+0x28/0x60
Nov 19 19:25:31 [kernel]  [<ffffffff8115dbc3>] ? vfs_write+0x123/0x1c0
Nov 19 19:25:31 [kernel]  [<ffffffff81177363>] ? __fget+0x63/0xa0
Nov 19 19:25:31 [kernel]  [<ffffffff8116e146>] ? SyS_ioctl+0x36/0x80
Nov 19 19:25:31 [kernel]  [<ffffffff816e43d2>] ? system_call_fastpath+0x16/0x1b
Nov 19 19:25:31 [kernel] ---[ end trace 7165c092cda9b3a9 ]---
Nov 19 19:25:31 [kernel] [drm:intel_pipe_set_base] *ERROR* pipe is still busy with an old pageflip
Nov 19 19:26:34 [kernel] ------------[ cut here ]------------
Nov 19 19:26:34 [kernel] WARNING: CPU: 3 PID: 3848 at drivers/gpu/drm/i915/intel_display.c:3361 intel_crtc_wait_for_pending_flips+0x16f/0x180()
Nov 19 19:26:34 [kernel] Modules linked in: nvidia(PO) ttm bbswitch(O) iwldvm tuxedo_wmi(O) btusb [last unloaded: nouveau]
Nov 19 19:26:34 [kernel] CPU: 3 PID: 3848 Comm: X Tainted: P        W  O   3.17.3-gentoo #2
Nov 19 19:26:34 [kernel] Hardware name: Notebook                         P15SM                          /P15SM                          , BIOS 1.03.04PM v2 03/12/2014
Nov 19 19:26:34 [kernel]  0000000000000009 ffffffff816dd405 0000000000000000 ffffffff810a3e5d
Nov 19 19:26:34 [kernel]  0000000000000000 ffff88041d1b9000 ffff88041cfa8120 ffff88041cf39000
Nov 19 19:26:34 [kernel]  ffff88041cf39338 ffffffff813e7bcf 0000000000000000 ffff8800cb2557c0
Nov 19 19:26:34 [kernel] Call Trace:
Nov 19 19:26:34 [kernel]  [<ffffffff816dd405>] ? dump_stack+0x49/0x6a
Nov 19 19:26:34 [kernel]  [<ffffffff810a3e5d>] ? warn_slowpath_common+0x6d/0x90
Nov 19 19:26:34 [kernel]  [<ffffffff813e7bcf>] ? intel_crtc_wait_for_pending_flips+0x16f/0x180
Nov 19 19:26:34 [kernel]  [<ffffffff810ce5c0>] ? __wake_up_sync+0x10/0x10
Nov 19 19:26:34 [kernel]  [<ffffffff813f2d77>] ? intel_crtc_set_config+0x9c7/0xe80
Nov 19 19:26:34 [kernel]  [<ffffffff8138afac>] ? drm_mode_set_config_internal+0x5c/0xe0
Nov 19 19:26:34 [kernel]  [<ffffffff8138eb00>] ? drm_mode_setcrtc+0xd0/0x570
Nov 19 19:26:34 [kernel]  [<ffffffff81381d6f>] ? drm_ioctl+0x1bf/0x590
Nov 19 19:26:34 [kernel]  [<ffffffff8116df37>] ? do_vfs_ioctl+0x2d7/0x4b0
Nov 19 19:26:34 [kernel]  [<ffffffff8115fc28>] ? __sb_end_write+0x28/0x60
Nov 19 19:26:34 [kernel]  [<ffffffff8115dbc3>] ? vfs_write+0x123/0x1c0
Nov 19 19:26:34 [kernel]  [<ffffffff81177363>] ? __fget+0x63/0xa0
Nov 19 19:26:34 [kernel]  [<ffffffff8116e146>] ? SyS_ioctl+0x36/0x80
Nov 19 19:26:34 [kernel]  [<ffffffff816e43d2>] ? system_call_fastpath+0x16/0x1b
Nov 19 19:26:34 [kernel] ---[ end trace 7165c092cda9b3aa ]---
Nov 19 19:26:34 [kernel] [drm:intel_pipe_set_base] *ERROR* pipe is still busy with an old pageflip

Comment 11 Karol Herbst 2014-11-20 05:11:40 UTC

Created attachment 109750
another strange one today

This one was strange: I just wanted to port some wallets to kwallets5, so I pressed "Next", and then this happened:

[20727.430513] [drm] stuck on render ring
[20727.432079] [drm] GPU HANG: ecode 0:0x87d7bffa, in kwin_x11 [1867], reason: Ring hung, action: reset
[20727.432086] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[20727.432087] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[20727.432088] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[20727.432090] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[20727.432091] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[20733.447306] [drm] stuck on render ring
[20733.448224] [drm] GPU HANG: ecode 0:0x85dffffd, in X [1724], reason: Ring hung, action: reset
[20739.448176] [drm] stuck on render ring
[20739.449445] [drm] GPU HANG: ecode 0:0x85dffffd, in X [1724], reason: Ring hung, action: reset
[20739.449530] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!
[20745.444996] [drm] stuck on render ring
[20745.445909] [drm] GPU HANG: ecode 0:0x87d7bffa, in kwin_x11 [1867], reason: Ring hung, action: reset
[20751.449836] [drm] stuck on render ring
[20751.450797] [drm] GPU HANG: ecode 0:0x87d7bffa, in kwin_x11 [1867], reason: Ring hung, action: reset

Comment 12 Chris Wilson 2014-12-21 21:10:54 UTC

I have no reason to believe this is anything but a mesa bug, but for sanity's sake could you please first check with drm-intel-nightly?

Comment 13 Karol Herbst 2015-01-07 17:30:42 UTC

The issue no longer occurs for me, but I am now on linux-3.18.1 and a recent mesa-master.

Comment 14 Chris Wilson 2015-01-07 17:43:14 UTC

Ok, let's hope it is fixed...
