Created attachment 110362
I was running Chromium, which was displaying a YouTube video (https://www.youtube.com/watch?v=OnuH-yO0BJE), when I got the GNOME frowny-face; my session failed and had to be restarted.
In dmesg, I see:
[662508.816047] [drm] stuck on render ring
[662508.817097] [drm] GPU HANG: ecode 0:0x9f47f9fd, in chromium , reason: Ring hung, action: reset
[662508.817100] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[662508.817103] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[662508.817106] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[662508.817108] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[662508.817111] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[662508.817294] [drm:i915_reset] *ERROR* Failed to reset chip: -19
[662510.644230] chromium: segfault at 1f8 ip 00007fa0e5c40def sp 00007fffc4519960 error 4 in i965_dri.so[7fa0e58eb000+51e000]
[662510.793908] chromium: segfault at 1f8 ip 00007fcda8e43def sp 00007fff39fdc000 error 4 in i965_dri.so[7fcda8aee000+51e000]
[662519.552681] gnome-shell: segfault at 1f8 ip 00007f482100fdef sp 00007fff1b4bad90 error 4 in i965_dri.so[7f4820cba000+51e000]
I'm running Linux 3.17 from Debian experimental:
Linux frigg 3.17-1-amd64 #1 SMP Debian 3.17-1~exp1 (2014-10-14) x86_64 GNU/Linux
I'm attaching the full dump of /sys/class/drm/card0/error, and I'll follow up with the full dmesg.
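For anyone triaging similar reports, the fields of the "GPU HANG" line can be pulled out of a saved dmesg with a short script. This is only a minimal sketch: the regular expression is written against the exact line format shown in the excerpt above, and the names `HANG_RE` and `find_gpu_hangs` are made up for illustration. Other kernel versions may format the line differently.

```python
import re

# Pattern written against the "GPU HANG" line in the dmesg excerpt above.
# Assumption: other kernel versions may word this line differently.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>.*?)\s*, reason: (?P<reason>.*?), action: (?P<action>\S+)"
)


def find_gpu_hangs(dmesg_text):
    """Return one dict (ts, ecode, proc, reason, action) per GPU HANG line."""
    return [m.groupdict() for m in HANG_RE.finditer(dmesg_text)]


# The line below is copied from the report.
sample = (
    "[662508.817097] [drm] GPU HANG: ecode 0:0x9f47f9fd, in chromium , "
    "reason: Ring hung, action: reset"
)
hangs = find_gpu_hangs(sample)
for hang in hangs:
    print(hang["ts"], hang["ecode"], hang["proc"], hang["reason"], hang["action"])
```

The ecode and the process name are usually the first things to compare when deciding whether two hang reports might be duplicates, but the error state dump is still needed for real analysis.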
Created attachment 110363
dmesg from crashing machine
I suspect this may be another duplicate of bug 80568, which was fixed (worked around) by this commit:
Author: Kenneth Graunke <firstname.lastname@example.org>
Date: Sat Jan 17 23:21:15 2015 -0800

i965: Work around mysterious Gen4 GPU hangs with minimal state changes.

Gen4 hardware appears to GPU hang frequently when using Chromium, and
also when running 'glmark2 -b ideas'. Most of the error states contain
3DPRIMITIVE commands in quick succession, with very few state packets
between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.

I trimmed an apitrace of the glmark2 hang down to two draw calls with a
glUniformMatrix4fv call between the two. Either draw by itself works
fine, but together, they hang the GPU. Removing the glUniform call
makes the hangs disappear. In the hardware state, this translates to
removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.

Flushing before emitting CONSTANT_BUFFER packets also appears to make
the hangs disappear. I observed a slowdown in glxgears by doing it all
the time, so I've chosen to only do it when BRW_NEW_BATCH and
BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
already flushed the whole pipeline).

I'd much rather understand the problem, but at this point, I don't see
how we'd ever be able to track it down further. We have no real tools,
and the hardware people moved on years ago. I've analyzed 20+ error
states and read every scrap of documentation I could find.

Signed-off-by: Kenneth Graunke <email@example.com>
Acked-by: Matt Turner <firstname.lastname@example.org>
Cc: "10.4 10.3" <email@example.com>
The fix is in git, and backports are in Mesa 10.4.x for x > 3. Please try upgrading to a release newer than 10.4.3. If the upgrade resolves the problem, please mark this as a duplicate of bug 80568.
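The condition described in the commit message (flush before CONSTANT_BUFFER only when neither BRW_NEW_BATCH nor BRW_NEW_PSP is set) can be modelled roughly as below. This is not the driver code, which is written in C. The bit values and the function name `needs_flush_before_constant_buffer` are hypothetical and exist only to illustrate the logic.

```python
# Hypothetical bit values, for illustration only. The real flag values are
# defined in the Mesa i965 driver and are not reproduced here.
BRW_NEW_BATCH = 1 << 0
BRW_NEW_PSP = 1 << 1
OTHER_DIRTY_STATE = 1 << 2  # stand-in for any unrelated dirty bit


def needs_flush_before_constant_buffer(dirty_flags):
    """Model of the workaround condition from the commit message.

    Flush only when BOTH BRW_NEW_BATCH and BRW_NEW_PSP are unset, i.e. when
    nothing has already flushed the pipeline for us.
    """
    return (dirty_flags & (BRW_NEW_BATCH | BRW_NEW_PSP)) == 0


print(needs_flush_before_constant_buffer(OTHER_DIRTY_STATE))  # extra flush needed
print(needs_flush_before_constant_buffer(BRW_NEW_BATCH))      # no extra flush
```

Restricting the flush this way matches the stated motivation in the commit: flushing unconditionally also avoided the hangs but slowed down glxgears.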
No reply. Marking as duplicate.
*** This bug has been marked as a duplicate of bug 80568 ***
I'm only running Mesa 10.4.2 (Debian sid has nothing newer), so I wasn't able to test this, sorry. However, trying to view the same video again on the same hardware with Chromium doesn't cause a GNOME crash any more, so I don't know how to replicate it anyway. Thanks for the followup.
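A quick way to check an upstream Mesa version string against the "newer than 10.4.3" criterion given earlier in this thread is sketched below. The helper names are made up for illustration, and the sketch assumes a plain dotted version such as "10.4.2". It does not handle distribution suffixes, and it cannot tell whether a distribution has backported the patch into an older version.

```python
def parse_version(text):
    """Turn a plain dotted version like '10.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_newer_than_10_4_3(version):
    """True if the version satisfies the '>10.4.3' criterion from this thread."""
    return parse_version(version) > (10, 4, 3)


for candidate in ("10.4.2", "10.4.3", "10.4.4", "10.5.0"):
    print(candidate, is_newer_than_10_4_3(candidate))
```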
(In reply to Daniel Kahn Gillmor from comment #4)
> I'm only running Mesa 10.4.2 (Debian sid has nothing newer), so I wasn't
> able to test this, sorry. However, trying to view the same video again on
> the same hardware with Chromium doesn't cause a GNOME crash any more, so I
> don't know how to replicate it anyway. Thanks for the followup.
Thanks. You might file a bug with Debian and suggest that they backport the patch to their 10.4.2 package.