Bug 91657 - [drm] GPU HANG: ecode 6:1:0xffeffffe, in multiqueue1:src [12807], reason: Ring hung, action: reset
Status: RESOLVED WONTFIX
Alias: None
Product: libva
Classification: Unclassified
Component: intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: ykzhao
QA Contact: Sean V Kelley
Reported: 2015-08-16 11:00 UTC by Sander Eikelenboom
Modified: 2016-01-06 02:29 UTC

Attachments
/sys/class/drm/card0/error output (991.99 KB, text/plain)
2015-08-16 11:00 UTC, Sander Eikelenboom
gpu-error-dump2.gz (663.09 KB, application/x-gzip)
2015-10-31 16:07 UTC, Sander Eikelenboom
lspci.txt (1.40 KB, text/plain)
2015-11-26 23:10 UTC, Sander Eikelenboom
drm-error-1.6.1.txt.gz (664.41 KB, application/x-gzip)
2015-11-26 23:10 UTC, Sander Eikelenboom

Description Sander Eikelenboom 2015-08-16 11:00:11 UTC
Created attachment 117718 [details]
/sys/class/drm/card0/error output

On my ThinkPad X220 laptop running Debian Jessie with Linux 4.2-rc6, I encountered this bug.

It happens when playing a video in Firefox. Since gstreamer-vaapi is installed, Firefox probably uses it, so the hang could be related to it.
Afterwards, all videos play sound only, with a solid green picture.

[ 3128.306151] [drm] stuck on bsd ring
[ 3128.311437] [drm] GPU HANG: ecode 6:1:0xffeffffe, in multiqueue1:src [12807], reason: Ring hung, action: reset
[ 3128.320990] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 3128.329283] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 3128.337299] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 3128.345899] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 3128.353956] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 3128.362157] drm/i915: Resetting chip after gpu hang
[ 3134.331182] [drm] stuck on bsd ring
[ 3134.336097] [drm] GPU HANG: ecode 6:1:0xffeffffe, in multiqueue1:src [12807], reason: Ring hung, action: reset
[ 3134.345288] [drm:i915_set_reset_status] *ERROR* gpu hanging too fast, banning!
[ 3134.353572] drm/i915: Resetting chip after gpu hang


The output of /sys/class/drm/card0/error is attached.

--
Sander
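
The "GPU HANG" line in the log above carries the error code, the process name, the PID, the reason and the action taken. A minimal sketch for pulling those fields out of such a line follows. The pattern and the function name are assumptions based only on the lines quoted in this report, not on any documented kernel log format, and it expects each message on a single line, as dmesg prints it.

```python
import re

# Pattern modelled on the hang lines quoted in this report (an assumption,
# not a documented format).
HANG_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+\[drm\] GPU HANG: "
    r"ecode (?P<ecode>\S+), in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_hang_line(line):
    """Return the fields of one 'GPU HANG' line as a dict, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["time"] = float(fields["time"])  # seconds since boot
    fields["pid"] = int(fields["pid"])
    return fields


sample = (
    "[ 3128.311437] [drm] GPU HANG: ecode 6:1:0xffeffffe, "
    "in multiqueue1:src [12807], reason: Ring hung, action: reset"
)
print(parse_hang_line(sample))
```

Lines that are not hang summaries, such as "stuck on bsd ring", simply return None, so the function can be mapped over a whole dmesg dump.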
Comment 1 haihao 2015-10-30 04:51:36 UTC
Sorry for the slow response. Could you provide the link you accessed in Firefox?
Comment 2 Sander Eikelenboom 2015-10-31 16:07:43 UTC
Created attachment 119315 [details]
gpu-error-dump2.gz

On 2015-10-30 05:51, bugzilla-daemon@freedesktop.org wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=91657
> 
> --- Comment #1 from haihao <haihao.xiang@intel.com> ---
> Sorry for slow response, could you provide the link you accessed in 
> firefox ?

Hi,

It seems to happen with just about every video, but I just tested again
with, for example, the video at:
https://tweakers.net/video/11055/lara-reist-de-wereld-af-in-launchtrailer-rise-of-the-tomb-raider.html

It seems to happen (only) when you skip back and forth through the video
while it is playing. After the crash it doesn't recover, and you only
get a green screen (with sound).

The crash looked slightly different this time, so here is the log:
[  786.874201] [drm] stuck on render ring
[  786.879854] [drm] GPU HANG: ecode 6:0:0x87e8effd, in MediaPl~back #3 [5143], reason: Ring hung, action: reset
[  786.890811] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  786.901184] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  786.911135] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  786.915040] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  786.917068] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  786.921628] drm/i915: Resetting chip after gpu hang
[  792.872050] [drm] stuck on render ring
[  792.877512] [drm] GPU HANG: ecode 6:0:0x87e8effd, in MediaPl~back #5 [5354], reason: Ring hung, action: reset
[  792.888571] [drm:i915_set_reset_status] *ERROR* gpu hanging too fast, banning!
[  792.898954] drm/i915: Resetting chip after gpu hang


The GPU crash dump is also attached.

The kernel has since been updated to 4.3-rc7.

The VA-API package versions installed on Debian Jessie are:

ii  gstreamer1.0-vaapi:amd64       0.5.9-2  amd64  VA-API plugins for GStreamer
ii  gstreamer1.0-vaapi-doc         0.5.9-2  all    GStreamer VA-API documentation and manuals
ii  libgstreamer-vaapi1.0-0:amd64  0.5.9-2  amd64  GStreamer libraries from the "vaapi" set
ii  libgstreamer-vaapi1.0-dev      0.5.9-2  amd64  GStreamer development files for libraries from the "vaapi" set
ii  i965-va-driver:amd64           1.4.1-2  amd64  VAAPI driver for Intel G45 & HD Graphics family
ii  libva-dev:amd64                1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- development files
ii  libva-drm1:amd64               1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- DRM runtime
ii  libva-egl1:amd64               1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- EGL runtime
ii  libva-glx1:amd64               1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- GLX runtime
ii  libva-tpi1:amd64               1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- TPI runtime
ii  libva-wayland1:amd64           1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- Wayland runtime
ii  libva-x11-1:amd64              1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- X11 runtime
ii  libva1:amd64                   1.4.1-1  amd64  Video Acceleration (VA) API for Linux -- runtime
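
In the log from this comment the second hang is reported roughly six seconds after the first, and the kernel then prints "gpu hanging too fast, banning!". A minimal sketch for measuring that gap from a dmesg excerpt follows. The function name is hypothetical, the log is assumed to have one message per line, and nothing here reflects the threshold the kernel itself applies.

```python
import re

# Matches the timestamp of each hang summary line (format assumed from this report).
TS_HANG_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\[drm\] GPU HANG:")


def hang_intervals(log_text):
    """Return the seconds elapsed between consecutive 'GPU HANG' lines."""
    times = [float(m.group(1)) for m in TS_HANG_RE.finditer(log_text)]
    return [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]


log = """\
[  786.874201] [drm] stuck on render ring
[  786.879854] [drm] GPU HANG: ecode 6:0:0x87e8effd, in MediaPl~back #3 [5143], reason: Ring hung, action: reset
[  792.872050] [drm] stuck on render ring
[  792.877512] [drm] GPU HANG: ecode 6:0:0x87e8effd, in MediaPl~back #5 [5354], reason: Ring hung, action: reset
"""
print(hang_intervals(log))
```

A list of short intervals like this one is a quick way to tell a single hang apart from a repeating one when comparing runs with different kernel options.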
Comment 3 ykzhao 2015-11-26 04:23:36 UTC
Hi, 

    Sorry for the late response.
    Can the issue still be reproduced with the latest intel-driver?

    If it can, could you please attach the following information?
    >cat /sys/class/drm/card0/error
    >lspci -vxxx -s 0:02.0

    
    Also, if the issue can still be reproduced, could you try disabling PPGTT and see whether that helps? (PPGTT can be disabled by adding the option "i915.enable_ppgtt=0".)

Thanks
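
The error state requested above is what the kernel log calls the "GPU crash dump", and the later dumps in this report were attached gzip-compressed. A minimal sketch of saving it in that form follows. The function name and the default output file name are hypothetical, and reading the sysfs file is assumed to require root on most systems.

```python
import gzip
import shutil


def save_error_state(src="/sys/class/drm/card0/error", dst="gpu-error-dump.gz"):
    """Copy the i915 error state to a gzip-compressed file and return its path."""
    # Stream the data in chunks so a large dump does not have to fit in memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Calling save_error_state() with no arguments right after a hang would write gpu-error-dump.gz to the current directory; both paths can be overridden.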
Comment 4 Sander Eikelenboom 2015-11-26 23:10:55 UTC
Created attachment 120155 [details]
lspci.txt

On 2015-11-26 05:23, bugzilla-daemon@freedesktop.org wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=91657
> 
> --- Comment #3 from ykzhao <yakui.zhao@intel.com> ---
> Hi,
> 
>     Sorry for the late response.
>     Is the issue still reproduced by using the latest intel-driver?

I have upgraded to 1.6.1-1, which is the latest package version for Debian,
but the issue is still present.

ii  libva-dev:amd64       1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- development files
ii  libva-drm1:amd64      1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- DRM runtime
ii  libva-egl1:amd64      1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- EGL runtime
ii  libva-glx1:amd64      1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- GLX runtime
ii  libva-tpi1:amd64      1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- TPI runtime
ii  libva-wayland1:amd64  1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- Wayland runtime
ii  libva-x11-1:amd64     1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- X11 runtime
ii  libva1:amd64          1.6.1-1  amd64  Video Acceleration (VA) API for Linux -- runtime
ii  i965-va-driver:amd64  1.6.1-1  amd64  VAAPI driver for Intel G45 & HD Graphics family

>     If the issue is reproduced, will you please attach the following 
> info?
>     >cat /sys/class/drm/card0/error
>     >lspci -vxxx -s 0:02.0

Both are attached.

>     If the issue is reproduced, can you try to disable the PPGTT and see
> whether it is helpful? (The PPGTT can be disabled by adding the option of
> "i915.enable_ppgtt=0")
> 
> Thanks

I tried with "i915.enable_ppgtt=0 i915.enable_rc6=0" and haven't been
able to crash it since, so either the problem is gone or it is at least
a lot harder to trigger.

--
Sander
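
When testing with options such as the two above, it helps to confirm which i915 options the kernel was actually booted with. A minimal sketch follows that extracts them from a kernel command line string. The function name is hypothetical, and in the example string only the two i915 options come from this report; the rest is made up for illustration.

```python
def i915_options(cmdline):
    """Return the i915.* options found in a kernel command line as a dict."""
    options = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            key, value = token[len("i915."):].split("=", 1)
            options[key] = value
    return options


# Hypothetical command line; only the i915 options are taken from this report.
example = "root=/dev/sda1 ro quiet i915.enable_ppgtt=0 i915.enable_rc6=0"
print(i915_options(example))
```

On a running system the string would typically come from reading /proc/cmdline.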
Comment 5 Sander Eikelenboom 2015-11-26 23:10:56 UTC
Created attachment 120156 [details]
drm-error-1.6.1.txt.gz
Comment 6 ykzhao 2015-12-11 00:48:21 UTC
Sorry for the late response.

Thanks for verifying that it works for you when the option "i915.enable_ppgtt=0" is added.

I suggest that you add the option "i915.enable_ppgtt=0" to work around the issue on your machine. As your machine is based on the Sandybridge platform, which is somewhat old, it is difficult to find the root cause.

Thanks
Comment 7 ykzhao 2016-01-06 02:29:04 UTC
As the issue is gone after PPGTT is disabled and the machine is quite old, it is not worth spending a lot of effort on fixing the issue.

We suggest disabling PPGTT on this machine to work around the issue.

So this bug will be closed and marked as "WONTFIX".

Thanks