Bug 104764 - [SKL] GPU HANG: ecode 9:0:0x85ddfffb, in vlc [2828], reason: Hang on rcs0, action: reset
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-24 09:58 UTC by Björn Stenberg
Modified: 2019-09-25 19:07 UTC

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
GPU crash dump (26.14 KB, text/plain)
2018-01-24 09:58 UTC, Björn Stenberg
Test clip that reproduces the GPU hang for me (1.58 MB, video/quicktime)
2018-01-26 11:08 UTC, Björn Stenberg

Description Björn Stenberg 2018-01-24 09:58:35 UTC
Created attachment 136934
GPU crash dump

I'm just following instructions here. :)

Jan 24 09:39:47 uno kernel: [  145.846144] [drm] GPU HANG: ecode 9:0:0x85ddfffb, in vlc [2828], reason: Hang on rcs0, action: reset
Jan 24 09:39:47 uno kernel: [  145.846147] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 24 09:39:47 uno kernel: [  145.846149] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 24 09:39:47 uno kernel: [  145.846151] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 24 09:39:47 uno kernel: [  145.846152] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jan 24 09:39:47 uno kernel: [  145.846155] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jan 24 09:39:47 uno kernel: [  145.846167] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 24 09:39:55 uno kernel: [  153.789546] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 24 09:40:03 uno kernel: [  161.822289] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 24 09:40:17 uno kernel: [  175.806207] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 24 09:40:25 uno kernel: [  183.806330] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 24 09:40:33 uno kernel: [  191.810441] i915 0000:00:02.0: Resetting rcs0 after gpu hang
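
For anyone triaging logs like the one above, here is a minimal sketch of pulling the key fields out of the "GPU HANG" line and counting the follow-up engine resets. The regular expression is inferred only from the line format shown in this report, and the function names `parse_gpu_hang` and `count_resets` are made up for illustration; other kernel versions may word the message differently.

```python
import re

# Pattern inferred from the single "GPU HANG" line in this report (an assumption,
# not an official format description).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line is not a hang report."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


def count_resets(log_text, engine="rcs0"):
    """Count 'Resetting <engine> after gpu hang' lines in a block of log text."""
    needle = f"Resetting {engine} after gpu hang"
    return sum(needle in line for line in log_text.splitlines())
```

Feeding the first log line to `parse_gpu_hang` should give the ecode string, the process name (vlc), its PID, the reason, and the action as separate fields.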

I'm running these debian packages from "testing":
 linux-image-4.14.0-3-amd64 version 4.14.13-1
 xorg version 1:7.7+19
 xserver-xorg-video-intel version 2:2.99.917+git20171
 libdrm-intel1:amd64 version 2:2.99.917+git20171

Any other relevant packages?

The hang was reliably reproduced simply by starting to play a video in vlc (version 3.0.0~rc6-1+b1). The same video played without problems in mplayer (2:1.3.0-7+b3).

I tried disabling rc6 and rebooting, but the hang remains.
Comment 1 Elizabeth 2018-01-24 18:23:24 UTC
Could you share your mesa version?
Comment 2 Björn Stenberg 2018-01-25 08:28:27 UTC
$ glxinfo | grep Mesa
client glx vendor string: Mesa Project and SGI
    Device: Mesa DRI Intel(R) HD Graphics 520 (Skylake GT2)  (0x1916)
OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 520 (Skylake GT2) 
OpenGL core profile version string: 4.5 (Core Profile) Mesa 17.2.5
OpenGL version string: 3.0 Mesa 17.2.5
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 17.2.5
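
If the version needs to be collected from several machines, a small sketch like the following could extract it from saved `glxinfo` output. The function name `mesa_versions` is hypothetical, and the pattern only assumes that version strings look like "Mesa 17.2.5", as in the output above.

```python
import re


def mesa_versions(glxinfo_text):
    """Return the distinct Mesa version numbers found in glxinfo output, sorted."""
    # Matches "Mesa <major>.<minor>" with an optional ".<patch>"; lines such as
    # "Mesa Project and SGI" or "Mesa DRI Intel(R)" contain no digits and are skipped.
    return sorted(set(re.findall(r"Mesa (\d+\.\d+(?:\.\d+)?)", glxinfo_text)))
```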
Comment 3 Mark Janes 2018-01-25 16:54:18 UTC
If you have any pointers on how we could reproduce this, it would help.  I tried resizing vlc and flipping back and forth to fullscreen, but could not reproduce the hang.

For example, if there is a specific video that causes the hang, what is the encoding?
Comment 4 Björn Stenberg 2018-01-26 11:08:52 UTC
Created attachment 136972
Test clip that reproduces the GPU hang for me

The original crashing file is 3.2GB but I managed to create a smaller clip that also reproduces it. This file was created by running "ffmpeg -i MINI0001.MOV -t 2 -acodec copy -vcodec copy test2s.mov" and triggers the same gpu hang for me.
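
As a rough sketch, the same trimming step can be scripted so other large files can be cut down the same way. It only rebuilds the ffmpeg command quoted above (`-t` limits the duration, `-acodec copy` and `-vcodec copy` copy the streams without re-encoding); the helper name `clip_command` is made up for illustration.

```python
import shlex


def clip_command(source, dest, seconds):
    """Build an ffmpeg argument list that keeps the first `seconds` seconds
    of `source` and copies the audio and video streams without re-encoding."""
    return [
        "ffmpeg", "-i", source,
        "-t", str(seconds),
        "-acodec", "copy",
        "-vcodec", "copy",
        dest,
    ]


cmd = clip_command("MINI0001.MOV", "test2s.mov", 2)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
# To actually run it (requires ffmpeg to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```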

I start vlc from a terminal: vlc test2s.mov. This brings up the vlc gui but no frame of the video is ever displayed. After this I need to ctrl-alt-1 out to a console and kill -9 vlc to regain a working desktop.
Comment 5 Björn Stenberg 2018-01-26 11:36:34 UTC
After some more structured testing I found more results. My clip is not special.

These files trigger the problem:
http://techslides.com/demos/samples/sample.mov
http://techslides.com/demos/samples/sample.mp4
http://techslides.com/demos/samples/sample.mkv

These files DO NOT trigger the problem:
http://techslides.com/demos/samples/sample.avi
http://techslides.com/demos/samples/sample.flv
http://techslides.com/demos/samples/sample.mov
http://techslides.com/demos/samples/sample.mpg
http://techslides.com/demos/samples/sample.swf
http://techslides.com/demos/samples/sample.webm
http://techslides.com/demos/samples/sample.wmv

I also noted that for the files that trigger the problem, vlc prints this at the start:
libva info: VA-API version 1.0.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0

When I play the working files, these lines are not shown.
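
To check this across many files, a sketch along these lines could scan captured player output for the libva lines shown above. It assumes the output has already been saved as text; the function name `vaapi_driver` is hypothetical, and the matched strings are taken verbatim from this report.

```python
import re


def vaapi_driver(player_output):
    """Return the VA-API driver path if libva reported opening a driver
    successfully in the captured output, otherwise None."""
    opened = "libva info: va_openDriver() returns 0" in player_output
    match = re.search(r"libva info: Trying to open (\S+)", player_output)
    if opened and match:
        return match.group(1)
    return None
```

A file whose output yields a driver path was handed to the VA-API driver at startup; a file that yields `None` was not.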

I'm guessing these version numbers are relevant:
 i965-va-driver version 2.0.0+dfsg1-1
 libva version 2.0.0-2
Comment 6 Björn Stenberg 2018-01-26 12:13:51 UTC
Version update: I just got Mesa 17.3.3 from apt dist-upgrade. The hang still occurs.
Comment 7 Mark Janes 2018-01-26 18:45:30 UTC
I just ran dist-upgrade on my KBL debian testing system to get the same versions.  I couldn't reproduce the hang with these files, unfortunately.

Can you specify other details about your system?

 - desktop environment
 - kernel version
 - is intel-microcode installed?

(In reply to Björn Stenberg from comment #5)
> These files trigger the problem:
> http://techslides.com/demos/samples/sample.mov

This file is in both lists ^^^^

> http://techslides.com/demos/samples/sample.mp4
> http://techslides.com/demos/samples/sample.mkv
> 
> These files DO NOT trigger the problem:
> http://techslides.com/demos/samples/sample.avi
> http://techslides.com/demos/samples/sample.flv
> http://techslides.com/demos/samples/sample.mov
> http://techslides.com/demos/samples/sample.mpg
> http://techslides.com/demos/samples/sample.swf
> http://techslides.com/demos/samples/sample.webm
> http://techslides.com/demos/samples/sample.wmv
> 
> I also noted that for the files that trigger the problem, vlc prints this at
> the start:
> libva info: VA-API version 1.0.0
> libva info: va_getDriverName() returns 0
> libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
> libva info: Found init function __vaDriverInit_1_0
> libva info: va_openDriver() returns 0
> 
> When I play the working files, these lines are not shown.
> 
> I'm guessing these version numbers are relevant:
>  i965-va-driver version 2.0.0+dfsg1-1
>  libva version 2.0.0-2

I get the same output that you describe from the 2 lists, but the videos render properly.
Comment 8 Björn Stenberg 2018-01-31 14:44:37 UTC
Desktop: Xfce4 4.12.4
Kernel version: 4.14.13
intel-microcode 3.20180108.1+really20171117.1 (debian package version)

The .mov file belongs in the top list only. It triggers the problem.
Comment 9 Björn Stenberg 2018-01-31 14:57:37 UTC
Btw, this is a Lenovo Thinkpad T460s laptop with an i7-6600U. No external GPU.

I'm not sure how much system info is revealed by the gpu crash dump.
Comment 10 Andriy Khulap 2018-02-01 13:34:07 UTC
I'm unable to reproduce this issue on:
- Intel(R) Core(TM) i5-6440HQ CPU @ 2.60GHz
- Intel(R) HD Graphics 530 (Skylake GT2)  (0x191b)

- Ubuntu 16.04 LTS (drm-tip kernel 4.15.0, a2fbc8000254)
- Ubuntu and xfce-4.12 desktops.
- git mesa-17.2.5
- git mesa master

1) Ubuntu "stock" and ppa
- VLC media player 2.2.2 Weatherwax (revision 2.2.2-0-g6259d80)
- VLC media player 4.0.0-dev Otto Chriek (revision 4.0.0~rc1~~git20180130+r73866+123~ubuntu16.04.1)
- libva info: VA-API version 0.39.2
In all combinations vlc played without issues.

2) Upgraded to git versions
- VLC media player 3.0.0-git Vetinari (revision 3.0.0-git-0-g8d432b0)
- libva info: VA-API version 1.0.0
  (VA-API version: 1.0 (libva 2.0.1.pre1) Driver version: Intel i965 driver for Intel(R) Skylake - 2.0.1.pre1 (2.0.0-119-gf309448))
Again in all combinations vlc played without issues.

Will try debian buster then.
Comment 11 Andriy Khulap 2018-02-02 12:30:08 UTC
Installed Debian Buster Alpha2.

Was unable to reproduce the issue on initial config (kernel 4.13.0, mesa-17.2.5, vlc-2.2.2).

Upgraded to almost identical environment and was unable to reproduce again.

$ uname -a
Linux hkr1-ldl-f49728 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64 GNU/Linux

$ glxinfo | grep Mesa
client glx vendor string: Mesa Project and SGI
    Device: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2)  (0x191b)
OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2) 
OpenGL core profile version string: 4.5 (Core Profile) Mesa 17.3.3
OpenGL version string: 3.0 Mesa 17.3.3
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 17.3.3

$ vlc ~/Videos/test2s.mov 
VLC media player 3.0.0-rc7 Vetinari (revision 3.0.0-rc7-0-gd427a9f90f)
[00005648f588fa30] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
libva info: VA-API version 1.0.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0
[00007fcf18c18ea0] avcodec decoder: Using Intel i965 driver for Intel(R) Skylake - 2.0.0 for hardware decoding

Other software versions (according to synaptic):
xfce4           4.12.4
libdrm-intel1   2.4.89-1
i965-va-driver  2.0.0+dfsg1-1
libva2          2.0.0-2
intel-microcode 3.20180108.1+really20171117.1
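
One way to make the comparison explicit is to line up the two environments and list only what differs. The sketch below is illustrative: the dictionary keys are my own labels, the values are copied from the comments in this thread, and anything not stated for both systems is left out.

```python
# Values as reported in this thread; key names are illustrative labels.
reporter = {
    "kernel": "4.14.13",
    "mesa": "17.3.3",
    "xfce4": "4.12.4",
    "vlc": "3.0.0-rc7",
    "i965-va-driver": "2.0.0+dfsg1-1",
    "intel-microcode": "3.20180108.1+really20171117.1",
    "gpu": "HD Graphics 520 (0x1916)",
}
reproducer = {
    "kernel": "4.14.13",
    "mesa": "17.3.3",
    "xfce4": "4.12.4",
    "vlc": "3.0.0-rc7",
    "i965-va-driver": "2.0.0+dfsg1-1",
    "intel-microcode": "3.20180108.1+really20171117.1",
    "gpu": "HD Graphics 530 (0x191b)",
}


def differing_keys(a, b):
    """Map each key whose value differs (or is missing on one side) to both values."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


print(differing_keys(reporter, reproducer))
```

With the values above, the only entry left is the GPU model, which narrows the comparison but does not by itself identify a cause.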
Comment 12 Björn Stenberg 2018-02-06 11:03:18 UTC
I don't know how to help more. I completely understand this can't easily be debugged without reproduction, but I am not familiar enough with these subsystems to know what details to investigate and share. If you can think of anything, please tell me.

One minor timing detail is that before vlc hangs, it outputs these lines:

VLC media player 3.0.0-rc7 Vetinari (revision 3.0.0-rc7-0-gd427a9f90f)
[000055d66dd46a30] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
libva info: VA-API version 1.0.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0

Then all graphics except the mouse pointer freeze. When I run "killall -9 vlc", nothing happens for a few seconds at first; then, as the graphics resume and the vlc window vanishes, one more line appears on the terminal:

[00007f154cc18290] avcodec decoder: Using Intel i965 driver for Intel(R) Skylake - 2.0.0 for hardware decoding

I have no idea if this information is useful or not... :-)
Comment 13 Elizabeth 2018-03-06 19:53:23 UTC
You could try updating your drivers and trying the mesa 17.5.6 release. As there isn't a specific root cause for this issue, I'm not sure whether it will help.
Comment 14 Elizabeth 2018-03-14 23:28:00 UTC
Typo, it was 17.3.6
Comment 15 GitLab Migration User 2019-09-25 19:07:50 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1679.

