Bug 78356 - [3.14.2] [drm] Stuck on BSD ring
Status: RESOLVED WONTFIX
Alias: None
Product: libva
Classification: Unclassified
Component: intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: ykzhao
QA Contact: Sean V Kelley
 
Reported: 2014-05-06 20:20 UTC by Nicolas Hillegeer
Modified: 2016-12-07 03:02 UTC



Attachments
dmesg log of the crash (112.61 KB, text/plain)
2014-05-06 20:20 UTC, Nicolas Hillegeer
/sys/class/drm/card0/error output, gzipped (1000.45 KB, application/x-gzip)
2014-05-06 20:31 UTC, Nicolas Hillegeer
/sys/class/drm/card0/error output, newer kernel, libdrm, libva and intel-driver, gzipped (1002.17 KB, text/plain)
2014-05-18 22:56 UTC, Nicolas Hillegeer

Description Nicolas Hillegeer 2014-05-06 20:20:24 UTC
Created attachment 98583
dmesg log of the crash

On the latest kernel I just experienced a rather nasty hang which left the machine unusable. I had to power-cycle it to recover.

I'm running a modified Debian wheezy system (amd64). By modified I mean that I have updated most of the video-related packages, and the kernel as well.

Unfortunately, the file at /sys/class/drm/card0/error was empty by the time I had rebooted the unit and got around to looking at it. What can I do to keep this file around as long as possible?
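
One possible way to keep the error state around is to poll the file and copy it to disk as soon as it contains something. The following is a minimal Python sketch, not an official tool. It assumes the sysfs path from this report, and it assumes that an idle driver reports either nothing or a short "no error state collected" placeholder. The names `has_error_state`, `save_durably` and `watch` are made up for illustration.

```python
import os
import time
from pathlib import Path
from typing import Optional

# Assumption: the i915 error state lives at the path quoted in this report.
ERROR_STATE = Path("/sys/class/drm/card0/error")


def has_error_state(data: bytes) -> bool:
    """Return True if the bytes look like a captured error state."""
    text = data.strip().lower()
    # Assumption: with nothing captured, the file is empty or holds a
    # short "no error state collected" placeholder.
    return bool(text) and not text.startswith(b"no error state")


def save_durably(data: bytes, out: Path) -> None:
    """Write data and fsync it, since the machine may hang soon after."""
    with open(out, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())


def watch(source: Path, dest_dir: Path, interval: float = 5.0,
          max_polls: Optional[int] = None) -> Optional[Path]:
    """Poll source; save the first error state seen and return its path."""
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        try:
            data = source.read_bytes()
        except OSError:
            data = b""  # file missing or unreadable: treat as "no error"
        if has_error_state(data):
            dest_dir.mkdir(parents=True, exist_ok=True)
            out = dest_dir / f"i915-error-{int(time.time())}.txt"
            save_durably(data, out)
            return out
        time.sleep(interval)
    return None
```

Running `watch(ERROR_STATE, Path.home() / "gpu-errors")` as root in the background before starting the videos would leave a copy on disk that survives a power cycle, provided the machine stays alive long enough for one poll after the hang is recorded.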

What was I doing?
-----------------

Playing about 3 videos concurrently via mplayer with VA-API hardware acceleration. The videos were encoded in the H.264 format. A browser (latest Chromium) and some Flash presentations were running as well.

Packages
--------
xorg                                  1:7.7+3~deb7u1
xserver-xorg-core                     2:1.12.4-6+deb7u2            
xserver-xorg-video-intel              2:2.21.15-1
libgl1-mesa-dri:amd64                 9.2.2-2
libgl1-mesa-glx:amd64                 9.2.2-2

ii  linux-image-3.14.2+                   3.14.2-1                           amd64        Linux kernel binary image for version 3.14.2+
ii  libva-dev                             1.3.0-1                            amd64        Video Acceleration (VA) API for Linux -- development files
ii  libva-drm1                            1.3.0-1                            amd64        Video Acceleration (VA) API for Linux -- DRM runtime
ii  libva-glx1                            1.3.0-1                            amd64        Video Acceleration (VA) API for Linux -- GLX runtime
ii  libva-intel-driver                    1.3.0-1                            amd64        VA driver for Intel G45 & HD Graphics family
ii  libva-wayland1                        1.3.0-1                            amd64        Video Acceleration (VA) API for Linux -- Wayland runtime
ii  libva-x11-1                           1.3.0-1                            amd64        Video Acceleration (VA) API for Linux -- X11 runtime
ii  libva1                                1.3.0-1                            amd64        Video Acceleration (VA) API for Linux -- Core runtime
Comment 1 Nicolas Hillegeer 2014-05-06 20:30:13 UTC
Listlessly cat'ing the error file suddenly gave me some output. It appears that the card had crashed again. The videos were still running, though. I noticed strange artifacts (shimmering) when I looked at them, and the quality was way off. I don't know whether that is related. At any rate, I will attach the error file I captured.
Comment 2 Nicolas Hillegeer 2014-05-06 20:31:50 UTC
Created attachment 98585
/sys/class/drm/card0/error output, gzipped

This is the error output I was able to extract on my second meeting with this bug.
Comment 3 Chris Wilson 2014-05-06 20:43:00 UTC
The error state captures a hang in libva, so it may very well be a symptom of the same problem.
Comment 4 Nicolas Hillegeer 2014-05-06 20:47:04 UTC
I suspect it is: there's little else stressing the video subsystem at that point, and I tried to replicate the workload exactly.

You may also remember that I reported another video hang error last year. This was on the same hardware and the 3.9.x series of kernels. The workload had to be quite a bit heavier to provoke it then though (20 simultaneous videos if I recall). Hopefully error reporting capabilities have improved enough so that this time it can be squashed.
Comment 5 Nicolas Hillegeer 2014-05-18 22:56:08 UTC
Created attachment 99285
/sys/class/drm/card0/error output, newer kernel, libdrm, libva and intel-driver, gzipped

I've upgraded my packages to:

kernel 3.14.4
libdrm 2.4.54
libva 1.3.1
intel-driver 1.3.1

It crashes all the same. I've attached a new error report; I hope this helps. (Sometimes it's difficult to get an error report, because gzip -9 < error > ~/error.gz tells me the file has changed since starting.)
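
A possible workaround for the "file has changed" complaint is to read the file into memory in a single pass and compress that fixed copy afterwards. Below is a minimal Python sketch of that idea. The function name `snapshot_error_state` and the output path are illustrative only, and a single read may still race with the driver updating the state, so this is not guaranteed to produce a consistent dump.

```python
import gzip
from pathlib import Path


def snapshot_error_state(source: Path, dest: Path) -> int:
    """Copy source into a gzip file at dest; return the uncompressed size."""
    # Read everything first, so compression works on an in-memory copy
    # that cannot change underneath the compressor.
    data = source.read_bytes()
    with gzip.open(dest, "wb", compresslevel=9) as fh:
        fh.write(data)
    return len(data)
```

For example, `snapshot_error_state(Path("/sys/class/drm/card0/error"), Path.home() / "error.gz")` would write roughly the same result as the gzip command above.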
Comment 6 haihao 2014-05-19 04:28:13 UTC
Do you also use libvdpau-va-gl, as you mentioned flush? Can you reproduce this issue without the browser?
Comment 7 haihao 2014-05-19 04:30:24 UTC
s/flush/flash :(
Comment 8 Nicolas Hillegeer 2014-05-19 06:50:47 UTC
(In reply to comment #6)
> Do you also use libvdpau-va-gl, as you mentioned flush? Can you reproduce
> this issue without the browser?


Nope, the Flash content runs unaccelerated. It's just some light "presentations" anyway. I'm not sure whether Chromium is deciding to composite surfaces on the GPU; it's possible, I guess. Normally, the only accelerated surfaces are the H.264 streams, which I play via an embedded mplayer-vaapi. This works absolutely wonderfully on gen7, but seems to crash really often (much more often than last year, when I reported it on gen6).

So for clarity: I do not have anything with "vdpau" installed.
Comment 9 ykzhao 2014-07-23 00:56:03 UTC
Hi Nicolas,
    could you please describe the environment that can be used to reproduce the GPU hang?
    Which hardware platform? Haswell/Ivy Bridge?
    Which bitstream is used?
Comment 10 Nicolas Hillegeer 2014-07-23 06:50:30 UTC
(In reply to comment #9)
> Hi Nicolas,
>     could you please describe the environment that can be used to reproduce the
> GPU hang?
>     Which hardware platform? Haswell/Ivy Bridge?
>     Which bitstream is used?

The platform was Sandy Bridge (gen6). The videos were in the h.264 format (do you need more info on that? If so, how do I obtain it?), played through VA-API (with vaapi-mplayer: https://gitorious.org/vaapi/mplayer/source/e4a658ef28e09e8441630f9028506f5cf7449480:). It seems like there have to be quite a few videos playing at the same time (the more, the better).
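
For anyone trying to reproduce the workload, a launcher along these lines could start several players at once. This is only a sketch under stated assumptions: it assumes the vaapi-enabled mplayer build is installed as `mplayer` and accepts `-vo vaapi -va vaapi`, and the helper names `build_command` and `launch_all` are hypothetical. The actual video files are not specified in this report.

```python
import subprocess
from typing import List, Sequence


def build_command(video: str, player: str = "mplayer") -> List[str]:
    """Build the command line for one hardware-accelerated player."""
    # Assumption: the vaapi-enabled mplayer build takes -vo vaapi -va vaapi.
    # "-loop 0" keeps the video repeating so the load is sustained.
    return [player, "-vo", "vaapi", "-va", "vaapi", "-loop", "0", video]


def launch_all(videos: Sequence[str],
               player: str = "mplayer") -> List[subprocess.Popen]:
    """Start one player process per video and return the process handles."""
    return [subprocess.Popen(build_command(v, player)) for v in videos]
```

Calling `launch_all` with a list of H.264 file paths starts them concurrently; calling `terminate()` on each returned handle stops them again.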
Comment 11 ykzhao 2014-07-23 06:56:05 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Hi Nicolas,
> >     could you please describe the environment that can be used to reproduce the
> > GPU hang?
> >     Which hardware platform? Haswell/Ivy Bridge?
> >     Which bitstream is used?
> 
> The platform was Sandy Bridge (gen6). The videos were in the h.264 format
> (do you need more info on that? If so, how do I obtain it?), played through
> VA-API (with vaapi-mplayer:
> https://gitorious.org/vaapi/mplayer/source/
> e4a658ef28e09e8441630f9028506f5cf7449480:). It seems like there have to be
> quite a few videos playing at the same time (the more, the better).

It would be better if you could share the bitstream.
In fact, I tried playing back several H.264 videos concurrently on a Sandy Bridge machine with mplayer-vaapi, and there was no GPU hang.

So I hope you can share the bitstream used in your test so that we can analyze the issue.
Comment 12 Nicolas Hillegeer 2014-07-23 07:14:48 UTC
Alright. When I get access to that machine again, I will note down which videos are playing and find a way to get them to you. It may take one or two weeks. Thanks for looking into it!
Comment 13 haihao 2015-11-23 16:12:44 UTC
Do you still experience this issue? If yes, could you provide the stream used in your testing?
Comment 14 haihao 2016-12-07 03:02:51 UTC
Closed as WONTFIX because there has been no response for over a year. Please feel free to reopen the bug if you still have this issue on SNB.