Bug 80720 - Reproducible GPU hang using gstreamer-vaapi and intel hardware on IVB
Summary: Reproducible GPU hang using gstreamer-vaapi and intel hardware on IVB
Status: RESOLVED DUPLICATE of bug 76363
Alias: None
Product: libva
Classification: Unclassified
Component: intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: haihao
QA Contact: Sean V Kelley
Reported: 2014-06-30 16:57 UTC by Simon Farnsworth
Modified: 2014-07-30 08:41 UTC

Attachments
Error state from GPU hang (979.03 KB, application/gzip)
2014-06-30 16:58 UTC, Simon Farnsworth
a workaround for this issue (1.94 KB, patch)
2014-07-01 08:17 UTC, haihao

Description Simon Farnsworth 2014-06-30 16:57:38 UTC
On an Intel(R) Core(TM) i3-3220 CPU (on an H77 chipset board), using the following command line (I'll attach the media), I see a GPU hang that delivers an error state:

gst-launch-1.0 filesrc location=jet\ car\ from\ the\ bridge\ at\ santapod\ raceway\ central\ day\ 2012\ \(Low\).mp4 ! qtdemux ! vaapidecode ! vaapisink

I can also reproduce the hang with:

gst-launch-1.0 filesrc location=jet\ car\ from\ the\ bridge\ at\ santapod\ raceway\ central\ day\ 2012\ \(Low\).mp4 ! qtdemux ! queue ! vaapidecode ! queue ! vaapisink

I'll attach the media and the error state.

Relevant software versions:

libva 1.3.1

intel-driver 1.3.1

gstreamer-vaapi from git revision a4bd8450

libdrm 2.4.54

kernel 3.14.4

I've built libva without GLX support, so I'm using an X11 display type.

xorg-server 1.14.4

xf86-video-intel 2.21.15
Comment 1 Simon Farnsworth 2014-06-30 16:58:27 UTC
Created attachment 102022
Error state from GPU hang

The error state shows the BSD and BLT rings stuck.
Comment 2 Simon Farnsworth 2014-06-30 17:01:00 UTC
Media file is too big to attach, so I've placed it at:

http://90.155.96.198/sfarnsworth/jet%20car%20from%20the%20bridge%20at%20santapod%20raceway%20central%20day%202012%20(Low).mp4
Comment 3 haihao 2014-07-01 08:17:37 UTC
Created attachment 102060
a workaround for this issue

The root cause is that the codec layer passes the wrong parameters to the driver. It would be better to fix this issue in gstreamer-vaapi.
Comment 4 Simon Farnsworth 2014-07-01 10:10:06 UTC
The attached patch prevents the GPU from hanging.

Would it be possible to apply this workaround or a similar workaround to the driver, just to ensure that you don't trigger a GPU hang on faulty input data?
Comment 5 Gwenole Beauchesne 2014-07-03 12:40:22 UTC
(In reply to comment #4)
> The attached patch prevents the GPU from hanging.
> 
> Would it be possible to apply this workaround or a similar workaround to the
> driver, just to ensure that you don't trigger a GPU hang on faulty input
> data?

Haihao, please don't add a workaround; instead, add a patch that *errors* out if first_mb_in_slice < first_mb_in_slice of the previous slice. Arbitrary Slice Ordering (a baseline profile feature) is not supported. There is no need to check the profile: just check for that condition and return an error. Thanks.
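
The condition described above could be sketched roughly as follows. This is only an illustration and not the driver's actual code: struct slice_param, enum check_status and check_slice_order() are hypothetical names, and the struct models nothing but the first_mb_in_slice field that the H.264 slice parameters carry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for an H.264 slice parameter buffer.
 * Only the field discussed in this bug is modelled. */
struct slice_param {
    unsigned int first_mb_in_slice;
};

/* Hypothetical status codes for this sketch. */
enum check_status {
    CHECK_OK = 0,
    CHECK_ERROR_SLICE_ORDER = -1
};

/* Returns CHECK_ERROR_SLICE_ORDER if any slice starts at a lower macroblock
 * address than the slice submitted before it, i.e. if the input would need
 * Arbitrary Slice Ordering. The profile is deliberately not inspected. */
static enum check_status
check_slice_order(const struct slice_param *slices, size_t num_slices)
{
    size_t i;

    /* A NULL array is only acceptable when there is nothing to check. */
    assert(slices != NULL || num_slices == 0);

    for (i = 1; i < num_slices; i++) {
        if (slices[i].first_mb_in_slice < slices[i - 1].first_mb_in_slice)
            return CHECK_ERROR_SLICE_ORDER;
    }
    return CHECK_OK;
}
```

A caller would run such a check over the slices of one picture before building any GPU commands and refuse to submit the batch on failure. Note that the sketch uses a strict less-than, exactly as stated above, so two slices with the same first_mb_in_slice would not be rejected by it.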
Comment 6 haihao 2014-07-04 00:37:30 UTC
Hi, Gwenole

I will add error checking in the driver to avoid the GPU hang. Actually, the sample video uses a constrained profile, and there is a *real* bug in gstreamer-vaapi: it mixes up the two consecutive IDR frames, whereas FFmpeg-vaapi doesn't have this issue. So it would be better to fix this issue in gstreamer-vaapi.

Thanks
Haihao
Comment 7 ykzhao 2014-07-23 08:05:28 UTC
Another patch has now been pushed to libva-intel-driver. It checks the first_mb_in_slice field of the input slice parameters. When the check fails, it returns an error status and reports that the issue should be fixed in the upper-level middleware.

The commit id is:

commit 82d2ed8d7da3619c0ea467c06604f5626fc0b901
Author: Zhao Yakui <yakui.zhao@intel.com>
Date:   Wed Jul 23 13:46:17 2014 +0800

    Add more check of H264 slice param to avoid GPU hang caused by the incorrect parameter

Hi, Simon
    Will you please check it and see whether it works for you?

Thanks.
    Yakui
Comment 8 haihao 2014-07-30 08:41:05 UTC

*** This bug has been marked as a duplicate of bug 76363 ***

