Bug 69738

Summary: [SNB] vaapi playback causes drm stuck on render ring w/RC6 enabled
Product: libva Reporter: Joe Konno <joe.konno>
Component: intel Assignee: ykzhao <yakui.zhao>
Status: RESOLVED INVALID QA Contact: Sean V Kelley <seanvk>
Severity: critical    
Priority: medium CC: gb.devel, haihao.xiang, intel-gfx-bugs, rodrigo.vivi, wayland-bugs
Version: unspecified   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:

Description Joe Konno 2013-09-23 21:43:35 UTC
gst-vaapi playback under Wayland/Weston 1.2.91 is no longer functional. Playback runs for a few seconds, then hangs on a frame. At that point, dmesg reports "*ERROR* stuck on render ring". From then on, the display is unresponsive and the system becomes highly unstable, requiring a hard power-down of the unit (soft reboot and soft shutdown are inoperative).

#To Reproduce

  1. Launch Weston
    weston-launch -- -i0

  2. Play a hardware-accelerated video through VA-API
    gst-launch-1.0 filesrc \
      location=$HOME/Videos/big_buck_bunny_1080p_h264.mp4 ! \
      qtdemux ! \
      vaapidecode ! \
      vaapisink fullscreen=true

#Expected Result

Video plays from beginning to end smoothly without issue.

#Actual Result

Video hangs seconds after playback begins, and enters an unrecoverable error state.

dmesg emits:
[drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring

Display hangs on the last displayed frame. VT switching, input devices, and the display in general are unresponsive. `shutdown` commands issued over ssh appear to have no effect.

Only known recovery method is to power down the hardware.
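
Since the local display is unusable once the hang occurs, the error is easiest to confirm from a remote shell. The following is a minimal sketch, not part of any tool named in this report: `has_render_ring_hang` is a hypothetical helper name, and it simply searches kernel log text on stdin for the message quoted above.

```shell
# Hypothetical helper (name chosen for illustration only):
# reads kernel log text on stdin and succeeds if the hang signature is present.
has_render_ring_hang() {
    grep -q 'stuck on render ring'
}

# Example use on the affected machine, e.g. from an ssh session:
#   dmesg | has_render_ring_hang && echo "GPU hang detected"
```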

#Workaround

Append kernel boot parameter: i915.i915_enable_rc6=0

This kernel boot parameter allows playback to continue, though it is not a permanent solution.
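
To check that the workaround is actually in effect after a reboot, the running kernel's command line can be inspected. This is a sketch under assumptions: `rc6_disabled_on_cmdline` is a hypothetical helper name, and the `grubby` invocation in the comment assumes a Fedora system where grubby manages the boot entries.

```shell
# Hypothetical helper: succeeds if the given kernel command line disables RC6
# using the parameter named in this report. With no argument it reads
# /proc/cmdline, i.e. the command line of the running kernel.
rc6_disabled_on_cmdline() {
    cmdline=${1-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" i915.i915_enable_rc6=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: on Fedora the parameter could be added persistently with grubby,
# followed by a reboot:
#   sudo grubby --update-kernel=ALL --args="i915.i915_enable_rc6=0"
#
# Afterwards:
#   rc6_disabled_on_cmdline && echo "RC6 workaround active"
```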

#Configuration

  Hardware: "Sandybridge" era CPU with integrated graphics, Celeron SKU
  OS: Fedora 19 64-bit
  Kernel: 3.11.1-200.fc19

  wayland (HEAD) 1.2.91-0-g4125367
  mesa (HEAD) mesa-9.2-0-g46273ba
  cairo (HEAD) 1.12.16-0-g8e11a42
  weston (HEAD) 1.2.91-0-g7799385
  libva (HEAD) libva-1.2.1-0-g88ed1eb
  intel-driver (HEAD) 1.2.1-0-g8f306e3
  gstreamer (HEAD) 1.0.9-0-gf3c4f74
  gst-plugins-base (HEAD) 1.0.9-0-gffc5262
  gst-plugins-good (HEAD) 1.0.9-0-gff2598f
  gst-plugins-bad (HEAD) 1.0.9-0-g7ba6694
  gst-ffmpeg (HEAD) 1.0.9-0-gd1c488b
  gstreamer-vaapi (HEAD) tags/0.5.6-0-g0687224

#Related Bugs

Perhaps bug #69330 ?
Comment 1 U. Artie Eoff 2013-09-27 20:11:19 UTC
This is the last known good stack that works for me on sandybridge:

wayland (HEAD) 1.2.91-0-g4125367
drm (HEAD) libdrm-2.4.46-0-gc6d73cf
mesa (9.2) heads/9.2-0-gab93225
libva (HEAD) libva-1.1.1-0-g8cf7d80
intel-driver (HEAD) 1.0.20-0-g4ae55f8
weston (HEAD) 1.2.91-0-g7799385
gstreamer (HEAD) 1.0.9-0-gf3c4f74
gst-plugins-base (HEAD) 1.0.9-0-gffc5262
gst-plugins-good (HEAD) 1.0.9-0-gff2598f
gst-plugins-bad (HEAD) 1.0.9-0-g7ba6694
gst-ffmpeg (HEAD) 1.0.9-0-gd1c488b
gstreamer-vaapi (HEAD) tags/0.5.3-0-gaf9202b
Comment 2 U. Artie Eoff 2013-09-27 21:04:34 UTC
(In reply to comment #0)
> <snip>
> 
>   wayland (HEAD) 1.2.91-0-g4125367
>   mesa (HEAD) mesa-9.2-0-g46273ba
>   cairo (HEAD) 1.12.16-0-g8e11a42
>   weston (HEAD) 1.2.91-0-g7799385
>   libva (HEAD) libva-1.2.1-0-g88ed1eb
>   intel-driver (HEAD) 1.2.1-0-g8f306e3
>   gstreamer (HEAD) 1.0.9-0-gf3c4f74
>   gst-plugins-base (HEAD) 1.0.9-0-gffc5262
>   gst-plugins-good (HEAD) 1.0.9-0-gff2598f
>   gst-plugins-bad (HEAD) 1.0.9-0-g7ba6694
>   gst-ffmpeg (HEAD) 1.0.9-0-gd1c488b
>   gstreamer-vaapi (HEAD) tags/0.5.6-0-g0687224
> 
> #Related Bugs
> 
> Perhaps bug #69330 ?

Ok, so I rolled back gstreamer-vaapi to 0.5.3 and kept all else equal (in your above stack) and the problem went away for me.

So this bug needs to be filed in gstreamer-vaapi's database
Comment 3 Joe Konno 2013-09-27 22:41:21 UTC
Result of bisect. Troubles begin here:

commit 976d27841a19cf37e2742cae5770c8a7555f72db
Author: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Date:   Fri Feb 15 18:50:26 2013 +0200

    h264: add support for video cropping.
    
    If the encoded stream has the frame_cropping_flag set, then associate
    the cropping rectangle to GstVaapiPicture.
    
    Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>

As of the aforementioned commit, the gst-vaapi application crashes with the debug message shown below (thanks uartie for this), but that is not the issue reported in this bug. A successful bisect is therefore impeded by the introduction of at least one more regression.

0:00:01.070853827  6822      0x117cb20 DEBUG              vaapisink gstvaapisink.c:968:gst_vaapisink_put_surface: could not render VA surface

Will need to partner up in order to continue root-causing. I don't have sufficient time to test every commit since 976d278.
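
One way to bisect past an unrelated regression like this is to skip the affected commits: `git bisect skip` does this manually, and a script driven by `git bisect run` can request it by exiting with status 125. The sketch below is illustrative only. `classify_bisect_log` is a hypothetical helper, and it assumes each test run captures the GStreamer debug output together with dmesg into one log that is fed to it on stdin.

```shell
# Hypothetical classifier for use inside a `git bisect run` script.
# Reads the captured log of one test run on stdin and returns:
#   1   -> the hang from this bug was seen (commit is bad)
#   125 -> only the unrelated render failure was seen (skip this commit)
#   0   -> neither message was seen (commit is good)
classify_bisect_log() {
    log=$(cat)
    case "$log" in
        *"stuck on render ring"*) return 1 ;;
        *"could not render VA surface"*) return 125 ;;
        *) return 0 ;;
    esac
}

# Example use at the end of a bisect script (the log path is an assumption):
#   classify_bisect_log < /tmp/run.log; exit $?
```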
Comment 4 Gwenole Beauchesne 2013-09-30 12:56:05 UTC
Hi, this looks totally independent from gst-vaapi. I could not reproduce it on SNB (32-bit though), with an older kernel 3.8.0-31. The fact that this can be worked around by disabling RC6 tends to confirm that. Note: since the aforementioned change, we also use a video processing pipeline.

Needs to be checked internally. Kernel & libva driver may need changes.
Comment 5 Joe Konno 2013-09-30 14:01:13 UTC
Here was an i915 kernel driver commit that was brought to my attention:

http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-next&id=351aa5666d02062b52329bcfe4bcf9d1f882fba9

I'll see if this has any impact.
Comment 6 Joe Konno 2013-09-30 15:55:52 UTC
This uncovers an additional work-around. If the stated patch is cherry-picked atop a vanilla v3.11.1 kernel, the reported issue is no longer seen with gst-vaapi 0.5.6.

(In reply to comment #5)
> Here was an i915 kernel driver commit that was brought to my attention:
> 
> http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-next&id=351aa5666d02062b52329bcfe4bcf9d1f882fba9
> 
> I'll see if this has any impact.
Comment 7 U. Artie Eoff 2014-05-15 16:54:33 UTC
Any updates?  Should this be assigned to the DRI DRM/Intel team since it appears to be a drm kernel driver bug?
Comment 8 haihao 2015-11-23 14:59:07 UTC
According to comment #7, change component to DRM/intel.
Comment 9 Chris Wilson 2015-11-23 16:20:42 UTC
Considering the numerous reports of SNB libva gpu hangs, and the numerous known bugs in libva, it is in all likelihood a libva bug. Please do due diligence first.
Comment 10 ykzhao 2015-11-26 04:00:03 UTC
Hi, Artie
    
    Can the issue still be reproduced with the latest intel-driver?

    If it can, does disabling PPGTT help?
(It can be disabled by adding the kernel option "i915.enable_ppgtt=0".)

Thanks.
Comment 11 ykzhao 2016-01-06 02:19:56 UTC
Since there is no update, this bug will be closed.

If the issue still exists and can be reproduced, please reopen this bug or file a new one. It would also help to try the boot option "i915.enable_ppgtt=0" mentioned above.

Thanks.
