Bug 84688

Summary: [regression] Userptr freezes the system - VA-API and libdrm 2.4.58
Product: DRI
Reporter: nfnty <dpohyggxzblfgtbugnrg8m2fd2n3i2pn>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major
Priority: medium
CC: andyrtr, damien.lespiau, intel-gfx-bugs, przanoni
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                 Flags
XBMC log before freeze                      none
Kernel log after freeze without drm.debug   none
Kernel log after freeze with drm.debug      none

Description nfnty 2014-10-05 15:17:53 UTC
The system freezes when VA-API is used as the hardware-accelerated decoder in VLC. Sometimes only the pointer can still be moved, and even then it stutters heavily.

mpv and XBMC work great with libdrm 2.4.58-1: no artifacts or stuttering. However, the system froze once at random last night when trying to play a video in XBMC with codec "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10" and codec id 28.

After downgrading to libdrm 2.4.56-1:
No more system freezes.
Video codec H264 - MPEG-4 AVC (part 10) (h264) displays fine.
Video codec H264 - MPEG-4 AVC (part 10) (avc1) has severe artifacts, and after terminating VLC, 1 to 2 kernel threads called systemd-coredum each show 100% load. These threads linger for several minutes.

Running on i3-2348M.

Arch Linux package versions:

linux 3.16.3-1
xf86-video-intel 2.99.916-3
libdrm 2.4.58-1
libva 1.4.0-1
libva-intel-driver 1.4.0-1

vlc 2.1.5-3
xbmc 13.2-4
mpv 0.6.0-1
Comment 1 nfnty 2014-10-05 17:48:15 UTC
Created attachment 107363 [details]
XBMC log before freeze

The system froze again while trying to play a video in XBMC. Attached is the log from before the freeze and hard reset.
Comment 2 Paulo Zanoni 2014-10-06 14:19:25 UTC
If reverting to an older libdrm fixes the problem, maybe you could try to bisect libdrm and see which commit exactly causes the system to start freezing. Can you please do this?

You also mentioned sometimes the pointer can be moved after the system "freezes": can you still SSH to it and grab the output of "dmesg"? If you could boot with drm.debug=0xe, then reproduce the bug and grab the dmesg, it would really help us.
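
As a rough aid for the request above, the following Python sketch decodes a drm.debug bitmask into the message categories it enables. The category table lists only the four bits documented for kernels of this era (CORE, DRIVER, KMS, PRIME); newer kernels define more bits that are not covered here. The function names and the sample command line are made up for illustration.

```python
import re

# Debug categories for the drm.debug module parameter (kernels around 3.16).
# Newer kernels define additional bits; they are deliberately not listed.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
}


def parse_drm_debug(cmdline):
    """Return the drm.debug value from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)drm\.debug=(\S+)", cmdline)
    if match is None:
        return None
    # Base 0 accepts both hexadecimal (0xe) and decimal (14) spellings.
    return int(match.group(1), 0)


def decode_drm_debug(mask):
    """List the names of the debug categories enabled by the bitmask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


if __name__ == "__main__":
    # Hypothetical command line; on a real system read /proc/cmdline instead.
    sample = "BOOT_IMAGE=/vmlinuz-linux root=/dev/sda1 rw drm.debug=0xe"
    mask = parse_drm_debug(sample)
    print(hex(mask), decode_drm_debug(mask))
```

With 0xe, the DRIVER, KMS and PRIME categories are enabled while the noisier CORE messages stay off.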
Comment 3 nfnty 2014-10-07 02:07:59 UTC
Created attachment 107449 [details]
Kernel log after freeze without drm.debug
Comment 4 nfnty 2014-10-07 02:08:32 UTC
Created attachment 107450 [details]
Kernel log after freeze with drm.debug
Comment 5 nfnty 2014-10-07 02:17:08 UTC
Traced the bug back to commit ae8edc7544e566084f7b958eb93c9109b471ca30. No freezes before this commit.

The VLC picture still has artifacts. On exit, VLC still segfaults, locks up one or two CPU threads, and produces this kernel message:

Oct 07 04:00:58 laptop kernel: vlc[594]: segfault at 0 ip 00007f0043a719ef sp 00007f006eb6d948 error 4 in libvaapi_plugin.so[7f0043a6d000+7000]
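
The kernel message above carries enough information to locate the faulting instruction inside the library: subtracting the mapping base from the instruction pointer gives an offset into libvaapi_plugin.so. The Python sketch below does that arithmetic and decodes the x86 page-fault error code. The regular expression assumes the x86 message format shown in this report, and the function name is illustrative.

```python
import re

# Matches the x86 "segfault at ..." line format shown in this report.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def analyse_segfault(line):
    """Extract the faulting offset and error-code bits, or return None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    err = int(m["err"], 16)
    return {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        "offset": ip - base,                   # offset into the mapping
        "in_mapping": 0 <= ip - base < size,   # sanity check
        "page_present": bool(err & 0x1),       # bit 0: protection fault
        "write": bool(err & 0x2),              # bit 1: write access
        "user_mode": bool(err & 0x4),          # bit 2: fault in user mode
    }


if __name__ == "__main__":
    line = ("Oct 07 04:00:58 laptop kernel: vlc[594]: segfault at 0 "
            "ip 00007f0043a719ef sp 00007f006eb6d948 error 4 "
            "in libvaapi_plugin.so[7f0043a6d000+7000]")
    info = analyse_segfault(line)
    print(info["object"], hex(info["offset"]), info)
```

For this message the offset is 0x49ef, and error 4 decodes to a user-mode read of a non-present page at address 0, which is consistent with a NULL pointer dereference in VLC's VA-API plugin. Assuming the mapping starts at file offset 0 and debug symbols are available, the offset could be handed to a tool such as addr2line.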
Comment 6 Paulo Zanoni 2014-10-07 12:57:45 UTC
Having the full dmesg file, instead of just the lines containing the error, would help even more. Please attach them too.

It looks like the kernel is broken, and the libdrm patch just exposes the kernel's brokenness to user space...
Comment 7 Chris Wilson 2014-10-07 13:03:26 UTC
That's old news:

commit 0cc4afd7d05da2144b464a00333a1aa4615f7b2f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 18 09:39:44 2014 +0100

    drm/i915: Prevent recursive deadlock on releasing a busy userptr

The decoder corruption and crash are something else.
Comment 8 Chris Wilson 2014-10-07 13:04:12 UTC
Bleh: -fixes commit ad46cb533d586fdb256855437af876617c6cf609
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Aug 7 14:20:40 2014 +0100

    drm/i915: Prevent recursive deadlock on releasing a busy userptr
Comment 9 Paulo Zanoni 2014-10-07 13:09:55 UTC
It seems those patches will be sent to the stable trees, so I guess our bug reporter could try testing the latest kernel from Linus's tree or the stable trees, if they already contain the patches.
Comment 10 Paulo Zanoni 2014-10-07 13:27:47 UTC
Closing bug, since the fixes are already on our development trees and heading to the stable trees.

If you can test a kernel that already contains both patches mentioned by Chris and you can still reproduce the problem, please reopen this bug report.
Comment 11 nfnty 2014-10-07 16:53:14 UTC
Kernel 3.17 solves the problem.
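
For anyone checking another machine, the comparison against the kernel reported to work can be sketched in Python. The 3.17 threshold comes only from the result reported in this bug; stable or distribution kernels with older version numbers may carry the backported fix, so a negative result is a hint rather than proof. The function names are illustrative.

```python
import platform
import re

# Threshold taken from this report ("Kernel 3.17 solves the problem");
# backports to older stable kernels are not detected by this check.
FIXED_IN = (3, 17)


def kernel_version(release):
    """Parse the leading major.minor pair of a kernel release string."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))


def at_least_fixed_version(release):
    """True if the release is 3.17 or newer by version number alone."""
    return kernel_version(release) >= FIXED_IN


if __name__ == "__main__":
    release = platform.release()
    print(release, at_least_fixed_version(release))
```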
