Summary: | GPU Hang on Haswell with VAAPI acceleration on XBMC (reproducible) | ||
---|---|---|---|
Product: | libva | Reporter: | MattDevo <matt.devillier> |
Component: | intel | Assignee: | Lizhong <zhong.li> |
Status: | RESOLVED FIXED | QA Contact: | Sean V Kelley <seanvk> |
Severity: | major | ||
Priority: | medium | CC: | fernetmenta, fritsch, gb.devel, zhixinx.liu |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | XBMC Log; kernel log (dmesg); GPU dump log (zipped for size); libva-intel-driver patch to fix bug 81447 |
Description
MattDevo
2014-07-16 23:44:36 UTC
Created attachment 102950 [details]
kernel log (dmesg)
Created attachment 102951 [details]
GPU dump log (zipped for size)
Is this a duplicate of https://bugs.freedesktop.org/show_bug.cgi?id=78960 ? Could you provide the kernel dmesg?

(In reply to comment #3)
> Is it the duplication of https://bugs.freedesktop.org/show_bug.cgi?id=78960
> ? Could you provide the kernel dmesg ?
The XBMC developers have told me they believe these are two separate issues. I did attach the kernel dmesg; is it insufficient somehow?

@haihao we observe this issue without the vaapi renderer, meaning we copy the video surface to system memory. 78960 happens only with vaapi rendering.

When running this sample on Windows with DXVA, the hw decoder shows errors, which result in a hardly noticeable glitch. No doubt this sample is corrupted somehow, but it must not make the GPU hang.

I got the information that the sample is most likely NOT corrupted. So I tried again on Windows DXVA, but now with NVidia graphics. All fine. Seems to be an Intel problem on all platforms.
Windows: small glitch and hw decoder shows errors
Linux: GPU hang
NVidia: all fine.

I invested some further time today to find proper backtraces. It seems that for this bug vaSyncSurface never returns. This could be a libdrm bug, as the i965 driver only calls:
    if(obj_surface->bo)
        drm_intel_bo_wait_rendering(obj_surface->bo);

That really looks like a threading issue in the driver in combination with GL output.

What "real life" use cases can make drm_intel_bo_wait_rendering wait forever?

I used mplayer-vaapi to hw-decode this video; the GPU also hangs when decoding frame 327.
Then I tried decoding this video in software; mplayer also shows errors, as follows:
    mplayer -vo x11 /root/Joe_sample.mkv -fps 30
    error:
    [h264 @ 0xf64894c0]concealing 1285 DC, 1285 AC, 1285 MV errors
    A: 11.9 V: 13.7 A-V: -1.742 ct: -1.064 327/327 63% 7% 1.1% 0 0
    [h264 @ 0xf64894c0]top block unavailable for requested intra4x4 mode -1 at 50 17
    [h264 @ 0xf64894c0]error while decoding MB 50 17, bytestream (21448)
    [h264 @ 0xf64894c0]top block unavailable for requested intra mode at 10 34
    [h264 @ 0xf64894c0]error while decoding MB 10 34, bytestream (20429)
    [h264 @ 0xf64894c0]top block unavailable for requested intra mode at 37 51
    [h264 @ 0xf64894c0]error while decoding MB 37 51, bytestream (9038)
    [h264 @ 0xf64894c0]concealing 8159 DC, 8159 AC, 8159 MV errors
This means there are some erroneous MBs in frame 327, which cause the GPU hang when hw-decoding. It seems hard to decode this frame correctly since it's an error frame. But maybe we can avoid the GPU hang or drop this frame. I'll check further.

Yes, you are right. sw decode fires this error. Thanks very much for looking into this!

Created attachment 104061 [details]
libva-intel-driver patch to fix bug 81447

Hi Rainer Hochecker: Could you verify whether my attached patch fixes this bug? According to my analysis, frame 327 is missing a slice's data and some MB data. Thanks

Rainer is currently on holidays. I will try tonight on my hsw hardware and report back. It could very well be that someone will do so before me, via: http://forum.xbmc.org/showthread.php?tid=165707&page=54
Thanks for looking into this.

For the record - the second line of your patch comment has one "/" too many, which will break compilation. Fixed one: http://paste.ubuntu.com/7960572/
The patch is working as expected. I see a short stutter at that scene - like a frame is dropped - and afterwards it continues to play. Thanks much. It would be nice if that patch were applied to master before 1.3.3 or 1.4.0 is released.

Thanks for your test and for fixing the typo in the patch.
Yes, I dropped the error frame by checking slice parameters. As I said, "It seems it's hard to decode this frame correctly since it's an error frame. But maybe we can avoid gpu hang or drop this frame." The software player also shows a decoding error. We will apply this bug fix to the master branch.

(In reply to comment #0)
> Issue is reproducible using test file:
> https://dl.dropboxusercontent.com/u/55728161/Joe_sample.mkv
Hi, using libva-intel-driver 1.3.2 I could reproduce the bug. Using latest git master http://cgit.freedesktop.org/vaapi/intel-driver/commit/?id=82d2ed8d7da3619c0ea467c06604f5626fc0b901 and this patch https://github.com/OpenELEC/OpenELEC.tv/blob/master/packages/multimedia/libva-intel-driver/patches/libva-intel-driver-FD81447.patch the bug is fixed.

The patch you reference does not fix the bug we see with the above sample - it still hangs, but there is no kernel hang anymore, yes. The real fix was sent to the ML yesterday, see: http://lists.freedesktop.org/archives/libva/2014-August/002565.html

@bkuhls: Sorry, I did not read your comment correctly. Yes, we picked this patch into OpenELEC just after it was released. It is also included in the 4.1.3 OE beta release. We will ship it until a new libva-driver-intel with that fix included is released. For Ubuntu we provide a fixed driver that is easy to install via the wsnipex vaapi ppa.

Updated patches have been sent to the mailing list. This bug will be marked as fixed.

(In reply to comment #20)
> Updated patches have been sent to mail list. This bug will be marked as
> fixed.
Bugs are marked as fixed only when proper fixes have reached the git repository. And, I would say, the "master" branch. Otherwise, we get into a situation where the bug is marked as fixed but the actual fix got lost on the mailing list, which is precisely the situation here. Thus reopening the bug.

The patch was merged into the master branch years ago.
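
For readers following the thread: the fix is described only as dropping the error frame "by checking slice parameters". The sketch below illustrates that general idea in C and is not the actual libva-intel-driver patch. The `slice_info` struct, its fields, and `frame_is_submittable` are hypothetical names made up for this illustration; the real driver works on libva's own slice parameter buffers, and the specific checks here are assumptions about what "missing slice data and some MB data" could look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified slice descriptor. NOT the libva/i965 structure. */
typedef struct {
    unsigned first_mb;    /* index of the first macroblock covered by the slice */
    size_t   data_offset; /* byte offset of the slice data in the bitstream buffer */
    size_t   data_size;   /* byte length of the slice data */
} slice_info;

/*
 * Returns true if every slice of a frame passes basic sanity checks,
 * false if the frame should be dropped instead of being sent to the GPU.
 * The checks are illustrative assumptions, not the driver's real logic.
 */
bool frame_is_submittable(const slice_info *slices, size_t num_slices,
                          unsigned total_mbs, size_t buffer_size)
{
    assert(slices != NULL || num_slices == 0);

    if (num_slices == 0)
        return false; /* a frame with no slices has nothing to decode */

    for (size_t i = 0; i < num_slices; i++) {
        const slice_info *s = &slices[i];

        if (s->data_size == 0)
            return false; /* slice header present but slice data missing */

        /* slice data must lie entirely inside the bitstream buffer */
        if (s->data_offset > buffer_size ||
            s->data_size > buffer_size - s->data_offset)
            return false;

        if (s->first_mb >= total_mbs)
            return false; /* slice starts beyond the picture */

        /* slices are expected in increasing macroblock order */
        if (i > 0 && s->first_mb <= slices[i - 1].first_mb)
            return false;
    }
    return true;
}
```

A caller would run such a check before building the hardware command buffer and skip the frame when it returns false, which would match the behaviour reported in the test above: a short stutter at the damaged scene, after which playback continues.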