Bug 111312 - GPU Hang while using GStreamer
Summary: GPU Hang while using GStreamer
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Whiteboard: Triaged
Depends on:
Reported: 2019-08-07 07:15 UTC by Pascal Jacquemart
Modified: 2019-08-15 06:13 UTC

See Also:
i915 platform: KBL
i915 features: GPU hang

GPU Crash dump (25.98 KB, text/plain)
2019-08-07 07:15 UTC, Pascal Jacquemart
dmesg (1001 bytes, text/plain)
2019-08-07 07:20 UTC, Pascal Jacquemart
dmesg full (81.47 KB, text/plain)
2019-08-07 07:20 UTC, Pascal Jacquemart
vainfo output (2.21 KB, text/plain)
2019-08-07 07:22 UTC, Pascal Jacquemart
Reproduced the same issue with drm.debug (206.90 KB, text/x-log)
2019-08-09 02:02 UTC, Pascal Jacquemart
Matching GPU crash dump (23.07 KB, text/x-log)
2019-08-09 02:03 UTC, Pascal Jacquemart
VP8 stream recorded with gstwebrtcbin (2.25 MB, video/webm)
2019-08-12 03:12 UTC, Pascal Jacquemart

Description Pascal Jacquemart 2019-08-07 07:15:42 UTC
Created attachment 144963 [details]
GPU Crash dump

Skylake GPU hang while using GStreamer in a WebRTC call.
This means GStreamer is simultaneously encoding and decoding 1080p VP8 using VAAPI. The hang occurs very quickly (in less than 5 minutes) when the bandwidth is limited and packets are lost.

The issue seems similar to #110394, but this time it is related to VP8.
It is also very easy to reproduce (a few minutes only).

Platform: x86_64
Kernel: 3.10.0-693.2.2.el7.x86_64
Linux distribution: CentOS 7.4  http://archive.kernel.org/centos-vault/7.4.1708/isos/x86_64/CentOS-7-x86_64-DVD-1708.iso
Machine: NUC7i5BNK
Display connector: thunderbolt

GStreamer version: 1.16.0
libva: commit 457470987cc9df5976ce8c72ffd4bfbd9baaf0f9
libva-intel-driver: commit f1d9ceddc0e84ed8d44dd59017b0e19b75dd5dcd
xf86-video-intel: commit 6f4972d5c368c30e971a23c1dc370d3e43761282

We are working on a way to dump and replay the network traffic to isolate this issue and make it easily reproducible.
Comment 1 Pascal Jacquemart 2019-08-07 07:20:01 UTC
Created attachment 144964 [details]
dmesg

The end of dmesg when the GPU hang occurs
Comment 2 Pascal Jacquemart 2019-08-07 07:20:41 UTC
Created attachment 144965 [details]
dmesg full

The entire kernel log, in case it is useful
Comment 3 Pascal Jacquemart 2019-08-07 07:22:37 UTC
Created attachment 144966 [details]
vainfo output
Comment 4 Chris Wilson 2019-08-07 18:49:19 UTC
The finger currently points towards libva; I haven't seen anything in there that indicates a kernel bug. If you can get hold of an upstream kernel, just to rule out the frankenkernel, that would be reassuring.
Comment 5 Lakshmi 2019-08-08 06:19:38 UTC
Reporter, can you verify the issue with drm-tip and give feedback (https://cgit.freedesktop.org/drm-tip)?
Comment 6 Pascal Jacquemart 2019-08-09 02:02:36 UTC
Created attachment 144989 [details]
Reproduced the same issue with drm.debug

I have also updated libva and intel-vaapi-driver to version 2.3.0, as per the 2018Q1 Intel Graphics stack recipe.
Comment 7 Pascal Jacquemart 2019-08-09 02:03:25 UTC
Created attachment 144990 [details]
Matching GPU crash dump
Comment 8 Pascal Jacquemart 2019-08-09 02:10:14 UTC
Understood, I have to try a drm-tip kernel...
But before changing any piece of software, I think I will first try to investigate deeper and isolate the issue.

Do you think the following bug is totally unrelated?
Comment 9 Pascal Jacquemart 2019-08-12 03:12:04 UTC
Created attachment 145033 [details]
VP8 stream recorded with gstwebrtcbin

Attached is a VP8 recording made through GstWebRTCBin.
To make this recording, I started a WebRTC call and made sure the network bandwidth was not sufficient to carry the video.

The incoming VP8 stream is recorded in a .webm file.
The file can be replayed with the following GStreamer pipeline:

gst-launch-1.0 filesrc location=vp8_recording.webm ! matroskademux ! video/x-vp8 ! vaapivp8dec ! queue ! videoconvert ! xvimagesink

The GPU hang occurs after 33 seconds, when the video gets corrupted...

An even simpler pipeline can be used (without visual feedback):
gst-launch-1.0 filesrc location=/home/proex/vp8_recording.webm ! matroskademux ! video/x-vp8 ! vaapivp8dec ! fakesink

In this case the GPU hangs in less than 5 seconds.
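
For anyone scripting the reproduction, the two pipelines above can be assembled programmatically. The following is a minimal Python sketch and is not part of the original report: the helper name `build_replay_command` and its `with_display` parameter are hypothetical, and only the GStreamer element names and file paths are taken from the commands above. It only builds the command line; it does not launch GStreamer.

```python
def build_replay_command(recording, with_display=True):
    """Return the gst-launch-1.0 argument list for replaying a VP8 .webm file.

    Hypothetical helper: it mirrors the two pipelines quoted in this comment.
    with_display=True  -> decode and render through xvimagesink
    with_display=False -> decode only and discard frames through fakesink
    """
    # Common front half: read the file, demux Matroska/WebM, decode VP8 via VAAPI.
    head = [
        "gst-launch-1.0",
        "filesrc", f"location={recording}",
        "!", "matroskademux",
        "!", "video/x-vp8",
        "!", "vaapivp8dec",
        "!",
    ]
    if with_display:
        tail = ["queue", "!", "videoconvert", "!", "xvimagesink"]
    else:
        tail = ["fakesink"]
    return head + tail


if __name__ == "__main__":
    # Print both variants. A plain space join is enough here because the
    # example path contains no spaces or shell metacharacters.
    print(" ".join(build_replay_command("vp8_recording.webm", with_display=True)))
    print(" ".join(build_replay_command("vp8_recording.webm", with_display=False)))
```

The returned list can be passed directly to `subprocess.run(argv, timeout=...)`, which avoids shell quoting issues and lets a test harness stop the pipeline if the GPU hang stalls it.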
Comment 10 Pascal Jacquemart 2019-08-15 01:21:52 UTC
We were able to reproduce the issue on an Intel(R) Core(TM) i7-7500U.
The software stack is:
linux 5.1.9
libdrm 2.4.98
mesa 19.0.6
libva 2.4.1
libva-intel-driver 2.3.0
Comment 11 Lakshmi 2019-08-15 06:13:01 UTC
This seems to be a libva issue. Can you please report it here: https://github.com/intel/libva/issues

As of now, we don't see anything that needs a fix from the kernel side. Closing this issue as NOTOURBUG.
