Created attachment 139807 [details]
/sys/class/drm/card0/error

I'm using Fedora 28 with:
kernel 4.16.9-300.fc28.x86_64
mesa-dri-drivers 18.0.2-1.fc28
xorg-x11-drv-intel 2.99.917-32.20171025.fc28

This happened when I tried to open a website in Firefox while playing high-bitrate H.264 video using mpv (vaapi-copy).

[мая27 04:50] [drm] GPU HANG: ecode 6:0:0x87e8effd, in Xorg [1364], reason: Hang on rcs0, action: reset
[ +0,000002] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ +0,000000] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ +0,000001] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ +0,000000] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ +0,000001] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ +0,000047] i915 0000:00:02.0: Resetting chip after gpu hang
[ +3,071473] asynchronous wait on fence i915:[global]:491aed timed out
[ +4,928033] i915 0000:00:02.0: Resetting chip after gpu hang
[ +8,960256] i915 0000:00:02.0: Resetting chip after gpu hang
[мая27 04:51] i915 0000:00:02.0: Resetting chip after gpu hang
[ +9,023965] i915 0000:00:02.0: Resetting chip after gpu hang
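For triage, the key fields of the hang message above can be pulled out with a small script. This is only a sketch: the regular expression is modeled on the single "GPU HANG" line quoted in this report, and the helper names `parse_gpu_hang` and `count_resets` are hypothetical. Other kernel versions may word the message differently.

```python
import re

# Pattern modeled on the one "GPU HANG" line quoted in this report (assumption:
# other kernels may format the message differently).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line is not a hang report."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


def count_resets(log_text):
    """Count how many times the driver reported resetting the chip."""
    return log_text.count("Resetting chip after gpu hang")
```

Applied to the log above, `parse_gpu_hang` would report the process as Xorg and the reason as "Hang on rcs0", and `count_resets` gives a quick measure of how often the reset was retried.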
Can you try the latest drm-tip (https://cgit.freedesktop.org/drm-tip), which is now at Linux version 4.17.0-rc6, and send the dmesg output after booting with drm.debug=0x1e log_buf_len=4M?
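For readers wondering what drm.debug=0x1e selects, the value is a bitmask of DRM debug categories. The sketch below decodes it using the DRM_UT_* bit values as I understand them from the kernel's DRM headers; treat the table as an assumption and check it against the kernel version in use. The function name `decode_drm_debug` is hypothetical.

```python
# Bit values of the DRM debug categories (assumption: taken from the DRM_UT_*
# definitions in the kernel's DRM headers; only the low bits are listed).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled in the bitmask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


# Usage: decode_drm_debug(0x1e) lists DRIVER, KMS, PRIME and ATOMIC.
```

Under that table, 0x1e enables everything from DRIVER through ATOMIC while leaving out the very verbose CORE and VBL messages.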
I'll try it, but it's rather hard to reproduce. It happens maybe once in several days of playing video.
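Since the hang shows up only once in several days, it may help to run a small script periodically (for example from a timer) that saves the crash dump as soon as one exists. The sketch below is an illustration, not an established tool: the function name `save_error_state` is hypothetical, and the "No error state collected" marker is my assumption about what i915 reports when no dump is present. Reading the sysfs file usually requires root.

```python
import time
from pathlib import Path

# Assumption: text i915 returns from the error file when no hang was recorded.
NO_ERROR_MARKER = "No error state collected"


def save_error_state(src="/sys/class/drm/card0/error", dest_dir="."):
    """Copy the GPU error state to a timestamped file.

    Returns the path of the saved copy, or None when there is nothing to save
    (file unreadable, empty, or containing only the no-error marker).
    """
    try:
        text = Path(src).read_text(errors="replace")
    except OSError:
        return None
    if not text.strip() or text.startswith(NO_ERROR_MARKER):
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out = Path(dest_dir) / f"gpu_error_{stamp}.txt"
    out.write_text(text)
    return out
```

The saved file can then be attached to the bug report even if the machine is rebooted before anyone notices the hang.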
It's a libva related bug. Once in a while the gpu gets itself into a state that it stops writing to memory (SDM specifically, but may not be limited to).
(In reply to Chris Wilson from comment #3)
> It's a libva related bug. Once in a while the gpu gets itself into a state
> that it stops writing to memory (SDM specifically, but may not be limited
> to).

Just to note, this also happened when the video was on pause.
Reported,m would you be able to test drm-tip? Now 4.18-rc2.
(In reply to Jani Saarinen from comment #5)
> Reported,m would you be able to test drm-tip? Now 4.18-rc2.

Was meant reporter, would you be able to test drm-tip? Now 4.18-rc2.
(In reply to Jani Saarinen from comment #6)
> Was meant reporter, would you be able to test drm-tip? Now 4.18-rc2.

Right now I was unable to trigger the bug. I've been using hardware video decoding on a stock Fedora kernel 4.17.2-200.fc28.x86_64 for several days but no problems so far. The problem occurs only under specific conditions which I don't know how to reproduce on purpose.
It happened again. I'll try to compile drm-tip and trigger the bug with it.
Created attachment 140482 [details] gpu_error_4
With drm-tip 95944426a9ffda186843c78f2f925494e1bc53c5 I experience a complete system lockup in under 1 hour after system boot. The system does not respond to sysrq and does not recover within 5 minutes. All I do is play H.264 50 fps 23 Mbit/s video using mpv vaapi-copy. It has already happened 3 times. Because the system locks up, I can't provide a debug log, and I doubt that netconsole would print anything. I can't confirm that the video subsystem is the cause of this lockup.
When the lockup occurs, the audio output keeps repeating the last second of audio from the video file.
This is a kernel regression which is probably not caused by the GPU. After I updated to the released kernel 4.17.5 (not from drm-next, just a regular kernel), I get the same complete system lockups as I had with drm-next in comment 10.
The problem does not occur with a kernel built from drm-tip commit 4aa6797dfafaf527949bf55d3c8513c6902dfec2 (with additional patch 5ea45736209c8efd04ed793f81084925097f84ed from kernel 4.17.7 to fix the lockup bug mentioned in comment 10, which is unrelated to the GPU). I've been running it for 2 days with video constantly playing using the vaapi and vaapi-copy hardware acceleration methods. No lockups have occurred. Is it possible to backport the patches in drm-tip to the mainline kernel?
Our drm-tip is a pre-upstream tree that goes to mainline "automatically". Jani, how do you see this?
Please report whether the latest drm-tip works; it is now at Linux version 4.18.0.
Reporter, were you able to see this issue with the latest drm-tip? If not, I can close this bug.
(In reply to Lakshmi from comment #16)
> Reporter, were you able to see this issue with latest drmtip? If not, I can
> close this bug.

With 4.17.19-200.fc28.x86_64 Fedora kernel and fully updated system I no longer get GPU hangs. This bug is probably resolved.
Closing the bug.
Created attachment 141647 [details]
GPU error 5

I think I found a video which instantly crashes the GPU.
Created attachment 141648 [details]
Video which crashes the GPU

Here's the video. Tested on 4.18.7-200.fc28.x86_64.
Reporter, please try to reproduce the issue using drm-tip (https://cgit.freedesktop.org/drm-tip) with the kernel parameters drm.debug=0x1e log_buf_len=4M, and if the problem persists, attach the full dmesg from boot.
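Before collecting the log, it can be worth confirming that the requested parameters actually reached the running kernel. The sketch below checks a kernel command line string for them. The function name `missing_debug_params` is hypothetical, and the check is a plain string match on the exact values requested in this thread.

```python
def missing_debug_params(cmdline, required=("drm.debug=0x1e", "log_buf_len=4M")):
    """Return the requested kernel parameters that are absent from cmdline."""
    present = set(cmdline.split())
    return [param for param in required if param not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()


# Usage on the test machine: missing_debug_params(read_cmdline())
# An empty list means both parameters are active for this boot.
```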
Created attachment 141676 [details]
GPU crash

(In reply to Lakshmi from comment #21)
> Reporter, Please try to reproduce the issue using drm-tip
> (https://cgit.freedesktop.org/drm-tip) and kernel parameters drm.debug=0x1e
> log_buf_len=4M, and if the problem persists attach the full dmesg from boot.

Done.
Closing this bug, as it is a userspace bug rather than a kernel bug. Please report it to the VA-API driver team: https://github.com/intel/intel-vaapi-driver/issues/new