Created attachment 123062 [details]
GPU crash dump
Hello, I am filing a new bug, because all the similar bugs I found were reproduced while running Xorg and/or chromium. I run linux in a console mode only (KMS).
I am running linux kernel 4.4.6. It starts normally, without any warnings. After that I am doing a kexec into another 4.4.6 kernel, configured differently. This warning appears in dmesg. After the warning, system runs normally.
here is the relevant information.
[ 4.741299] [drm] stuck on render ring
[ 4.742038] [drm] GPU HANG: ecode 7:0:0x87c3ffff, reason: Ring hung, action: reset
[ 4.742040] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 4.742041] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 4.742042] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 4.742042] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 4.742043] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 4.743293] drm/i915: Resetting chip after gpu hang
2. crash dump - attached
3. config files for the first and second (kexec-ed) kernel - attached
Created attachment 123063 [details]
config of kernel 1 - boot kernel
Created attachment 123064 [details]
config of kernel 2 - kexec-ed kernel
Please attach dmesg all the way from boot #1 to the problem, with drm.debug=14 module parameter set.
kexec may be a hard one to fix.
It looked like the GPU was still executing a batch when kexec took over.
Created attachment 123133 [details]
dmesg of the kexec-ed kernel, normal debug level
I apologize, but I am not able to produce a high-debug level dmesg right now. I will do so and attach as soon as I am able to.
Btw I tried kexec-ing the second kernel from the same kernel. The system I tried this had 1-2 days of uptime, running X and other stuff. The system froze totally with a black screen.
After a restart (power cycle) I tried again. This time the kexec worked as a charm - booted without any disturbing messages in the dmesg, not even the warning in the bug. I tried several more times and the result was correctly kexec-ed kernel and running system.
(In reply to dephlector from comment #6)
> I apologize, but I am not able to produce a high-debug level dmesg right
> now. I will do so and attach as soon as I am able to.
> Btw I tried kexec-ing the second kernel from the same kernel. The system I
> tried this had 1-2 days of uptime, running X and other stuff. The system
> froze totally with a black screen.
> After a restart (power cycle) I tried again. This time the kexec worked as a
> charm - booted without any disturbing messages in the dmesg, not even the
> warning in the bug. I tried several more times and the result was correctly
> kexec-ed kernel and running system.
Closing this bug since it looks like it is working fine. However, if this is happening again, please reopen the bug and attached all necessary logs.
Moreover, in this case, please ensure that you are using latest kernel since there was some work done that may already resolve it.