Created attachment 140304 [details]
Crash file extracted from /sys/class/drm/card0/error

On 4.18.0-rc2 and rc1 this error occurs during kernel boot:

[ 0.342047] [drm] VT-d active for gfx access
[ 0.342049] fb: switching to inteldrmfb from EFI VGA
[ 0.342092] [drm] Replacing VGA console driver
[ 0.342835] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 0.342835] [drm] Driver supports precise vblank timestamp query.
[ 0.343538] [drm] Finished loading DMC firmware i915/skl_dmc_ver1_27.bin (v1.27)
[ 0.343787] [drm] Disabling framebuffer compression (FBC) to prevent screen flicker with VT-d enabled
[ 4.801171] [drm] GPU HANG: ecode 9:0:0xfffffffe, reason: hang on rcs0, bcs0, vcs0, vecs0, action: reset
[ 4.801172] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 4.801172] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 4.801173] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 4.801173] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 4.801173] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 10.817596] [drm] Initialized i915 1.6.0 20180514 for 0000:00:02.0 on minor 0
[ 10.847297] fbcon: inteldrmfb (fb0) is primary device
[ 10.927807] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[ 11.207607] ata1.00: supports DRM functions and may not be fully accessible
[ 11.210268] ata1.00: supports DRM functions and may not be fully accessible

All 4.17.X kernels boot fine. The GPU crash file is attached.
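For anyone comparing boot logs across kernels, the hang line quoted above can be picked apart with a small script. This is only a sketch: the regular expression is modeled on the single "GPU HANG" line in this report, the names HANG_RE and parse_gpu_hang are made up for illustration, and other kernel versions may format the message differently.

```python
import re

# Pattern modeled on the one "GPU HANG" line quoted in this report.
# Assumption: other kernels may word this message differently.
HANG_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+\[drm\]\s+GPU HANG:\s+"
    r"ecode\s+(?P<ecode>\S+),\s+"
    r"reason:\s+hang on (?P<engines>.+?),\s+"
    r"action:\s+(?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG dmesg line, or None if it does not match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return {
        "time": float(m.group("time")),   # seconds since boot
        "ecode": m.group("ecode"),        # kept as the raw string
        "engines": [e.strip() for e in m.group("engines").split(",")],
        "action": m.group("action"),
    }


sample = ("[ 4.801171] [drm] GPU HANG: ecode 9:0:0xfffffffe, "
          "reason: hang on rcs0, bcs0, vcs0, vecs0, action: reset")
print(parse_gpu_hang(sample))
```

On the sample line this lists all four engines (rcs0, bcs0, vcs0, vecs0) as hung at once.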
The GPU didn't even pretend to start. One quick test would be booting with intel_iommu=igfx_off, though it doesn't seem that related in this case. A bisection would be very, very helpful.
I assume you are using drm-tip as in rc2? As Chris said, it would be nice to get this bisected. Can you confirm you are using https://cgit.freedesktop.org/drm-tip and also send a dmesg captured with drm.debug=0x1e log_buf_len=4M?
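As a side note on what drm.debug=0x1e asks for, the value is a bitmask of DRM debug categories. The sketch below decodes it, assuming the bit layout given in the DRM core's description of the drm.debug module parameter. The table DRM_DEBUG_BITS and the function decode_drm_debug are illustrative names, and only the lower category bits are listed.

```python
# Assumed bit layout, taken from the DRM core's drm.debug parameter description.
# Only the lower category bits are listed; anything else is reported as unknown.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Split a drm.debug mask into known category names and leftover bits."""
    names = [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]
    known = sum(DRM_DEBUG_BITS)  # OR of all listed bits (they do not overlap)
    return names, mask & ~known


names, unknown = decode_drm_debug(0x1E)
print(names, hex(unknown))
```

Under that assumption, 0x1e enables driver, KMS, PRIME and atomic messages while leaving the noisier core and vblank categories off.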
Created attachment 140338 [details] dmesg with the drm.debug=0x1e log_buf_len=4M
The kernel is 4.18.0-rc2 from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6. I will work on bisecting the kernel.
Thank you
Sorry for the delay. I finally narrowed it down to this commit: ab96746aaa344fb720a198245a837e266fad3b62. I made a patch from the commit and uploaded it here (0001-iommu-vt-d-Clean-up-pasid-quirk-for-pre-production-d.patch). I think it has nothing to do with the GPU and is related to the IOMMU. Should we close this bug and open one against the kernel?
Created attachment 140373 [details] [review] Patch for the commit that, if reverted, fixes the problem
I also tested the latest kernel, 4.18.0-rc2+ (commit 813835028e9ae1f18cd11bb0ec591d0f0577d96a), with the patch reverse-applied, and the kernel boots correctly without any GPU failures.
I think "not our bug" is a good resolution for this, and you should open a new bug against the kernel.
Alexandre, can you close this if you agree?
Alexandre, did you report another bug about this? Please add a reference here. Thanks.
I'm not the reporter, but I found the new bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=200327 No idea whether he took it to email or not.
I was traveling. I opened that bug but did not send an email. Nobody has picked up the bug report yet. Can you send an email?
*** Bug 107147 has been marked as a duplicate of this bug. ***