Created attachment 140080 [details]
dmesg and /sys/class/drm/card0/error

The UI on my GIGABYTE Q21B/Q21B is always broken. dmesg shows GPU hang errors, and the GPU is reset multiple times after the hang.

model name: GIGABYTE Q21B/Q21B

dmesg:
[ 35.849634] [drm] GPU HANG: ecode 8:1:0x21204b77, in X [437], reason: Hang on bcs0, action: reset
[ 35.849639] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 35.849640] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 35.849642] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 35.849643] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 35.849645] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 35.849663] i915 0000:00:02.0: Resetting bcs0 after gpu hang
[ 42.848209] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[ 51.840245] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[ 61.824125] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[ 70.848282] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[ 79.840180] i915 0000:00:02.0: Resetting rcs0 after gpu hang
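For anyone triaging a similar log, here is a minimal Python sketch that pulls the hang and reset events out of dmesg text. It is not part of the original report. The regular expressions are written only against the message wording quoted above, and the function name `summarize_hangs` and the `SAMPLE` string are hypothetical; other kernel versions may word these messages differently.

```python
import re

# Patterns assume the exact message wording quoted in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)
RESET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] i915 \S+ Resetting (?P<engine>\w+) after gpu hang"
)

# Three lines copied from the dmesg excerpt in this report.
SAMPLE = """\
[ 35.849634] [drm] GPU HANG: ecode 8:1:0x21204b77, in X [437], reason: Hang on bcs0, action: reset
[ 35.849663] i915 0000:00:02.0: Resetting bcs0 after gpu hang
[ 42.848209] i915 0000:00:02.0: Resetting rcs0 after gpu hang
"""


def summarize_hangs(dmesg_text):
    """Return (hangs, resets) found in a dmesg dump.

    hangs  -- list of dicts with ecode, comm, pid, reason, action
    resets -- list of (timestamp_seconds, engine_name) tuples
    """
    hangs = [m.groupdict() for m in HANG_RE.finditer(dmesg_text)]
    resets = [(float(m.group("ts")), m.group("engine"))
              for m in RESET_RE.finditer(dmesg_text)]
    return hangs, resets


if __name__ == "__main__":
    hangs, resets = summarize_hangs(SAMPLE)
    print(hangs)
    print(resets)
```

Run against the full log above, this would show one reported hang on bcs0 followed by repeated rcs0 resets several seconds apart.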
blt command stream:
  IDLE?: no
  START: 0x00009000
  HEAD: 0x00000068 [0x00000000]
    head = 0x00000068, wraps = 0
  TAIL: 0x00000130 [0x00000078, 0x00000098]
  CTL: 0x00003001
    len=16384, enabled
  MODE: 0x00000000
  HWS: 0x00004000
  ACTHD: 0x00000000 0008801c
    at ring: 0x00000000
  IPEIR: 0x00000008
  IPEHR: 0xdedfb480
  INSTDONE: 0xfffffff7
    busy: HS
  batch: [0x00000000_00000000, 0x00000000_00001000]
  BBADDR: 0x00000000_00088019
  BB_STATE: 0x00000020
  INSTPS: 0x00000000
  INSTPM: 0x00000000
  FADDR: 0x00000000 00088200
  RC PSMI: 0x00000010
  FAULT_REG: 0x000000c9
    Valid
    Invalid PTE Fault
    Engine GFX
    Source ID 25
  SYNC_0: 0x00000000
  SYNC_1: 0x00000000
  SYNC_2: 0x00000000
  GFX_MODE: 0x00008000
  PDP0: 0x000000007bfd9000
  PDP1: 0x000000007bfe6000
  PDP2: 0x000000007bfe6000
  PDP3: 0x000000007bfe6000
  seqno: 0x00000002
  last_seqno: 0x00000005
  waiting: yes
  ring->head: 0x00000000
  ring->tail: 0x00000130
  hangcheck stall: yes
  hangcheck action: dead
  hangcheck action timestamp: 4294899752, 4725752 ms ago
  engine reset count: 0
  ELSP[0]: pid 437, ban score 0, seqno 2:00000005, prio 1024, emitted 4726568ms ago, head 000000e8, tail 00000130
  Active context: X[437] user_handle 0 hw_id 2, prio 0, ban score 0 guilty 0 active 0

Another early batch (first user), IPEHR of garbage, a page fault. Houston we have problem.
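As a rough illustration of how the FAULT_REG value in the dump maps to the decoded text, here is a small Python sketch. The bit layout (bit 0 valid, bits 2:1 fault type, bits 10:3 source ID, bits 14:12 engine ID) is an assumption based on the gen8-era i915 register macros and should be checked against the kernel source for the hardware in question. The function name `decode_fault_reg` is hypothetical.

```python
def decode_fault_reg(value):
    """Split a FAULT_REG value into fields.

    The field positions are an assumed layout, not taken from this
    bug report; verify them against the i915 register definitions.
    """
    return {
        "valid": bool(value & 0x1),         # bit 0: fault is valid
        "fault_type": (value >> 1) & 0x3,   # bits 2:1: fault type code
        "source_id": (value >> 3) & 0xFF,   # bits 10:3: source ID
        "engine_id": (value >> 12) & 0x7,   # bits 14:12: engine ID
    }


if __name__ == "__main__":
    # Value copied from the error state above.
    print(decode_fault_reg(0x000000C9))
```

Under that assumed layout, 0x000000c9 decodes to a valid fault with source ID 25, which agrees with the "Valid" and "Source ID 25" lines in the dump.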
Please build a kernel from https://cgit.freedesktop.org/drm-tip and test. It will be important later on for testing patches, anyway.
See also #106828
Created attachment 140086 [details]
dmesg, /sys/class/drm/card0/error, and build config

Thanks for your quick reply. I built kernel 4.17.0-rc7+ from https://cgit.freedesktop.org/drm-tip. Unfortunately, the issue still persists. I have uploaded another attachment containing dmesg, the error dump, and my build config. I hope it helps to resolve this issue.
Hmm. Can you do a quick run with CONFIG_INTEL_IOMMU disabled (./scripts/config -d CONFIG_INTEL_IOMMU)?
Oh, and just in case I forget later, this is no longer hanging on the first batch.
(In reply to Chris Wilson from comment #6)
> Oh, and just in case I forget later, this is no longer hanging on the first
> batch.

Yes it is. Just rcs this time, and not a garbage IPEHR.
Disabling CONFIG_INTEL_IOMMU did not help; the issue still occurs.
Can you re-attach the dmesg log after adding the following kernel parameters: drm.debug=0x1e log_buf_len=4M
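For readers wondering what drm.debug=0x1e selects, here is a minimal Python sketch that decodes the bitmask. The category names and bit values are an assumption based on the DRM_UT_* definitions in kernels of roughly this era and may differ in other versions. The names `DRM_DEBUG_BITS` and `decode_drm_debug` are hypothetical.

```python
# Assumed mapping of drm.debug bits to categories (DRM_UT_* in the kernel);
# check drm_print.h for the kernel version actually in use.
DRM_DEBUG_BITS = [
    (0x01, "CORE"),
    (0x02, "DRIVER"),
    (0x04, "KMS"),
    (0x08, "PRIME"),
    (0x10, "ATOMIC"),
    (0x20, "VBL"),
    (0x40, "STATE"),
    (0x80, "LEASE"),
]


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by mask."""
    return [name for bit, name in DRM_DEBUG_BITS if mask & bit]


if __name__ == "__main__":
    print(decode_drm_debug(0x1E))
```

Under that mapping, 0x1e enables the driver, KMS, PRIME, and atomic categories while leaving the core messages off. The larger log_buf_len keeps the extra output from overflowing the kernel ring buffer before it can be captured.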
Also, does this happen every time?
Created attachment 140130 [details]
Debug log after DRM's log enabled.

I've uploaded the dmesg log captured with DRM debug logging enabled.
(In reply to James Ausmus from comment #10)
> Also, does this happen every time?

Yes, the issue happens every time startx is run.
Created attachment 140131 [details]
Xorg.0.log

Uploaded Xorg.0.log.
Thanks for the additional details and logs!
Hi, do you have any updates on this?
Francesco, any updates on this issue?
(In reply to circle_chen from comment #15)
> HI,
> Do you have any updates on it?

Can you try this again, but this time stick to the modesetting driver (e.g. remove xorg-x11-drv-intel on Fedora or xserver-xorg-video-intel on Debian)? Please attach the Xorg logs as well.
Created attachment 142274 [details]
Add Xorg.0.log for disabled Intel driver.

I've added the log taken with the Intel driver disabled. (The screen hangs on a blank cursor.)
Hmm, it looks like our options for narrowing down the problem are dwindling now that the DDX is apparently innocent. At this point I'd just like to verify that this is not a hardware problem of some sort in your system. Can you drop your run level down to the command line and run the IGT[1] testdisplay test as root?

igt/tests$ ./testdisplay

You also reported the hang on a blank cursor? You can try running tests/kms_cursor_crc as well.

If one or both tests trigger a hang, please attach /sys/class/drm/card0/error.

[1] https://github.com/freedesktop/xorg-intel-gpu-tools
circle_chen, were you able to try Abdiel's test?
Created attachment 142591 [details]
attachment-21972-0.html

Yes, sorry for the late reply. I will run the test next week.
Ping?
Hi, I downloaded xorg-intel-gpu-tools-intel-gpu-tools-1.19 and compiled it successfully. I then booted the kernel, installed the tools and libraries into the file system, and ran ./testdisplay, but got "syntax error: unexpected end of file". I am not sure what is wrong in my environment. Could you provide a boot image so that we can test more easily?
If you didn't manage to compile IGT, please use your own distro's package binaries
Circle Chen, any updates here? Were you able to run the tests as in Comment 19? Feedback is needed to proceed further with this bug.
No feedback for more than 2 months, closing this bug as WORKSFORME. If you experience the same problem with drm-tip, please attach the dmesg log from boot with kernel parameters drm.debug=0x1e log_buf_len=4M. Remember to attach the error file and Xorg log as well.