Bug 90297 - GPU HANG: ecode 4:0:0xfdfffffb, in Xorg, reason: Ring hung, action: reset
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-04 10:54 UTC by Nick Gazaloff
Modified: 2016-09-28 13:49 UTC

See Also:
i915 platform: G45
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (1.61 MB, text/plain)
2015-05-04 10:54 UTC, Nick Gazaloff

Description Nick Gazaloff 2015-05-04 10:54:58 UTC
Created attachment 115529
/sys/class/drm/card0/error

[263398.820023] [drm] stuck on render ring
[263398.821757] [drm] GPU HANG: ecode 4:0:0xfdfffffb, in Xorg [1296], reason: Ring hung, action: reset
[263398.821760] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[263398.821761] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[263398.821763] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[263398.821764] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[263398.821766] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[263398.821877] drm/i915: Resetting chip after gpu hang
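
For triage, the fields of the GPU HANG line above can be pulled apart mechanically. The following is a minimal sketch in Python, assuming the message format seen in this one log; the function name `parse_gpu_hang` and the regular expression are illustrative and are not part of any i915 tooling, and other kernel versions may word the message differently.

```python
import re

# Pattern modelled on the single "GPU HANG" line in this report (an assumption);
# other kernel versions may format the message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)

def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])  # the PID is the only numeric field
    return fields

# The line is copied from the dmesg excerpt in this report.
line = ("[263398.821757] [drm] GPU HANG: ecode 4:0:0xfdfffffb, in Xorg [1296], "
        "reason: Ring hung, action: reset")
print(parse_gpu_hang(line))
```

Lines that do not carry a hang message, such as the "stuck on render ring" line, simply yield `None`, so the function can be mapped over a whole dmesg dump.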


00:02.0 VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device 8336
	Flags: bus master, fast devsel, latency 0, IRQ 27
	Memory at fe400000 (64-bit, non-prefetchable) [size=4M]
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	I/O ports at dc00 [size=8]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [a4] PCI Advanced Features
	Kernel driver in use: i915

Comment 1 Chris Wilson 2015-05-04 12:24:11 UTC
A cacheline is missing in the batchbuffer; the GPU dies on the MI_FLUSH following the corrupt command stream.

Can you please double check that the bug is still reproducible with drm-intel-nightly (kernel) and xf86-video-intel.git? Then apply

diff --git a/src/sna/kgem.c b/src/sna/kgem.c
index d190255..de9977c 100644
--- a/src/sna/kgem.c
+++ b/src/sna/kgem.c
@@ -83,7 +83,7 @@ search_snoop_cache(struct kgem *kgem, unsigned int num_pages, 
 #define DBG_NO_FAST_RELOC 0
 #define DBG_NO_HANDLE_LUT 0
 #define DBG_NO_WT 0
-#define DBG_NO_WC_MMAP 0
+#define DBG_NO_WC_MMAP 1
 #define DBG_NO_BLT_Y 0
 #define DBG_NO_SCANOUT_Y 0
 #define DBG_NO_DETILING 0

to xf86-video-intel.

This will hopefully narrow down the issue to the recent mmap(wc).

Comment 2 Ileana 2016-04-19 11:09:26 UTC
Any updates? Is this still an issue?

Comment 3 yann 2016-09-28 13:49:02 UTC
Timeout. Assuming that it is fixed by now. If this is not the case, please re-test with the latest kernel and Mesa to see whether the issue still occurs, since improvements that should benefit your system have been pushed to both.

