Created attachment 107080 [details]
/sys/class/drm/card0/error

Kernel Linux 3.16.3-gnu; distribution Parabola GNU/Linux (xf86-video-intel 2.99.916-2, xorg-server-libre 1.16.0-6, mesa 10.2.8-1, libdrm 2.4.56-1).

From the log:

kernel: Linux agpgart interface v0.103
kernel: agpgart-intel 0000:00:00.0: Intel 845G Chipset
kernel: agpgart-intel 0000:00:00.0: detected gtt size: 131072K total, 131072K mappable
kernel: agpgart-intel 0000:00:00.0: detected 512K stolen memory
kernel: agpgart-intel 0000:00:00.0: AGP aperture is 128M @ 0xe0000000
kernel: [drm] Initialized drm 1.1.0 20060810
kernel: [drm] Memory usable by graphics device = 128M
kernel: [drm] Replacing VGA console driver
kernel: Console: switching to colour dummy device 80x25
kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
kernel: [drm] Driver supports precise vblank timestamp query.
kernel: i915 0000:00:02.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
kernel: [drm] failed to find VBIOS tables
kernel: vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
kernel: [drm] initialized overlay support
kernel: [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 255
kernel: Raw EDID:
[...]
kernel: [drm] Got external EDID base block and 0 extensions from "edid/edid.bin" for connector "VGA-1"
kernel: i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
kernel: i915 0000:00:02.0: registered panic notifier
kernel: [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

and:

kernel: [drm] stuck on render ring
kernel: [drm] GPU HANG: ecode 0:0x422b7fc1, in Xorg.bin [131], reason: Ring hung, action: reset
kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
kernel: [drm:i915_reset] *ERROR* Failed to reset chip: -19

and then (like bug 82095):

kernel: [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[...]
kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Command parser error, iir 0x00008000, action: continue
kernel: i915: render error detected, EIR: 0x00000010
kernel: [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Command parser error, iir 0x00008000, action: continue
kernel: i915: render error detected, EIR: 0x00000010

It happens at random.
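
For anyone comparing hangs across reports, the "GPU HANG" lines quoted above can be pulled out of a saved kernel log with a short script. This is only a rough sketch: the pattern is inferred from the two message shapes in this report and may not match other kernel versions, and the names `parse_gpu_hangs`, `HANG_RE`, `SAMPLE_LOG` and the field name `prefix` are made up for illustration (the report does not say what the number before the colon in `ecode` means).

```python
import re

# Pattern inferred from the GPU HANG lines quoted in this report.
# "prefix" is the number before the colon in "ecode N:0x..."; its meaning
# is not stated in the report, so the name is only a placeholder.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<prefix>-?\d+):(?P<ecode>0x[0-9a-fA-F]+)"
    r"(?:, in (?P<process>\S+) \[(?P<pid>\d+)\])?"  # process/pid is optional
    r", reason: (?P<reason>.*?)"
    r", action: (?P<action>\w+)"
)

# Two lines copied from the log in this report.
SAMPLE_LOG = (
    "kernel: [drm] GPU HANG: ecode 0:0x422b7fc1, in Xorg.bin [131], "
    "reason: Ring hung, action: reset\n"
    "kernel: [drm] GPU HANG: ecode -1:0x00000000, "
    "reason: Command parser error, iir 0x00008000, action: continue\n"
)


def parse_gpu_hangs(log_text):
    """Return one dict per GPU HANG line found in log_text."""
    return [m.groupdict() for m in HANG_RE.finditer(log_text)]


if __name__ == "__main__":
    for hang in parse_gpu_hangs(SAMPLE_LOG):
        print(hang["ecode"], hang["reason"], hang["action"])
```

The process name and PID are optional in the pattern because the "Command parser error" lines in this report do not carry them.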
Created attachment 107081 [details] Xorg log
Should be fixed with

commit c4d69da167fa967749aeb70bc0e94a457e5d00c1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Sep 8 14:25:41 2014 +0100

    drm/i915: Evict CS TLBs between batches

    Running igt, I was encountering the invalid TLB bug on my 845g, despite that it was using the CS workaround. Examining the w/a buffer in the error state, showed that the copy from the user batch into the workaround itself was suffering from the invalid TLB bug (the first cacheline was broken with the first two words reversed). Time to try a fresh approach. This extends the workaround to write into each page of our scratch buffer in order to overflow the TLB and evict the invalid entries. This could be refined to only do so after we update the GTT, but for simplicity, we do it before each batch.

    I suspect this supersedes our current workaround, but for safety keep doing both.

    v2: The magic number shall be 2.

    This doesn't conclusively prove that it is the mythical TLB bug we've been trying to workaround for so long, that it requires touching a number of pages to prevent the corruption indicates to me that it is TLB related, but the corruption (the reversed cacheline) is more subtle than a TLB bug, where we would expect it to read the wrong page entirely.

    Oh well, it prevents a reliable hang for me and so probably for others as well.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>

I believe.
Likely fixed, closing the report...
Closing resolved after a year.