Created attachment 120043
GPU crash dump from /sys/class/drm/card0/error

I've just experienced another momentary blankness followed by a scrambled screen. Forcing windows to redraw often helps restore order, but scrolling vim windows and things often brings back the scrambled text, and highlighting articles in claws-mail also scrambles them across the screen (including outside the claws-mail window). My web browser seems largely immune to scrambling problems, once I've refreshed the initially scrambled window.

This happens on average once a week, and usually as I'm just winding up work, but I've not identified a specific time yet. It's been happening for at least a few months now, and even after a number of xorg and mesa updates, it's no better or worse than it first was.

Here's my dmesg:

[25644.997531] [drm] stuck on render ring
[25645.004709] [drm] GPU HANG: ecode 3:0:0x6affbfc1, in Xorg [295], reason: Ring hung, action: reset
[25645.004720] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[25645.004725] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[25645.004730] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[25645.004735] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[25645.004740] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[25645.004848] ------------[ cut here ]------------
[25645.004912] WARNING: CPU: 0 PID: 3846 at drivers/gpu/drm/i915/intel_display.c:3291 intel_crtc_wait_for_pending_flips+0x16c/0x200 [i915]()
[25645.004918] WARN_ON(ret)
[25645.004923] Modules linked in:
[25645.004929]  sha256_generic hmac drbg ansi_cprng ctr ccm joydev mousedev iTCO_wdt iTCO_vendor_support uvcvideo videobuf2_vmalloc videobuf2_memops arc4 videobuf2_core v4l2_common videodev rt2800pci rt2800mmio media rt2800lib coretemp evdev uas rt2x00pci input_leds rt2x00mmio mac_hid rt2x00lib pcspkr serio_raw psmouse i915 mac80211 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core cfg80211 i2c_i801 snd_hwdep snd_pcm lpc_ich rng_core drm_kms_helper snd_timer eeprom_93cx6 crc_ccitt atl1e thermal drm snd eeepc_laptop soundcore sparse_keymap battery intel_agp intel_gtt led_class agpgart rfkill shpchp ac i2c_algo_bit video button acpi_cpufreq processor sch_fq_codel ip_tables x_tables ext4 crc16 mbcache jbd2 ata_generic pata_acpi sd_mod usb_storage atkbd libps2 ata_piix
[25645.005092]  libata ehci_pci uhci_hcd ehci_hcd scsi_mod usbcore usb_common i8042 serio
[25645.005120] CPU: 0 PID: 3846 Comm: kworker/u4:1 Not tainted 4.2.5-1-ARCH #1
[25645.005128] Hardware name: ASUSTeK Computer INC. 901/901, BIOS 2103 06/11/2009
[25645.005180] Workqueue: i915-hangcheck i915_hangcheck_elapsed [i915]
[25645.005190]  c1631967 b8222fb0 00000000 f2c97d50 c14c8e8d f2c97d90 f2c97d80 c1058457
[25645.005210]  f874675a f2c97db0 00000f06 f8751b28 00000cdb f86edb1c f86edb1c f4fa0000
[25645.005229]  f4df1034 00000001 f2c97d9c c10584ce 00000009 f2c97d90 f874675a f2c97db0
[25645.005247] Call Trace:
[25645.005265]  [<c14c8e8d>] dump_stack+0x48/0x69
[25645.005277]  [<c1058457>] warn_slowpath_common+0x87/0xc0
[25645.005335]  [<f86edb1c>] ? intel_crtc_wait_for_pending_flips+0x16c/0x200 [i915]
[25645.005389]  [<f86edb1c>] ? intel_crtc_wait_for_pending_flips+0x16c/0x200 [i915]
[25645.005402]  [<c10584ce>] warn_slowpath_fmt+0x3e/0x60
[25645.005456]  [<f86edb1c>] intel_crtc_wait_for_pending_flips+0x16c/0x200 [i915]
[25645.005484]  [<f82ddba4>] ? drm_modeset_lock_all_crtcs+0x84/0x90 [drm]
[25645.005540]  [<f86eeea4>] intel_crtc_disable_planes+0x34/0xf0 [i915]
[25645.005593]  [<f86ef00a>] intel_prepare_reset+0x6a/0x80 [i915]
[25645.005641]  [<f86c2877>] i915_handle_error+0x147/0x6e0 [i915]
[25645.005657]  [<c10ad457>] ? vprintk_default+0x37/0x40
[25645.005705]  [<f86c3093>] i915_hangcheck_elapsed+0x233/0x410 [i915]
[25645.005719]  [<c106da4a>] process_one_work+0x11a/0x3f0
[25645.005730]  [<c106dd57>] worker_thread+0x37/0x470
[25645.005740]  [<c106dd20>] ? process_one_work+0x3f0/0x3f0
[25645.005750]  [<c1072df6>] kthread+0xa6/0xc0
[25645.005761]  [<c1079ff5>] ? finish_task_switch+0x55/0x190
[25645.005773]  [<c14cddc1>] ret_from_kernel_thread+0x21/0x30
[25645.005783]  [<c1072d50>] ? kthread_worker_fn+0x140/0x140
[25645.005792] ---[ end trace ae2bba7cddb23771 ]---
[25645.280403] drm/i915: Resetting chip after gpu hang

My crash dump is attached.
Just happened again, so I don't think clock time or uptime is relevant. The stack trace is identical, but this time it happened on CPU 1 (single-core machine, but hyperthreading makes it appear as CPUs 0 and 1), and unsurprisingly the PID was different. I've taken a copy of the crash dump, which I can attach if anyone needs it.
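For comparing one occurrence against another, a minimal sketch in Python that pulls the fields out of a "GPU HANG" line in dmesg. The regular expression is only an assumption based on the single line quoted in this report, and the name parse_gpu_hang is made up for illustration; other kernel versions may format the line differently.

```python
import re

# Pattern modelled on the one "GPU HANG" line quoted in this report.
# Assumption: other kernels may word or order these fields differently.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], reason: (?P<reason>[^,]+), "
    r"action: (?P<action>\S+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line does not match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    info["ts"] = float(info["ts"])   # seconds since boot
    info["pid"] = int(info["pid"])   # PID of the process named in the hang
    return info


sample = (
    "[25645.004709] [drm] GPU HANG: ecode 3:0:0x6affbfc1, in Xorg [295], "
    "reason: Ring hung, action: reset"
)
hang = parse_gpu_hang(sample)
print(hang["ecode"], hang["proc"], hang["reason"])
```

Running it over the saved dmesg of each occurrence and comparing the ecode and reason fields would show whether the hangs really are the same each time.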
This happened again last evening. Very nearly a month since the last crash!
And again, same traceback, same work queue. I've no idea what I'm doing differently on the days when it fails compared to the days when I can get 18+ hours of uptime without a failure.
*** This bug has been marked as a duplicate of bug 90841 ***