96452 – [BSW] Screen freezes after updating to 1.18.3 [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [319], reason: Ring hung, action: reset

Status:	CLOSED FIXED

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	unspecified
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium major
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

URL:
Whiteboard:	drm, gpu hang, ring hung, reset
Keywords:

Depends on:
Blocks:

Reported:	2016-06-09 13:13 UTC by Tobias Predel
Modified:	2017-07-27 14:31 UTC
CC List:	4 users

See Also:
i915 platform:	BSW/CHT
i915 features:	GPU hang

Attachments
more detailed dmesg of another drm crash (18.96 KB, text/plain) 2016-06-10 07:58 UTC, Tobias Predel
Crash dump file (376.93 KB, text/plain) 2016-06-10 08:01 UTC, Tobias Predel
sys-class-and-dmesg (696.78 KB, text/plain) 2017-02-13 21:06 UTC, clavi
hang-every (2.09 KB, text/plain) 2017-02-13 21:07 UTC, clavi

Description Tobias Predel 2016-06-09 13:13:44 UTC

Hello,

Since today, the screen constantly freezes whenever more intensive drawing is requested. My operating system is Arch Linux with kernel 4.4.12-1-lts (the problem also occurs on 4.6.1), and the Xorg server version is 1.18.3, distribution-specific patch level 2, which, as far as I understand, carries a patch to make glamor initialize correctly (see https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/xorg-server&id=cd174c4fbc38ec316dc6ad7fcb4709516fed8389).

The dmesg from the previous boot is included below; the GPU crash dump was deleted after the reboot.

I hope that this issue is going to be fixed quickly.
Thanks for all your help and work!

Jun 09 14:44:28 localhost kernel: [drm] stuck on render ring
Jun 09 14:44:28 localhost kernel: [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [319], reason: Ring hung, action: reset
Jun 09 14:44:28 localhost kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jun 09 14:44:28 localhost kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jun 09 14:44:28 localhost kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jun 09 14:44:28 localhost kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jun 09 14:44:28 localhost kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jun 09 14:44:28 localhost kernel: drm/i915: Resetting chip after gpu hang
Jun 09 14:44:36 localhost kernel: [drm] stuck on render ring
Jun 09 14:44:36 localhost kernel: [drm] GPU HANG: ecode 8:0:0xfffffffe, in mupdf [11347], reason: Ring hung, action: reset
Jun 09 14:44:36 localhost kernel: drm/i915: Resetting chip after gpu hang
Jun 09 14:44:44 localhost kernel: [drm] stuck on render ring
Jun 09 14:44:44 localhost kernel: [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [319], reason: Ring hung, action: reset
Jun 09 14:44:44 localhost kernel: ------------[ cut here ]------------
Jun 09 14:44:44 localhost kernel: WARNING: CPU: 0 PID: 240 at drivers/gpu/drm/i915/intel_display.c:11384 intel_mmio_flip_work_func+0x442/0x4c0 [i915]
Jun 09 14:44:44 localhost kernel: WARN_ON(__i915_wait_request(mmio_flip->req, mmio_flip->crtc->reset_counter, false, ((void *)0), &mmio_flip->i915->rps.mmioflips))
Jun 09 14:44:44 localhost kernel: Modules linked in:
Jun 09 14:44:44 localhost kernel:  fuse snd_hda_codec_hdmi arc4 snd_hda_codec_realtek snd_hda_codec_generic uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core btusb videodev btrtl rndis_host btbcm cdc_ether usbnet btintel media mii bluetooth joydev mousedev iTCO_wdt hid_multitouch iTCO_vendor_support intel_rapl asus_nb_wmi intel_powerclamp coretemp kvm_intel asus_wmi sparse_keymap kvm irqbypass crct10dif_pclmul ath10k_pci crc32_pclmul ath10k_core crc32c_intel ath ghash_clmulni_intel mac80211 nls_iso8859_1 aesni_intel nls_cp437 aes_x86_64 lrw gf128mul glue_helper vfat ablk_helper fat cryptd i915 evdev input_leds mac_hid drm_kms_helper serio_raw pcspkr cfg80211 snd_hda_intel drm intel_gtt snd_hda_codec syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit lpc_ich rfkill snd_hda_core mei_txe snd_hwdep
Jun 09 14:44:44 localhost kernel:  mei shpchp processor_thermal_device i2c_i801 intel_soc_dts_iosf elan_i2c wmi thermal snd_intel_sst_acpi snd_intel_sst_core i2c_hid snd_soc_sst_mfld_platform hid snd_soc_sst_match snd_soc_core snd_compress snd_pcm_dmaengine ac97_bus snd_pcm snd_timer battery snd fjes video ac i2c_designware_platform soundcore i2c_designware_core spi_pxa2xx_platform int3400_thermal tpm_crb acpi_thermal_rel fan int3403_thermal int340x_thermal_zone tpm_tis pinctrl_cherryview tpm asus_wireless processor button sch_fq_codel ip_tables x_tables ext4 crc16 jbd2 mbcache mmc_block atkbd libps2 xhci_pci xhci_hcd usbcore usb_common i8042 serio sdhci_acpi sdhci led_class mmc_core
Jun 09 14:44:44 localhost kernel: CPU: 0 PID: 240 Comm: kworker/0:3 Not tainted 4.6.1-2-ARCH #1
Jun 09 14:44:44 localhost kernel: Hardware name: ASUSTeK COMPUTER INC. E205SA/E205SA, BIOS E205SA.206 09/04/2015
Jun 09 14:44:44 localhost kernel: Workqueue: events intel_mmio_flip_work_func [i915]
Jun 09 14:44:44 localhost kernel:  0000000000000286 00000000cd81051a ffff880078f57d18 ffffffff812e5452
Jun 09 14:44:44 localhost kernel:  ffff880078f57d68 0000000000000000 ffff880078f57d58 ffffffff8107a69b
Jun 09 14:44:44 localhost kernel:  00002c7878f57d20 ffff880078e8cd80 ffff880047667500 ffff8800717573c0
Jun 09 14:44:44 localhost kernel: Call Trace:
Jun 09 14:44:44 localhost kernel:  [<ffffffff812e5452>] dump_stack+0x63/0x81
Jun 09 14:44:44 localhost kernel:  [<ffffffff8107a69b>] __warn+0xcb/0xf0
Jun 09 14:44:44 localhost kernel:  [<ffffffff8107a71f>] warn_slowpath_fmt+0x5f/0x80
Jun 09 14:44:44 localhost kernel:  [<ffffffff8102d76d>] ? __switch_to+0x29d/0x430
Jun 09 14:44:44 localhost kernel:  [<ffffffffa057b4b2>] intel_mmio_flip_work_func+0x442/0x4c0 [i915]
Jun 09 14:44:44 localhost kernel:  [<ffffffff810939d5>] process_one_work+0x1e5/0x480
Jun 09 14:44:44 localhost kernel:  [<ffffffff81093cb8>] worker_thread+0x48/0x4e0
Jun 09 14:44:44 localhost kernel:  [<ffffffff81093c70>] ? process_one_work+0x480/0x480
Jun 09 14:44:44 localhost kernel:  [<ffffffff81093c70>] ? process_one_work+0x480/0x480
Jun 09 14:44:44 localhost kernel:  [<ffffffff81099968>] kthread+0xd8/0xf0
Jun 09 14:44:44 localhost kernel:  [<ffffffff815c74c2>] ret_from_fork+0x22/0x40
Jun 09 14:44:44 localhost kernel:  [<ffffffff81099890>] ? kthread_worker_fn+0x170/0x170
Jun 09 14:44:44 localhost kernel: ---[ end trace 9ae1fd7dd4e57921 ]---
Jun 09 14:44:44 localhost kernel: drm/i915: Resetting chip after gpu hang
Jun 09 14:44:52 localhost kernel: [drm] stuck on render ring
Jun 09 14:44:52 localhost kernel: [drm] GPU HANG: ecode 8:0:0xfffffffe, in Xorg [319], reason: Ring hung, action: reset
Jun 09 14:44:52 localhost kernel: [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
Jun 09 14:44:52 localhost kernel: drm/i915: Resetting chip after gpu hang
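
The log above reports hangs attributed to more than one process (Xorg and mupdf). As a rough aid for triage, the following is a minimal sketch that tallies `GPU HANG` lines per process from saved dmesg or journal text. The function name `summarize_hangs` and the regular expression are illustrative assumptions modelled only on the message format shown in this report; they are not part of any i915 tooling.

```python
import re
from collections import Counter

# Assumed pattern, derived only from the "GPU HANG" lines quoted in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def summarize_hangs(log_text):
    """Count GPU HANG lines per (process name, pid) pair.

    Hypothetical helper: log_text is the full text of a dmesg/journal dump.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            counts[(match.group("proc"), int(match.group("pid")))] += 1
    return counts
```

Calling `summarize_hangs(open("dmesg.txt").read())` on a saved log (the file name is only an example) returns a `Counter` keyed by process and PID, which makes it easier to see whether the hangs are tied to a single client or spread across several.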

Comment 1 Saul 2016-06-10 02:49:17 UTC

Same issue here:

[ 1108.779048] [drm] stuck on render ring 
[ 1108.786056] [drm] GPU HANG: ecode 8:0:0xfffffffe, in X [1897], reason: Ring hung, action: reset 
[ 1108.786063] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace. 
[ 1108.786067] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel 
[ 1108.786071] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue. 
[ 1108.786074] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it. 
[ 1108.786078] [drm] GPU crash dump saved to /sys/class/drm/card0/error 
[ 1108.787820] drm/i915: Resetting chip after gpu hang 
[ 1116.779124] [drm] stuck on render ring 
[ 1116.786003] [drm] GPU HANG: ecode 8:0:0xfffffffe, in X [1897], reason: Ring hung, action: reset 
[ 1116.786435] [drm:i915_set_reset_status] *ERROR* gpu hanging too fast, banning! 
[ 1116.788816] drm/i915: Resetting chip after gpu hang
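
The kernel timestamps in the log above show the two hangs roughly eight seconds apart, after which the driver reports "gpu hanging too fast, banning!". The sketch below only measures the spacing between consecutive `GPU HANG` lines in a log with `[ seconds.micros]` prefixes; the helper name `hang_intervals` is hypothetical, and the threshold the driver actually uses for banning is not stated in this thread, so none is assumed here.

```python
import re

# Assumed format: "[ 1108.786056] [drm] GPU HANG: ..." as quoted in this comment.
TS_HANG_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG:")


def hang_intervals(log_text):
    """Return the seconds elapsed between consecutive GPU HANG lines."""
    stamps = []
    for line in log_text.splitlines():
        match = TS_HANG_RE.match(line.strip())
        if match:
            stamps.append(float(match.group("ts")))
    # Pair each timestamp with its successor and take the difference.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]
```

Short, regular intervals such as the ones in this log suggest that the same workload is resubmitted right after each reset, which may be useful context when reporting.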

Comment 2 Tobias Predel 2016-06-10 07:58:03 UTC

Created attachment 124440 [details]
more detailed dmesg of another drm crash

I managed to capture the dmesg of another freeze by staying in a TTY.

Comment 3 Tobias Predel 2016-06-10 08:01:25 UTC

Created attachment 124441 [details]
Crash dump file
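
Since the first crash dump in this report was lost on reboot, it may help to copy the dump to persistent storage as soon as a hang is logged. The following is a minimal sketch, assuming the dump is readable at `/sys/class/drm/card0/error` (the path printed in the kernel messages above) and that the caller has permission to read it, which may require root. The function name `save_crash_dump` and the output file naming are hypothetical.

```python
import time
from pathlib import Path

# Path taken from the kernel message "GPU crash dump saved to ..." in this report.
DEFAULT_ERROR_PATH = Path("/sys/class/drm/card0/error")


def save_crash_dump(dest_dir, src=DEFAULT_ERROR_PATH):
    """Copy the GPU error state to dest_dir and return the new file's path.

    Hypothetical helper; the "i915-error-<timestamp>.txt" name is an assumption.
    """
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"i915-error-{stamp}.txt"
    # Read the whole dump in one go, then write it to persistent storage.
    dest.write_bytes(src.read_bytes())
    return dest
```

For example, `save_crash_dump("/var/tmp/gpu-dumps")` (the directory is only an example) would leave a timestamped copy that survives a reboot and can be attached to the bug.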

Comment 4 Saul 2016-06-10 22:00:07 UTC

For me the problem was kernel 4.6.0; downgrading made it go away.

Comment 5 clavi 2017-02-13 21:06:38 UTC

Created attachment 129565 [details]
sys-class-and-dmesg

Same problem here.

Kernel 4.4.46 (custom config) on Slackware 14.2, 64-bit, with the latest driver compiled from source, since the shipped one was freezing very often. Now it freezes after X days.

Comment 6 clavi 2017-02-13 21:07:03 UTC

Created attachment 129566 [details]
hang-every

Comment 7 Elizabeth 2017-07-26 20:38:47 UTC

Hello Tobias, Saul, Clavi,
Sorry for the very long delay on this bug. Is this problem still present on the latest kernel versions (https://www.kernel.org/)? The kernels mentioned in comments #4 and #5 are quite old, and a lot of changes have been made since then. The problem may already have been fixed in newer versions.
Thank you.

Comment 8 Saul 2017-07-26 20:52:36 UTC

Sorry, I don't even remember which of my boxes had this issue. No problems with my current hardware.

Comment 9 Elizabeth 2017-07-26 21:56:11 UTC

(In reply to Saul from comment #8)
> Sorry, I don't even remember which of my boxes had this issue. No
> problems with my current hardware.

Thanks for the information Saul.
I'll wait for replies from Tobias and Clavi; if there are none, I'll have to close this bug.

Comment 10 Tobias Predel 2017-07-27 11:19:18 UTC

I report no more freezes. Currently I'm using GNOME with Wayland.

Regards,
Tobias


Comment 11 Elizabeth 2017-07-27 14:31:24 UTC

(In reply to Tobias Predel from comment #10)
> I report no more freezes. Currently I'm using GNOME with Wayland.
> 
> Regards,
> Tobias
> ...
Thank you for the information Tobias,
Then I will proceed to close the bug since you are the original submitter.
To Clavi, if you keep having problems please open a new bug with HW and SW information and logs attached. Thank you.
