Bug 92262 - [HSW] GPU HANG: ecode 7:0:0x85dffffc, in chromium [6879], reason: Ring hung, action: reset
Status: RESOLVED INVALID
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 11.0
Hardware: All All
Importance: medium normal
Assignee: Ian Romanick
Reported: 2015-10-02 20:53 UTC by Alexander Mezin
Modified: 2017-02-10 22:38 UTC

i915 platform: HSW
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (gzipped) (482.98 KB, text/plain)
2015-10-02 20:53 UTC, Alexander Mezin
dmesg (68.53 KB, text/plain)
2015-10-02 20:54 UTC, Alexander Mezin

Description Alexander Mezin 2015-10-02 20:53:43 UTC
Created attachment 118639
/sys/class/drm/card0/error (gzipped)

[19184.993278] [drm] stuck on render ring
[19184.994564] [drm] GPU HANG: ecode 7:0:0x85dffffc, in chromium [6879], reason: Ring hung, action: reset
[19184.994569] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[19184.994571] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[19184.994573] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[19184.994575] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[19184.994577] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[19184.996821] drm/i915: Resetting chip after gpu hang

Linux 4.1.8
mesa 11.0.2
chromium 45.0.2454.101

00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 09)
Intel HD 5000 on Mac Mini late 2014
Comment 1 Alexander Mezin 2015-10-02 20:54:29 UTC
Created attachment 118640
dmesg
Comment 2 slavko.glamocanin 2015-10-11 07:23:52 UTC
Same here, but I am not allowed to attach the crash dump.
Comment 3 Pas 2016-03-20 23:52:41 UTC
Hello!

I also experienced lockups with chromium, but sometimes with kwin too. (But I guess it can happen to any process that uses X.)

márc 20 01:41:18 stranger kernel: [drm] stuck on render ring
márc 20 01:41:18 stranger kernel: [drm] GPU HANG: ecode 7:0:0x85dffff8, in Xorg [1145], reason: Ring hung, action: reset
márc 20 01:41:18 stranger kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
márc 20 01:41:18 stranger kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
márc 20 01:41:18 stranger kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
márc 20 01:41:18 stranger kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
márc 20 01:41:18 stranger kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
márc 20 01:41:18 stranger kernel: ------------[ cut here ]------------
márc 20 01:41:18 stranger kernel: WARNING: CPU: 0 PID: 24944 at /build/linux-chsvUo/linux-4.4.0/drivers/gpu/drm/i915/intel_display.c:11287 intel_mmio_flip_work_func+0x38e/0x3d0 [i915]()
márc 20 01:41:18 stranger kernel: WARN_ON(__i915_wait_request(mmio_flip->req, mmio_flip->crtc->reset_counter, false, NULL, &mmio_flip->i915->rps.mmioflips))
márc 20 01:41:18 stranger kernel: Modules linked in:
márc 20 01:41:18 stranger kernel:  ctr ccm rfcomm pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp ebtable_filter ebtables ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE nf_nat_masquerade
márc 20 01:41:18 stranger systemd-udevd[433]: seq 3915 queued, 'change' 'drm'
márc 20 01:41:18 stranger kernel:  iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables x_tables nf_nat nf_conntrack br_netfilter bridge stp llc overlay bnep binfmt_misc arc4 btusb btrtl btbcm btintel bluetooth pn544_mei mei_phy pn
márc 20 01:41:18 stranger kernel:  snd_seq 8250_fintek snd_seq_device snd_timer dw_dmac dell_smo8800 dw_dmac_core snd snd_soc_sst_acpi elan_i2c soundcore 8250_dw i2c_designware_platform i2c_designware_core dell_rbtn spi_pxa2xx_platform mac_hid kvm_intel
márc 20 01:41:18 stranger systemd-udevd[433]: Validate module index
márc 20 01:41:18 stranger kernel:  kvm
márc 20 01:41:18 stranger kernel:  irqbypass nfsd auth_rpcgss parport_pc nfs_acl lockd ppdev grace sunrpc lp parport autofs4 jitterentropy_rng drbg ansi_cprng algif_skcipher af_alg dm_crypt crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 i915 lrw gf128mul glue_helper
márc 20 01:41:18 stranger systemd-udevd[433]: Check if link configuration needs reloading.
márc 20 01:41:18 stranger kernel:  fb_sys_fops
márc 20 01:41:18 stranger kernel:  drm ptp pps_core wmi sdhci_acpi video sdhci i2c_hid hid fjes
márc 20 01:41:18 stranger kernel: CPU: 0 PID: 24944 Comm: kworker/0:3 Tainted: G           OE   4.4.0-12-generic #28-Ubuntu
márc 20 01:41:18 stranger kernel: Hardware name: Dell Inc. Latitude E7440/0RYCC9, BIOS A09 05/01/2014
márc 20 01:41:18 stranger kernel: Workqueue: events intel_mmio_flip_work_func [i915]
márc 20 01:41:18 stranger kernel:  0000000000000286 000000006f744324 ffff880179093d20 ffffffff813e1ec3
márc 20 01:41:18 stranger kernel:  ffff880179093d68 ffffffffc02daa50 ffff880179093d58 ffffffff8107fe12
márc 20 01:41:18 stranger kernel:  ffff8801c6ef3800 ffff88021ea16540 ffff88021ea1ae00 0000000000000000
márc 20 01:41:18 stranger kernel: Call Trace:
márc 20 01:41:18 stranger kernel:  [<ffffffff813e1ec3>] dump_stack+0x63/0x90
márc 20 01:41:18 stranger kernel:  [<ffffffff8107fe12>] warn_slowpath_common+0x82/0xc0
márc 20 01:41:18 stranger kernel:  [<ffffffff8107feac>] warn_slowpath_fmt+0x5c/0x80
márc 20 01:41:18 stranger kernel:  [<ffffffff8101666c>] ? __switch_to+0x1dc/0x5a0
márc 20 01:41:18 stranger kernel:  [<ffffffffc0273c8e>] intel_mmio_flip_work_func+0x38e/0x3d0 [i915]
márc 20 01:41:18 stranger kernel:  [<ffffffff81098eb2>] process_one_work+0x162/0x480
márc 20 01:41:18 stranger kernel:  [<ffffffff8109921b>] worker_thread+0x4b/0x4c0
márc 20 01:41:18 stranger kernel:  [<ffffffff810991d0>] ? process_one_work+0x480/0x480
márc 20 01:41:18 stranger kernel:  [<ffffffff810991d0>] ? process_one_work+0x480/0x480
márc 20 01:41:18 stranger kernel:  [<ffffffff8109f3e8>] kthread+0xd8/0xf0
márc 20 01:41:18 stranger kernel:  [<ffffffff8109f310>] ? kthread_create_on_node+0x1e0/0x1e0
márc 20 01:41:18 stranger kernel:  [<ffffffff8181cbcf>] ret_from_fork+0x3f/0x70
márc 20 01:41:18 stranger kernel:  [<ffffffff8109f310>] ? kthread_create_on_node+0x1e0/0x1e0
márc 20 01:41:18 stranger kernel: ---[ end trace 5463c319c64331d4 ]---



Grepping the log for "reset" (case-insensitive) shows more hangs:

márc 20 19:15:23 stranger kernel: [drm] GPU HANG: ecode 7:0:0x85dffff8, in kscreenlocker_g [29226], reason: Ring hung, action: reset
márc 20 19:15:23 stranger kernel: WARN_ON(__i915_wait_request(mmio_flip->req, mmio_flip->crtc->reset_counter, false, NULL, &mmio_flip->i915->rps.mmioflips))
márc 20 19:15:23 stranger kernel: drm/i915: Resetting chip after gpu hang
márc 20 19:15:29 stranger kernel: [drm] GPU HANG: ecode 7:0:0x85dffff8, in kscreenlocker_g [29226], reason: Ring hung, action: reset
márc 20 19:15:29 stranger kernel: drm/i915: Resetting chip after gpu hang
márc 21 00:29:57 stranger kernel: [drm] GPU HANG: ecode 7:0:0x85dffff8, in krunner [25477], reason: Ring hung, action: reset
márc 21 00:29:57 stranger kernel: drm/i915: Resetting chip after gpu hang
márc 21 00:30:03 stranger kernel: [drm] GPU HANG: ecode 7:0:0x85dffff8, in krunner [25477], reason: Ring hung, action: reset
márc 21 00:30:03 stranger kernel: drm/i915: Resetting chip after gpu hang
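
The filtered lines above can be tallied per process with a short script. This is only a sketch: it assumes the message layout seen in this report ("GPU HANG: ecode A:B:0xHASH, in NAME [PID], reason: ..., action: ..."), and the regular expression and the `count_hangs` helper are illustrative names, not part of any i915 tooling. The syslog prefix is dropped from the sample lines for brevity.

```python
import re
from collections import Counter

# Pattern for the hang message as it appears in this report (assumed layout).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\d+:\d+:0x[0-9a-fA-F]+), "
    r"in (?P<comm>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)

def count_hangs(lines):
    """Count hang messages per (process name, ecode) pair."""
    counts = Counter()
    for line in lines:
        match = HANG_RE.search(line)
        if match:
            counts[(match.group("comm"), match.group("ecode"))] += 1
    return counts

# Sample lines taken from the grep output above (timestamps removed).
SAMPLE = [
    "[drm] GPU HANG: ecode 7:0:0x85dffff8, in kscreenlocker_g [29226], reason: Ring hung, action: reset",
    "drm/i915: Resetting chip after gpu hang",
    "[drm] GPU HANG: ecode 7:0:0x85dffff8, in kscreenlocker_g [29226], reason: Ring hung, action: reset",
    "[drm] GPU HANG: ecode 7:0:0x85dffff8, in krunner [25477], reason: Ring hung, action: reset",
    "[drm] GPU HANG: ecode 7:0:0x85dffff8, in krunner [25477], reason: Ring hung, action: reset",
]

for (comm, ecode), n in count_hangs(SAMPLE).items():
    print(f"{comm}: {n} hang(s) with ecode {ecode}")
```

Grouping by ecode as well as process name makes it easier to see that every hang in this log carries the same code (0x85dffff8), regardless of which client was running.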



Linux stranger 4.4.0-12-generic #28-Ubuntu SMP Wed Mar 9 00:33:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

libdrm-intel1:amd64                                         2.4.67+git1603030630.ea07de~gd~w 


00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)

Dell Latitude E7440


https://transfer.sh/11YN0n/nqd3824qr4-card0.error.gz
Comment 4 yann 2016-09-22 08:43:51 UTC
We seem to have neglected the bug a bit, apologies.

Improvements have since been pushed to the kernel and Mesa that should benefit your system, so please re-test with the latest kernel and Mesa to see whether this issue still occurs.

In the meantime, I am assigning this bug to the Mesa product (please let me know if I am mistaken about this GPU hang).

Kernel: 4.1.8-1-lts
Platform: Haswell ULT (pci id: 0x0a26)
Mesa: 11.0.2

According to this error dump, the hang occurs in a render ring batch with the active head at 0x0070d700 and 0x7a000003 (PIPE_CONTROL) as IPEHR.

Also note:
ERROR: 0x00000105
    TLB page fault error (GTT entry not valid)
    Invalid page directory entry error
    Cacheline containing a PD was marked as invalid
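
The value 0x00000105 has three bits set, which matches the three decoded messages listed above. A minimal sketch for listing the set bits follows; pairing bits 0, 2 and 8 with the messages in the order printed is an assumption, not something the dump states, and `set_bits` is a hypothetical helper name.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if value >> bit & 1]

error_reg = 0x00000105  # ERROR register value quoted from the dump
print(set_bits(error_reg))
```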


Batch extract (around 0x0070d700):

0x0070d6d0:      0x7b000005: 3DPRIMITIVE:
0x0070d6d4:      0x0000000f:    rect list sequential
0x0070d6d8:      0x00000003:    vertex count
0x0070d6dc:      0x00000000:    start vertex
0x0070d6e0:      0x00000001:    instance count
0x0070d6e4:      0x00000000:    start instance
0x0070d6e8:      0x00000000:    index bias
0x0070d6ec:      0x7a000003: PIPE_CONTROL
0x0070d6f0:      0x00101c11:    no write, cs stall, render target cache flush, instruction cache invalidate, texture cache invalidate, vf fetch invalidate, depth cache flush,
0x0070d6f4:      0x00000000:    destination address
0x0070d6f8:      0x00000000:    immediate dword low
0x0070d6fc:      0x00000000:    immediate dword high
0x0070d700:      0x78210000: 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP
0x0070d704:      0x00006f40:    pointer to SF_CLIP viewport
0x0070d708:      0x78240000: 3DSTATE_BLEND_STATE_POINTERS
0x0070d70c:      0x00006f01:    pointer to BLEND_STATE at 0x00006f00 (changed)
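
The command boundaries in the extract above can be cross-checked by walking the header dwords. This sketch assumes the usual layout of Intel 3D command headers (bits 31:29 type, 28:27 subtype, 26:24 opcode, 23:16 sub-opcode, 7:0 length stored as total dwords minus 2); that layout is my assumption here, not something stated in the dump, and `decode_header` and `walk` are hypothetical helper names.

```python
# Dwords copied from the batch extract above, starting at 0x0070d6d0.
BASE = 0x0070D6D0
DWORDS = [
    0x7B000005, 0x0000000F, 0x00000003, 0x00000000,
    0x00000001, 0x00000000, 0x00000000,
    0x7A000003, 0x00101C11, 0x00000000, 0x00000000, 0x00000000,
    0x78210000, 0x00006F40,
    0x78240000, 0x00006F01,
]

def decode_header(dw):
    """Split a command header dword into its fields (assumed layout)."""
    return {
        "type": dw >> 29 & 0x7,
        "subtype": dw >> 27 & 0x3,
        "opcode": dw >> 24 & 0x7,
        "subopcode": dw >> 16 & 0xFF,
        "total_dwords": (dw & 0xFF) + 2,  # length field is biased by 2
    }

def walk(base, dwords):
    """Return (address, header, total_dwords) for each command in the batch."""
    commands = []
    i = 0
    while i < len(dwords):
        header = decode_header(dwords[i])
        commands.append((base + 4 * i, dwords[i], header["total_dwords"]))
        i += header["total_dwords"]
    return commands

for addr, header, n in walk(BASE, DWORDS):
    print(f"0x{addr:08x}: 0x{header:08x} ({n} dwords)")
```

Under these assumptions the walk lands on the same four command addresses as the decoder output (0x0070d6d0, 0x0070d6ec, 0x0070d700, 0x0070d708). The PIPE_CONTROL occupies 0x0070d6ec through 0x0070d6fc, so the active head at 0x0070d700 points at the command immediately after it, which is consistent with IPEHR still holding 0x7a000003.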
Comment 5 yann 2016-11-04 15:45:15 UTC
Please test a new version of Mesa (12 or 13) and mark as REOPENED
if you can reproduce and RESOLVED/* if you cannot reproduce.
Comment 6 Annie 2017-02-10 22:38:35 UTC
Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

