Bug 90841 - [gen3 4.0] mmap(wc) batch incoherence
Summary: [gen3 4.0] mmap(wc) batch incoherence
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 89334 90203 90726 90919 91140 taro 91329 91340 91581 92109 92130 92322 92686 92732 92879 92898 92938 92950 92964 93074 93432 93457 94327 94718 95461 98613
Depends on:
Blocks:
 
Reported: 2015-06-04 09:07 UTC by Xose Vazquez Perez
Modified: 2016-11-06 17:32 UTC
CC List: 28 users

See Also:
i915 platform: I945GM
i915 features: GEM/Other, GPU hang


Attachments
GPU crash dump (759.31 KB, text/plain)
2015-06-04 09:07 UTC, Xose Vazquez Perez

Description Xose Vazquez Perez 2015-06-04 09:07:49 UTC
Created attachment 116281
GPU crash dump

Fedora 22
kernel 4.0.4-303
xorg-x11-drv-intel 2.99.917-10.20150526

[11330.708072] [drm] stuck on render ring
[11330.709226] [drm] GPU HANG: ecode 3:0:0x74ffffc1, in systemd-logind [489], reason: Ring hung, action: reset
[11330.709229] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[11330.709231] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[11330.709233] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[11330.709235] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[11330.709238] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Comment 1 Chris Wilson 2015-06-09 21:41:47 UTC
*** Bug 90919 has been marked as a duplicate of this bug. ***
Comment 2 Chris Wilson 2015-06-09 21:55:05 UTC
If you do get the opportunity, testing a kernel built from

http://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=nightly

i.e. the nightly branch of git://people.freedesktop.org/~ickle/linux-2.6
would be very useful - though the base drm-intel-nightly has a few modesetting bugs atm :|
Comment 4 Jim Haynes 2015-06-20 03:32:36 UTC
I don't know how to get a copy of that kernel.
Comment 5 Jim Haynes 2015-06-20 03:33:55 UTC
I have another crash dump, but I guess if the bug is fixed you don't want it
and I just need to wait for the fix to filter down through Fedora.
Comment 6 Jim Haynes 2015-06-22 02:09:38 UTC
And, in Fedora 22, my screen corruption on 32 bit Dell SX280 is still
there after kernel-4.0.5-300.fc22 and xorg-x11-drv-intel-2.99.917-12.20150615.fc22
Comment 7 Jim Haynes 2015-06-30 00:38:48 UTC
My problem may be fixed with kernel 4.0.6-300.fc22
and xorg-x11-drv-intel-2.99.917-12.20150615.fc22.

At least I ran for a lot longer than usual without screen corruption after
these updates.
Comment 8 Chris Wilson 2015-07-03 07:43:21 UTC
*** Bug 91201 has been marked as a duplicate of this bug. ***
Comment 9 Chris Wilson 2015-07-14 07:27:57 UTC
*** Bug 91329 has been marked as a duplicate of this bug. ***
Comment 10 Chris Wilson 2015-07-15 06:47:20 UTC
*** Bug 91340 has been marked as a duplicate of this bug. ***
Comment 11 Chris Wilson 2015-07-29 21:05:57 UTC
*** Bug 91140 has been marked as a duplicate of this bug. ***
Comment 12 Chris Wilson 2015-07-29 21:10:00 UTC
*** Bug 90726 has been marked as a duplicate of this bug. ***
Comment 13 Chris Wilson 2015-08-07 20:22:42 UTC
*** Bug 91581 has been marked as a duplicate of this bug. ***
Comment 14 Chris Wilson 2015-12-30 21:43:18 UTC
*** Bug 92130 has been marked as a duplicate of this bug. ***
Comment 15 Chris Wilson 2015-12-30 21:43:45 UTC
*** Bug 92938 has been marked as a duplicate of this bug. ***
Comment 16 Chris Wilson 2015-12-30 21:43:51 UTC
*** Bug 92964 has been marked as a duplicate of this bug. ***
Comment 17 Chris Wilson 2015-12-30 21:43:59 UTC
*** Bug 92950 has been marked as a duplicate of this bug. ***
Comment 18 Chris Wilson 2015-12-30 21:44:07 UTC
*** Bug 92898 has been marked as a duplicate of this bug. ***
Comment 19 Chris Wilson 2015-12-30 21:44:38 UTC
*** Bug 92109 has been marked as a duplicate of this bug. ***
Comment 20 Chris Wilson 2015-12-30 21:44:53 UTC
*** Bug 92686 has been marked as a duplicate of this bug. ***
Comment 21 Chris Wilson 2015-12-30 21:45:12 UTC
*** Bug 92322 has been marked as a duplicate of this bug. ***
Comment 22 Chris Wilson 2015-12-31 08:28:37 UTC
*** Bug 93074 has been marked as a duplicate of this bug. ***
Comment 23 Chris Wilson 2015-12-31 09:23:21 UTC
*** Bug 93457 has been marked as a duplicate of this bug. ***
Comment 24 Chris Wilson 2015-12-31 09:24:22 UTC
*** Bug 93432 has been marked as a duplicate of this bug. ***
Comment 25 Chris Wilson 2015-12-31 09:25:14 UTC
*** Bug 92879 has been marked as a duplicate of this bug. ***
Comment 26 Laurent Bonnaud 2016-02-03 16:47:13 UTC
Hi,

I see that this bug title is marked "gen3" AKA IvyBridge.  I just got a similar error on a SandyBridge processor (i7-2640M).

Feb  3 15:02:48 vougeot kernel: [96115.293750] [drm] stuck on render ring
Feb  3 15:02:48 vougeot kernel: [96115.294058] [drm] GPU HANG: ecode 6:0:0xf288fff9, in Xorg [22791], reason: Ring hung, action: reset
Feb  3 15:02:48 vougeot kernel: [96115.294098] ------------[ cut here ]------------
Feb  3 15:02:48 vougeot kernel: [96115.294122] WARNING: CPU: 1 PID: 22791 at /home/kernel/COD/linux/drivers/gpu/drm/i915/intel_display.c:3332 intel_crtc_wait_for_pending_flips+0x1a0/0x240 [i915]()
Feb  3 15:02:48 vougeot kernel: [96115.294123] WARN_ON(ret)
Feb  3 15:02:48 vougeot kernel: [96115.294124] Modules linked in: nls_iso8859_1 uas usb_storage xt_CHECKSUM iptable_mangle xt_tcpudp rfcomm xt_addrtype xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c binfmt_misc bnep snd_hda_codec_hdmi jitterentropy_rng drbg ansi_cprng snd_hda_codec_idt dm_crypt snd_hda_codec_generic dell_wmi sparse_keymap dell_rbtn snd_hda_intel dell_laptop snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event dcdbas dell_smm_hwmon snd_rawmidi intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp iwlmvm crct10dif_pclmul snd_seq btusb mac80211 btrtl crc32_pclmul btbcm btintel bluetooth iwlwifi snd_seq_device aesni_intel snd_timer aes_x86_64 snd lrw gf128mul input_leds glue_helper cfg80211 ablk_helper cryptd joydev serio_raw mei_me shpchp mei soundcore lpc_ich 8250_fintek dell_smo8800 mac_hid kvm_intel kvm irqbypass tpm_rng parport_pc ppdev lp parport autofs4 btrfs xor raid6_pq hid_generic usbhid hid i915 i2c_algo_bit firewire_ohci psmouse drm_kms_helper syscopyarea sysfillrect sysimgblt firewire_core fb_sys_fops e1000e crc_itu_t ahci drm sdhci_pci libahci sdhci ptp pps_core wmi fjes video
Feb  3 15:02:48 vougeot kernel: [96115.294178] CPU: 1 PID: 22791 Comm: Xorg Not tainted 4.4.1-040401-generic #201601311534
Feb  3 15:02:48 vougeot kernel: [96115.294179] Hardware name: Dell Inc. Latitude E6520/0NVF5K, BIOS A19 11/14/2013
Feb  3 15:02:48 vougeot kernel: [96115.294180]  0000000000000000 00000000593fc8c3 ffff8801dfc3baa0 ffffffff813c8e14
Feb  3 15:02:48 vougeot kernel: [96115.294182]  ffff8801dfc3bae8 ffff8801dfc3bad8 ffffffff8107dba2 ffff8800359d0000
Feb  3 15:02:48 vougeot kernel: [96115.294183]  0000000000000001 ffff88021f56d000 ffff880035867060 0000000000000001
Feb  3 15:02:48 vougeot kernel: [96115.294185] Call Trace:
Feb  3 15:02:48 vougeot kernel: [96115.294190]  [<ffffffff813c8e14>] dump_stack+0x44/0x60
Feb  3 15:02:48 vougeot kernel: [96115.294192]  [<ffffffff8107dba2>] warn_slowpath_common+0x82/0xc0
Feb  3 15:02:48 vougeot kernel: [96115.294194]  [<ffffffff8107dc3c>] warn_slowpath_fmt+0x5c/0x80
Feb  3 15:02:48 vougeot kernel: [96115.294206]  [<ffffffffc022257d>] ? i915_gem_object_wait_rendering+0x6d/0xc0 [i915]
Feb  3 15:02:48 vougeot kernel: [96115.294223]  [<ffffffffc0262360>] intel_crtc_wait_for_pending_flips+0x1a0/0x240 [i915]
Feb  3 15:02:48 vougeot kernel: [96115.294239]  [<ffffffffc0263611>] intel_pre_plane_update+0x111/0x140 [i915]
Feb  3 15:02:48 vougeot kernel: [96115.294254]  [<ffffffffc0263de2>] intel_atomic_commit+0x352/0x6f0 [i915]
Feb  3 15:02:48 vougeot kernel: [96115.294271]  [<ffffffffc008651e>] ? drm_atomic_check_only+0x18e/0x590 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294283]  [<ffffffffc0086957>] drm_atomic_commit+0x37/0x60 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294292]  [<ffffffffc0183cb9>] drm_atomic_helper_disable_plane+0xa9/0xf0 [drm_kms_helper]
Feb  3 15:02:48 vougeot kernel: [96115.294302]  [<ffffffffc008509e>] ? drm_modeset_lock+0x4e/0xd0 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294310]  [<ffffffffc0075f39>] __setplane_internal+0x169/0x250 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294319]  [<ffffffffc00851b0>] ? drm_modeset_lock_all_crtcs+0x90/0xa0 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294329]  [<ffffffffc0079c36>] drm_mode_setplane+0x136/0x1b0 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294336]  [<ffffffffc006b722>] drm_ioctl+0x152/0x540 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294347]  [<ffffffffc0079b00>] ? drm_plane_check_pixel_format+0x50/0x50 [drm]
Feb  3 15:02:48 vougeot kernel: [96115.294354]  [<ffffffff812196e8>] do_vfs_ioctl+0x298/0x480
Feb  3 15:02:48 vougeot kernel: [96115.294358]  [<ffffffff81208871>] ? __sb_end_write+0x21/0x30
Feb  3 15:02:48 vougeot kernel: [96115.294361]  [<ffffffff812064dd>] ? vfs_write+0x15d/0x1a0
Feb  3 15:02:48 vougeot kernel: [96115.294365]  [<ffffffff81219949>] SyS_ioctl+0x79/0x90
Feb  3 15:02:48 vougeot kernel: [96115.294368]  [<ffffffff817fdbb6>] entry_SYSCALL_64_fastpath+0x16/0x75
Feb  3 15:02:48 vougeot kernel: [96115.294370] ---[ end trace 74eb8c71821ad9ca ]---
Comment 27 Chris Wilson 2016-02-03 16:51:55 UTC
(In reply to Laurent Bonnaud from comment #26)
> Hi,
> 
> I see that this bug title is marked "gen3" AKA IvyBridge.  I just got a
> similar error on a SandyBridge processor (i7-2640M).

Nope, gen3 here means the GPU generation. Sandybridge is gen6; gen3 is a good few years older. It is impossible to deduce from the dmesg alone what the hang is; you need to inspect the error state as well.
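
For anyone triaging similar reports, here is a minimal sketch for pulling the fields out of a "GPU HANG" dmesg line. It assumes the first number in the ecode is the GPU generation, which is consistent with the two reports in this bug (3 for the I945GM, 6 for the Sandybridge). The function name and the field labels (gen, ring, hash) are my own, not official names. As noted above, this only tells you which generation hung, not why; the error state is still needed for that.

```python
import re

# Field labels (gen, ring, hash, ...) are illustrative, not official names.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<ring>\d+):(?P<hash>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG dmesg line as a dict, or None."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("gen", "ring", "pid"):
        info[key] = int(info[key])
    return info


# The two hang lines quoted in this bug report.
reports = [
    "[11330.709226] [drm] GPU HANG: ecode 3:0:0x74ffffc1, "
    "in systemd-logind [489], reason: Ring hung, action: reset",
    "[96115.294058] [drm] GPU HANG: ecode 6:0:0xf288fff9, "
    "in Xorg [22791], reason: Ring hung, action: reset",
]

if __name__ == "__main__":
    for line in reports:
        info = parse_gpu_hang(line)
        print(info["gen"], info["comm"], info["hash"])
```

A differing first field, as in the two lines above, is a quick hint that two reports come from different hardware generations and are unlikely to be the same bug.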
Comment 28 Chris Wilson 2016-03-02 16:09:11 UTC
*** Bug 94327 has been marked as a duplicate of this bug. ***
Comment 29 Chris Wilson 2016-08-19 09:23:40 UTC
*** Bug 90203 has been marked as a duplicate of this bug. ***
Comment 30 Chris Wilson 2016-08-19 09:23:48 UTC
*** Bug 92732 has been marked as a duplicate of this bug. ***
Comment 31 Chris Wilson 2016-08-19 09:23:58 UTC
*** Bug 94718 has been marked as a duplicate of this bug. ***
Comment 32 Chris Wilson 2016-08-19 09:24:04 UTC
*** Bug 95461 has been marked as a duplicate of this bug. ***
Comment 33 Chris Wilson 2016-08-19 09:24:11 UTC
*** Bug 89334 has been marked as a duplicate of this bug. ***
Comment 34 Chris Wilson 2016-08-19 09:25:01 UTC
commit 600f436801deae65e48404847b61c89b4944e355
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Aug 18 17:16:40 2016 +0100

    drm/i915: Unconditionally flush any chipset buffers before execbuf
    
    If userspace is asynchronously streaming into the batch or other
    execobjects, we may not flush those writes along with a change in cache
    domain (as there is no change). Therefore those writes may end up in
    internal chipset buffers and not visible to the GPU upon execution. We
    must issue a flush command or otherwise we encounter incoherency in the
    batchbuffers and the GPU executing invalid commands (i.e. hanging) quite
    regularly.
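
The failure mode described in the commit message can be illustrated with a toy model. This is only a sketch and not the kernel code: the class ToyChipset, the functions execbuf and demo, and the flags domain_changed and always_flush are all hypothetical names made up for the illustration. It assumes CPU writes sit in a chipset buffer until a flush, and compares flushing only on a cache domain change against flushing before every execbuf.

```python
class ToyChipset:
    """Toy model: CPU writes sit in a chipset buffer until flushed."""

    def __init__(self):
        self.pending = {}      # writes the GPU cannot see yet
        self.gpu_visible = {}  # what the GPU would read

    def cpu_write(self, offset, value):
        self.pending[offset] = value

    def flush(self):
        self.gpu_visible.update(self.pending)
        self.pending.clear()


def execbuf(chipset, domain_changed, always_flush):
    """Return the batch contents as the GPU would see them at execution."""
    if always_flush or domain_changed:
        chipset.flush()
    return dict(chipset.gpu_visible)


def demo(always_flush):
    chipset = ToyChipset()
    chipset.cpu_write(0, "MI_NOOP")
    # First submission changes cache domain, so both policies flush here.
    execbuf(chipset, domain_changed=True, always_flush=always_flush)
    # Userspace keeps streaming into the same mapping: no domain change.
    chipset.cpu_write(1, "MI_BATCH_BUFFER_END")
    return execbuf(chipset, domain_changed=False, always_flush=always_flush)


if __name__ == "__main__":
    print("flush on domain change only:", demo(always_flush=False))
    print("flush before every execbuf: ", demo(always_flush=True))
```

With the conditional policy the second write never reaches the GPU-visible copy, so the toy batch is missing its terminator, which corresponds to the GPU executing invalid commands. With the unconditional flush both writes are visible.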
Comment 35 Chris Wilson 2016-11-06 17:32:20 UTC
*** Bug 98613 has been marked as a duplicate of this bug. ***

