Bug 110937

Summary: [CI][DRMTIP] igt@gem_persistent_relocs@forked-(interruptible-)?faulting-reloc-thrash-inactive - incomplete - list_del corruption
Product: DRI
Component: DRM/Intel
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Version: DRI git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: HSW, I965G
i915 features: GEM/Other
Reporter: Lakshmi <lakshminarayana.vudum>
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: intel-gfx-bugs

Description Lakshmi 2019-06-18 06:10:08 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_310/fi-hsw-peppy/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html

<6> [28.627596] Console: switching to colour dummy device 80x25
<6> [28.627716] [IGT] gem_persistent_relocs: executing
<6> [28.666769] [IGT] gem_persistent_relocs: starting subtest forked-interruptible-faulting-reloc-thrash-inactive
<5> [28.711581] Setting dangerous option prefault_disable - tainting kernel
<5> [28.717996] Setting dangerous option prefault_disable - tainting kernel
<4> [31.135533] ------------[ cut here ]------------
<4> [31.135920] list_del corruption, ffff90da25aaa1b0->next is LIST_POISON1 (dead000000000100)
<4> [31.135964] WARNING: CPU: 0 PID: 901 at lib/list_debug.c:47 __list_del_entry_valid+0x4e/0x90
<4> [31.135969] Modules linked in: snd_hda_codec_hdmi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915 lpc_ich snd_hda_codec_realtek snd_hda_codec_generic cdc_ether snd_hda_intel usbnet snd_hda_codec r8152 snd_hwdep mii snd_hda_core snd_pcm mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc
<4> [31.136024] CPU: 0 PID: 901 Comm: gem_persistent_ Tainted: G     U            5.2.0-rc4-g39e3d39be374-drmtip_310+ #1
<4> [31.136029] Hardware name: GOOGLE Peppy/Peppy, BIOS MrChromebox 02/04/2018
<4> [31.136038] RIP: 0010:__list_del_entry_valid+0x4e/0x90
<4> [31.136049] Code: 2e 48 8b 32 48 39 fe 75 3a 48 8b 50 08 48 39 f2 75 48 b8 01 00 00 00 c3 48 89 fe 48 89 c2 48 c7 c7 b8 32 0a ab e8 f2 af bd ff <0f> 0b 31 c0 c3 48 89 fe 48 c7 c7 f0 32 0a ab e8 de af bd ff 0f 0b
<4> [31.136056] RSP: 0018:ffffb0bf80cb7c98 EFLAGS: 00010082
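
For context on the warning above: list_del() overwrites the removed entry's pointers with poison values, and with list debugging enabled a later delete of the same entry is caught because ->next still holds LIST_POISON1. The following is a minimal userspace sketch of that check, not the kernel's code. It assumes a 64-bit build, takes the LIST_POISON1 value from the log line above, and uses an illustrative value for LIST_POISON2; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* LIST_POISON1 matches the value printed in the log; LIST_POISON2 is an assumed value. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000200ULL)

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Returns false, without touching the list, if the entry looks already deleted. */
static bool list_del_entry_valid(const struct list_head *entry)
{
    if (entry->next == LIST_POISON1 || entry->prev == LIST_POISON2) {
        fprintf(stderr, "list_del corruption, %p is poisoned\n", (void *)entry);
        return false;
    }
    /* Neighbours must still point back at the entry. */
    return entry->prev->next == entry && entry->next->prev == entry;
}

static bool list_del(struct list_head *entry)
{
    if (!list_del_entry_valid(entry))
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1; /* mark the entry as removed */
    entry->prev = LIST_POISON2;
    return true;
}

/* Deletes the same entry twice; returns true if only the first delete succeeded. */
static bool demo_double_delete(void)
{
    struct list_head head, node;

    list_init(&head);
    list_add(&node, &head);
    bool first = list_del(&node);
    assert(head.next == &head); /* list is empty again */
    bool second = list_del(&node);
    return first && !second;
}
```

In this sketch the second delete is rejected rather than corrupting the list, which is the same class of event the WARNING in the log reports.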
Comment 3 Chris Wilson 2019-06-18 17:17:11 UTC
commit 0bd6cb6b58f7332c61cef2e4ae48db1ca9910b6b (drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jun 18 08:41:29 2019 +0100

    drm/i915: Skip shrinking already freed pages
    
    Previously, we wanted to shrink the pages of freed objects before they
    were finally RCU collected. However, by removing the struct_mutex
    serialisation around the active reference, we need to acquire an extra
    reference around the wait. Unfortunately this means that we have to skip
    objects that are waiting RCU collection.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110937
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-2-chris@chris-wilson.co.uk

Pretty sure that's the one.
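
The commit message describes taking an extra reference around the wait and skipping objects that are already waiting for RCU collection. A rough userspace sketch of that pattern follows, in the spirit of the kernel's kref_get_unless_zero(). It is not the i915 code: struct obj, obj_get_unless_zero(), obj_put() and shrink_objects() are hypothetical names, and C11 atomics stand in for the kernel's refcounting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object: a refcount of 0 means "freed, awaiting collection". */
struct obj {
    atomic_int refcount;
    size_t pages;
};

/* Take a reference only if the object is still alive. */
static bool obj_get_unless_zero(struct obj *o)
{
    int old = atomic_load(&o->refcount);

    while (old != 0) {
        /* On failure, old is reloaded with the current count and we retry. */
        if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
            return true;
    }
    return false;
}

static void obj_put(struct obj *o)
{
    atomic_fetch_sub(&o->refcount, 1);
}

/* Release pages of live objects only; returns the number of pages released. */
static size_t shrink_objects(struct obj **objs, size_t count)
{
    size_t freed = 0;

    for (size_t i = 0; i < count; i++) {
        struct obj *o = objs[i];

        if (!obj_get_unless_zero(o))
            continue; /* already freed: skip it */

        /* The extra reference keeps the object alive while we work on it. */
        freed += o->pages;
        o->pages = 0;
        obj_put(o);
    }
    return freed;
}

/* One live and one already-freed object; only the live one is shrunk. */
static size_t demo_shrink(void)
{
    struct obj live, dead;
    struct obj *objs[] = { &live, &dead };

    atomic_init(&live.refcount, 1);
    live.pages = 4;
    atomic_init(&dead.refcount, 0);
    dead.pages = 8;

    size_t freed = shrink_objects(objs, 2);
    assert(dead.pages == 8); /* the freed object was left untouched */
    assert(atomic_load(&live.refcount) == 1);
    return freed;
}
```

The trade-off is the one the commit message names: pages belonging to an object that has dropped to zero references are no longer reclaimed by the shrinker and are left for the final free.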
Comment 4 CI Bug Log 2019-06-25 08:35:00 UTC
A CI Bug Log filter associated to this bug has been updated:

{- BWR HSW: igt@gem_persistent_relocs@forked-(interruptible-)?faulting-reloc-thrash-inactive - incomplete - list_del corruption -}
{+ BWR HSW: igt@gem_persistent_relocs@forked-(interruptible-)?faulting-reloc-thrash-inactive - incomplete - list_del corruption +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_311/fi-bwr-2160/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
