Bug 110528 - [CI][DRMTIP] igt@gem_userptr_blits@coherency-sync - dmesg-warn - WARNING: possible circular locking dependency detected
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-26 08:53 UTC by Lakshmi
Modified: 2019-07-31 11:41 UTC

See Also:
i915 platform: PNV
i915 features: GEM/Other



Description Lakshmi 2019-04-26 08:53:34 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_261/fi-pnv-d510/igt@gem_userptr_blits@coherency-sync.html

	
<6> [138.626463] Console: switching to colour dummy device 80x25
<6> [138.626818] [IGT] gem_userptr_blits: executing
<6> [138.696844] [IGT] gem_userptr_blits: starting subtest coherency-sync
<6> [138.698192] gem_userptr_bli (2136): drop_caches: 4
<4> [162.324467] 
<4> [162.324494] ======================================================
<4> [162.324509] WARNING: possible circular locking dependency detected
<4> [162.324527] 5.1.0-rc5-g9b6a59cae931-drmtip_261+ #1 Tainted: G     U           
<4> [162.324543] ------------------------------------------------------
<4> [162.324557] kswapd0/50 is trying to acquire lock:
<4> [162.324572] 00000000cdcc63cb (&dev->struct_mutex/1){+.+.}, at: userptr_mn_invalidate_range_start+0x173/0x270 [i915]
<4> [162.324779] 
but task is already holding lock:
<4> [162.324794] 000000005d3bddaa (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe6/0x2a0
<4> [162.324822] 
which lock already depends on the new lock.

<4> [162.324839] 
the existing dependency chain (in reverse order) is:
<4> [162.324854] 
-> #2 (&anon_vma->rwsem){++++}:
<4> [162.324875]        down_write+0x33/0x60
<4> [162.324888]        __vma_adjust+0x390/0x6c0
<4> [162.324904]        __split_vma+0x16a/0x180
<4> [162.324918]        mprotect_fixup+0x2a5/0x320
<4> [162.324932]        do_mprotect_pkey+0x208/0x2e0
<4> [162.324947]        __x64_sys_mprotect+0x16/0x20
<4> [162.324962]        do_syscall_64+0x55/0x190
<4> [162.324977]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [162.324991] 
-> #1 (&mapping->i_mmap_rwsem){++++}:
<4> [162.325012]        down_write+0x33/0x60
<4> [162.325027]        unmap_mapping_pages+0x48/0x130
<4> [162.325215]        i915_vma_revoke_mmap+0x7e/0x1c0 [i915]
<4> [162.325237]        i915_vma_unbind+0xbb/0x550 [i915]
<4> [162.325237]        i915_gem_object_unbind+0xfa/0x190 [i915]
<4> [162.325237]        i915_gem_shrink+0x2dc/0x590 [i915]
<4> [162.325237]        i915_gem_shrink_all+0x2c/0x50 [i915]
<4> [162.325237]        i915_drop_caches_set+0x1b6/0x270 [i915]
<4> [162.325237]        simple_attr_write+0xb0/0xd0
<4> [162.325237]        full_proxy_write+0x51/0x80
<4> [162.325237]        vfs_write+0xbd/0x1b0
<4> [162.325237]        ksys_write+0x55/0xe0
<4> [162.325237]        do_syscall_64+0x55/0x190
<4> [162.325237]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [162.325237] 
-> #0 (&dev->struct_mutex/1){+.+.}:
<4> [162.325237]        lock_acquire+0xa6/0x1c0
<4> [162.325237]        __mutex_lock+0x8c/0x960
<4> [162.325237]        userptr_mn_invalidate_range_start+0x173/0x270 [i915]
<4> [162.325237]        __mmu_notifier_invalidate_range_start+0x84/0x110
<4> [162.325237]        try_to_unmap_one+0x747/0x840
<4> [162.325237]        rmap_walk_anon+0x104/0x280
<4> [162.325237]        try_to_unmap+0xc0/0xf0
<4> [162.325237]        shrink_page_list+0x5ce/0xcb0
<4> [162.325237]        shrink_inactive_list+0x331/0x710
<4> [162.325237]        shrink_node_memcg+0x37b/0x770
<4> [162.325237]        shrink_node+0xc9/0x460
<4> [162.325237]        balance_pgdat+0x239/0x580
<4> [162.325237]        kswapd+0x186/0x570
<4> [162.325237]        kthread+0x119/0x130
<4> [162.325237]        ret_from_fork+0x24/0x50
<4> [162.325237] 
other info that might help us debug this:

<4> [162.325237] Chain exists of:
  &dev->struct_mutex/1 --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem

<4> [162.325237]  Possible unsafe locking scenario:

<4> [162.325237]        CPU0                    CPU1
<4> [162.325237]        ----                    ----
<4> [162.325237]   lock(&anon_vma->rwsem);
<4> [162.325237]                                lock(&mapping->i_mmap_rwsem);
<4> [162.325237]                                lock(&anon_vma->rwsem);
<4> [162.325237]   lock(&dev->struct_mutex/1);
<4> [162.325237] 
 *** DEADLOCK ***
Comment 1 CI Bug Log 2019-04-26 08:55:56 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* PNV:  igt@gem_userptr_blits@coherency-sync - dmesg-warn - WARNING: possible circular locking dependency detected
  (No new failures associated)
Comment 2 Chris Wilson 2019-04-26 09:46:08 UTC
i915_vma_revoke_mmap() removes the user GGTT mmap and so should not call back (and lock) via the userptr mmu-notifier. However, since we have used the same lock class, lockdep thinks it might. Quick and dirty fix: give userptr its own struct_mutex subclass. I wither at just the thought of Tvrtko's scrutiny over such a hack.
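
For readers unfamiliar with how a shared lock class produces this report, here is a minimal sketch in Python. It is a toy model, not lockdep's actual algorithm and not the i915 patch. It records "held -> acquired" edges between lock class names and reports a cycle when a new edge closes one. The `LockGraph` class, the `replay` helper, and the `struct_mutex/2` subclass name are hypothetical and used only for illustration; the three edges are taken from the dependency chain in the log above.

```python
from collections import defaultdict


class LockGraph:
    """Toy dependency graph keyed by lock class name (not real lockdep)."""

    def __init__(self):
        self.edges = defaultdict(set)  # held class -> classes taken under it

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        Returns the cycle as a list of class names if the new edge
        closes one, otherwise None.
        """
        path = self._path(new, held)
        self.edges[held].add(new)
        return path + [new] if path else None

    def _path(self, src, dst, seen=None):
        """Depth-first search for an existing chain src -> ... -> dst."""
        if seen is None:
            seen = set()
        if src == dst:
            return [src]
        seen.add(src)
        for nxt in self.edges[src]:
            if nxt not in seen:
                rest = self._path(nxt, dst, seen)
                if rest:
                    return [src] + rest
        return None


def replay(userptr_class):
    """Replay the three edges from the log; `userptr_class` is the class
    the userptr mmu-notifier uses when it takes struct_mutex."""
    g = LockGraph()
    g.acquire("i_mmap_rwsem", "anon_vma->rwsem")        # #2: mprotect path
    g.acquire("struct_mutex/1", "i_mmap_rwsem")         # #1: shrinker path
    return g.acquire("anon_vma->rwsem", userptr_class)  # #0: kswapd path


# Same class as the shrinker: the model reports a cycle.
print(replay("struct_mutex/1"))
# Hypothetical separate subclass for userptr: no cycle is reported.
print(replay("struct_mutex/2"))
```

Because the graph is keyed by class rather than by lock instance or call site, the model cannot tell that the shrinker path never re-enters the userptr notifier. Giving the notifier a distinct subclass removes the shared node, which is the effect the quick fix described above is after.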
Comment 3 Francesco Balestrieri 2019-07-30 04:50:17 UTC
Given that the bug was reported 3 months ago, and CI says "no occurrences", I'm closing this.

Chris, any chance that the recent changes around struct_mutex have fixed this?
Comment 4 Lakshmi 2019-07-31 11:41:31 UTC
No new failures have been associated with this bug since it was created 3 months ago.

Closing this bug as WORKSFORME.
Comment 5 CI Bug Log 2019-07-31 11:41:40 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.

