Bug 110913 - [CI][SHARDS] Igt@* - dmesg-warn - WARNING: possible circular locking dependency detected
Summary: [CI][SHARDS] Igt@* - dmesg-warn - WARNING: possible circular locking dependen...
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other
OS: All
Priority: high
Severity: normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 110927
Depends on:
Blocks:
 
Reported: 2019-06-13 12:17 UTC by Lakshmi
Modified: 2019-07-02 11:02 UTC
CC: 2 users

See Also:
i915 platform: ALL
i915 features: GEM/Other


Description Lakshmi 2019-06-13 12:17:43 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6249/re-cml-u/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html

<6> [194.764268] Console: switching to colour dummy device 80x25
<6> [194.764311] [IGT] gem_persistent_relocs: executing
<6> [194.774481] [IGT] gem_persistent_relocs: starting subtest forked-faulting-reloc-thrashing
<5> [194.779443] Setting dangerous option prefault_disable - tainting kernel
<5> [194.781849] Setting dangerous option prefault_disable - tainting kernel
<5> [194.784406] Setting dangerous option prefault_disable - tainting kernel
<5> [194.788362] Setting dangerous option prefault_disable - tainting kernel
<7> [194.790251] [drm:intel_power_well_enable [i915]] enabling DC off
<7> [194.790289] [drm:gen9_set_dc_state [i915]] Setting DC state from 02 to 00
<5> [194.792327] Setting dangerous option prefault_disable - tainting kernel
<5> [194.796117] Setting dangerous option prefault_disable - tainting kernel
<5> [194.800471] Setting dangerous option prefault_disable - tainting kernel
<5> [194.802502] Setting dangerous option prefault_disable - tainting kernel
<4> [195.305572] 
<4> [195.305577] ======================================================
<4> [195.305580] WARNING: possible circular locking dependency detected
<4> [195.305584] 5.2.0-rc4-CI-CI_DRM_6249+ #1 Tainted: G     U           
<4> [195.305587] ------------------------------------------------------
<4> [195.305589] gem_persistent_/1507 is trying to acquire lock:
<4> [195.305592] 0000000016c7d56e (i915.reset){+.+.}, at: i915_request_wait+0x148/0x940 [i915]
<4> [195.305671] 
but task is already holding lock:
<4> [195.305673] 000000007f2a8000 (&dev->struct_mutex/1){+.+.}, at: shrinker_lock+0x63/0x80 [i915]
<4> [195.305707] 
which lock already depends on the new lock.

<4> [195.305710] 
the existing dependency chain (in reverse order) is:
<4> [195.305712] 
-> #4 (&dev->struct_mutex/1){+.+.}:
<4> [195.305742]        i915_gem_shrinker_taints_mutex+0x52/0xe0 [i915]
<4> [195.305771]        i915_address_space_init+0x59/0x120 [i915]
<4> [195.305799]        i915_ggtt_init_hw+0x50/0x140 [i915]
<4> [195.305821]        i915_driver_load+0xebb/0x18b0 [i915]
<4> [195.305842]        i915_pci_probe+0x3f/0x1a0 [i915]
<4> [195.305847]        pci_device_probe+0x9e/0x120
<4> [195.305850]        really_probe+0xea/0x3c0
<4> [195.305853]        driver_probe_device+0x10b/0x120
<4> [195.305855]        device_driver_attach+0x4a/0x50
<4> [195.305857]        __driver_attach+0x97/0x130
<4> [195.305860]        bus_for_each_dev+0x74/0xc0
<4> [195.305862]        bus_add_driver+0x13f/0x210
<4> [195.305864]        driver_register+0x56/0xe0
<4> [195.305867]        do_one_initcall+0x58/0x300
<4> [195.305871]        do_init_module+0x56/0x1f6
<4> [195.305873]        load_module+0x24d1/0x2990
<4> [195.305875]        __se_sys_finit_module+0xd3/0xf0
<4> [195.305878]        do_syscall_64+0x55/0x1c0
<4> [195.305881]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [195.305883] 
-> #3 (fs_reclaim){+.+.}:
<4> [195.305888]        fs_reclaim_acquire.part.24+0x24/0x30
<4> [195.305891]        kmem_cache_alloc_trace+0x2a/0x290
<4> [195.305918]        i915_gem_object_get_pages_stolen+0xa2/0x130 [i915]
<4> [195.305945]        ____i915_gem_object_get_pages+0x1d/0xa0 [i915]
<4> [195.305971]        __i915_gem_object_get_pages+0x59/0xb0 [i915]
<4> [195.305998]        _i915_gem_object_create_stolen+0xd4/0x100 [i915]
<4> [195.306024]        i915_gem_object_create_stolen+0x67/0xb0 [i915]
<4> [195.306053]        i915_gem_init+0xe5/0xac0 [i915]
<4> [195.306073]        i915_driver_load+0xdc0/0x18b0 [i915]
<4> [195.306095]        i915_pci_probe+0x3f/0x1a0 [i915]
<4> [195.306098]        pci_device_probe+0x9e/0x120
<4> [195.306100]        really_probe+0xea/0x3c0
<4> [195.306102]        driver_probe_device+0x10b/0x120
<4> [195.306105]        device_driver_attach+0x4a/0x50
<4> [195.306107]        __driver_attach+0x97/0x130
<4> [195.306109]        bus_for_each_dev+0x74/0xc0
<4> [195.306111]        bus_add_driver+0x13f/0x210
<4> [195.306113]        driver_register+0x56/0xe0
<4> [195.306116]        do_one_initcall+0x58/0x300
<4> [195.306118]        do_init_module+0x56/0x1f6
<4> [195.306120]        load_module+0x24d1/0x2990
<4> [195.306122]        __se_sys_finit_module+0xd3/0xf0
<4> [195.306125]        do_syscall_64+0x55/0x1c0
<4> [195.306127]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [195.306129] 
-> #2 (&obj->mm.lock){+.+.}:
<4> [195.306133]        __mutex_lock+0x8c/0x960
<4> [195.306158]        i915_gem_object_pin_map+0x2d/0x2a0 [i915]
<4> [195.306182]        __engine_unpark+0x42/0x80 [i915]
<4> [195.306205]        __intel_wakeref_get_first+0x40/0xa0 [i915]
<4> [195.306233]        i915_request_create+0x101/0x240 [i915]
<4> [195.306260]        i915_gem_do_execbuffer+0xbbc/0x21b0 [i915]
<4> [195.306285]        i915_gem_execbuffer2_ioctl+0x11b/0x430 [i915]
<4> [195.306289]        drm_ioctl_kernel+0x83/0xf0
<4> [195.306291]        drm_ioctl+0x2f3/0x3b0
<4> [195.306294]        do_vfs_ioctl+0xa0/0x6e0
<4> [195.306297]        ksys_ioctl+0x35/0x60
<4> [195.306299]        __x64_sys_ioctl+0x11/0x20
<4> [195.306301]        do_syscall_64+0x55/0x1c0
<4> [195.306303]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [195.306305] 
-> #1 (wakeref#2/1){+.+.}:
<4> [195.306309]        __mutex_lock+0x8c/0x960
<4> [195.306331]        __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [195.306355]        reset_prepare_engine+0x9/0x30 [i915]
<4> [195.306379]        reset_prepare+0x29/0x40 [i915]
<4> [195.306403]        i915_reset+0x9f/0x3d0 [i915]
<4> [195.306427]        i915_reset_device+0xf2/0x170 [i915]
<4> [195.306450]        i915_handle_error+0x231/0x370 [i915]
<4> [195.306473]        i915_wedged_set+0x57/0xc0 [i915]
<4> [195.306477]        simple_attr_write+0xb0/0xd0
<4> [195.306480]        full_proxy_write+0x51/0x80
<4> [195.306483]        vfs_write+0xbd/0x1b0
<4> [195.306485]        ksys_write+0x8f/0xe0
<4> [195.306487]        do_syscall_64+0x55/0x1c0
<4> [195.306490]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [195.306492] 
-> #0 (i915.reset){+.+.}:
<4> [195.306496]        lock_acquire+0xa6/0x1c0
<4> [195.306524]        i915_request_wait+0x16f/0x940 [i915]
<4> [195.306552]        i915_gem_wait_for_idle+0xf5/0x5e0 [i915]
<4> [195.306579]        i915_gem_shrink+0x4b0/0x630 [i915]
<4> [195.306605]        i915_gem_shrink_all+0x2c/0x50 [i915]
<4> [195.306629]        i915_drop_caches_set+0x1de/0x270 [i915]
<4> [195.306631]        simple_attr_write+0xb0/0xd0
<4> [195.306634]        full_proxy_write+0x51/0x80
<4> [195.306636]        vfs_write+0xbd/0x1b0
<4> [195.306639]        ksys_write+0x8f/0xe0
<4> [195.306641]        do_syscall_64+0x55/0x1c0
<4> [195.306644]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [195.306646] 
other info that might help us debug this:

<4> [195.306649] Chain exists of:
  i915.reset --> fs_reclaim --> &dev->struct_mutex/1

<4> [195.306654]  Possible unsafe locking scenario:

<4> [195.306656]        CPU0                    CPU1
<4> [195.306658]        ----                    ----
<4> [195.306659]   lock(&dev->struct_mutex/1);
<4> [195.306661]                                lock(fs_reclaim);
<4> [195.306664]                                lock(&dev->struct_mutex/1);
<4> [195.306666]   lock(i915.reset);
<4> [195.306668] 
 *** DEADLOCK ***

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6250/re-icl-u/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html

<6> [558.123450] Console: switching to colour dummy device 80x25
<6> [558.123486] [IGT] gem_persistent_relocs: executing
<6> [558.139070] [IGT] gem_persistent_relocs: starting subtest forked-faulting-reloc-thrashing
<5> [558.144289] Setting dangerous option prefault_disable - tainting kernel
<5> [558.146790] Setting dangerous option prefault_disable - tainting kernel
<5> [558.150358] Setting dangerous option prefault_disable - tainting kernel
<5> [558.154106] Setting dangerous option prefault_disable - tainting kernel
<5> [558.157928] Setting dangerous option prefault_disable - tainting kernel
<5> [558.161480] Setting dangerous option prefault_disable - tainting kernel
<5> [558.164165] Setting dangerous option prefault_disable - tainting kernel
<5> [558.166581] Setting dangerous option prefault_disable - tainting kernel
<4> [558.932217] 
<4> [558.932222] ======================================================
<4> [558.932224] WARNING: possible circular locking dependency detected
<4> [558.932227] 5.2.0-rc4-CI-CI_DRM_6250+ #1 Tainted: G     U           
<4> [558.932229] ------------------------------------------------------
<4> [558.932231] gem_persistent_/4166 is trying to acquire lock:
<4> [558.932234] 00000000c23189ff (i915.reset){+.+.}, at: i915_request_wait+0x148/0x940 [i915]
<4> [558.932289] 
but task is already holding lock:
<4> [558.932292] 0000000042388a8d (&dev->struct_mutex/1){+.+.}, at: shrinker_lock+0x63/0x80 [i915]
<4> [558.932324] 
which lock already depends on the new lock.

<4> [558.932326] 
the existing dependency chain (in reverse order) is:
<4> [558.932329] 
-> #4 (&dev->struct_mutex/1){+.+.}:
<4> [558.932358]        i915_gem_shrinker_taints_mutex+0x52/0xe0 [i915]
<4> [558.932387]        i915_address_space_init+0x59/0x120 [i915]
<4> [558.932415]        i915_ggtt_init_hw+0x50/0x140 [i915]
<4> [558.932435]        i915_driver_load+0xebb/0x18b0 [i915]
<4> [558.932456]        i915_pci_probe+0x3f/0x1a0 [i915]
<4> [558.932460]        pci_device_probe+0x9e/0x120
<4> [558.932463]        really_probe+0xea/0x3c0
<4> [558.932466]        driver_probe_device+0x10b/0x120
<4> [558.932468]        device_driver_attach+0x4a/0x50
<4> [558.932470]        __driver_attach+0x97/0x130
<4> [558.932472]        bus_for_each_dev+0x74/0xc0
<4> [558.932475]        bus_add_driver+0x13f/0x210
<4> [558.932477]        driver_register+0x56/0xe0
<4> [558.932480]        do_one_initcall+0x58/0x300
<4> [558.932483]        do_init_module+0x56/0x1f6
<4> [558.932486]        load_module+0x24d1/0x2990
<4> [558.932488]        __se_sys_finit_module+0xd3/0xf0
<4> [558.932490]        do_syscall_64+0x55/0x1c0
<4> [558.932494]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [558.932496] 
-> #3 (fs_reclaim){+.+.}:
<4> [558.932500]        fs_reclaim_acquire.part.24+0x24/0x30
<4> [558.932503]        kmem_cache_alloc_trace+0x2a/0x290
<4> [558.932529]        i915_gem_object_get_pages_stolen+0xa2/0x130 [i915]
<4> [558.932555]        ____i915_gem_object_get_pages+0x1d/0xa0 [i915]
<4> [558.932580]        __i915_gem_object_get_pages+0x59/0xb0 [i915]
<4> [558.932605]        _i915_gem_object_create_stolen+0xd4/0x100 [i915]
<4> [558.932630]        i915_gem_object_create_stolen+0x67/0xb0 [i915]
<4> [558.932657]        i915_gem_init+0xe5/0xac0 [i915]
<4> [558.932677]        i915_driver_load+0xdc0/0x18b0 [i915]
<4> [558.932697]        i915_pci_probe+0x3f/0x1a0 [i915]
<4> [558.932700]        pci_device_probe+0x9e/0x120
<4> [558.932702]        really_probe+0xea/0x3c0
<4> [558.932704]        driver_probe_device+0x10b/0x120
<4> [558.932707]        device_driver_attach+0x4a/0x50
<4> [558.932709]        __driver_attach+0x97/0x130
<4> [558.932711]        bus_for_each_dev+0x74/0xc0
<4> [558.932713]        bus_add_driver+0x13f/0x210
<4> [558.932716]        driver_register+0x56/0xe0
<4> [558.932718]        do_one_initcall+0x58/0x300
<4> [558.932720]        do_init_module+0x56/0x1f6
<4> [558.932722]        load_module+0x24d1/0x2990
<4> [558.932724]        __se_sys_finit_module+0xd3/0xf0
<4> [558.932727]        do_syscall_64+0x55/0x1c0
<4> [558.932729]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [558.932731] 
-> #2 (&obj->mm.lock){+.+.}:
<4> [558.932734]        __mutex_lock+0x8c/0x960
<4> [558.932759]        i915_gem_object_pin_map+0x2d/0x2a0 [i915]
<4> [558.932792]        __engine_unpark+0x42/0x80 [i915]
<4> [558.932828]        __intel_wakeref_get_first+0x40/0xa0 [i915]
<4> [558.932867]        i915_request_create+0x101/0x240 [i915]
<4> [558.932908]        i915_gem_do_execbuffer+0xbbc/0x21b0 [i915]
<4> [558.932945]        i915_gem_execbuffer2_ioctl+0x11b/0x430 [i915]
<4> [558.932949]        drm_ioctl_kernel+0x83/0xf0
<4> [558.932952]        drm_ioctl+0x2f3/0x3b0
<4> [558.932956]        do_vfs_ioctl+0xa0/0x6e0
<4> [558.932959]        ksys_ioctl+0x35/0x60
<4> [558.932962]        __x64_sys_ioctl+0x11/0x20
<4> [558.932965]        do_syscall_64+0x55/0x1c0
<4> [558.932969]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [558.932972] 
-> #1 (wakeref#2/1){+.+.}:
<4> [558.932977]        __mutex_lock+0x8c/0x960
<4> [558.933012]        __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [558.933049]        reset_prepare_engine+0x9/0x30 [i915]
<4> [558.933085]        reset_prepare+0x29/0x40 [i915]
<4> [558.933121]        i915_reset+0x9f/0x3d0 [i915]
<4> [558.933156]        i915_reset_device+0xf2/0x170 [i915]
<4> [558.933192]        i915_handle_error+0x231/0x370 [i915]
<4> [558.933228]        i915_wedged_set+0x57/0xc0 [i915]
<4> [558.933232]        simple_attr_write+0xb0/0xd0
<4> [558.933236]        full_proxy_write+0x51/0x80
<4> [558.933240]        vfs_write+0xbd/0x1b0
<4> [558.933243]        ksys_write+0x8f/0xe0
<4> [558.933246]        do_syscall_64+0x55/0x1c0
<4> [558.933250]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [558.933253] 
-> #0 (i915.reset){+.+.}:
<4> [558.933258]        lock_acquire+0xa6/0x1c0
<4> [558.933296]        i915_request_wait+0x16f/0x940 [i915]
<4> [558.933334]        i915_gem_wait_for_idle+0xf5/0x5e0 [i915]
<4> [558.933371]        i915_gem_shrink+0x4b0/0x630 [i915]
<4> [558.933407]        i915_gem_shrink_all+0x2c/0x50 [i915]
<4> [558.933443]        i915_drop_caches_set+0x1de/0x270 [i915]
<4> [558.933446]        simple_attr_write+0xb0/0xd0
<4> [558.933450]        full_proxy_write+0x51/0x80
<4> [558.933453]        vfs_write+0xbd/0x1b0
<4> [558.933456]        ksys_write+0x8f/0xe0
<4> [558.933459]        do_syscall_64+0x55/0x1c0
<4> [558.933462]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [558.933466] 
other info that might help us debug this:

<4> [558.933470] Chain exists of:
  i915.reset --> fs_reclaim --> &dev->struct_mutex/1

<4> [558.933476]  Possible unsafe locking scenario:

<4> [558.933480]        CPU0                    CPU1
<4> [558.933482]        ----                    ----
<4> [558.933485]   lock(&dev->struct_mutex/1);
<4> [558.933488]                                lock(fs_reclaim);
<4> [558.933491]                                lock(&dev->struct_mutex/1);
<4> [558.933495]   lock(i915.reset);
<4> [558.933498] 
 *** DEADLOCK ***
Comment 1 CI Bug Log 2019-06-13 12:18:20 UTC
The CI Bug Log issue associated with this bug has been updated.

### New filters associated

* All machines: Igt@* - dmesg-warn - WARNING: possible circular locking dependency detected
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-apl1/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-apl3/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-glk1/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-glk5/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-glk6/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-hsw5/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-hsw6/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-hsw7/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-iclb7/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-iclb8/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-kbl3/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-kbl6/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-kbl7/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-skl10/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-skl2/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-skl6/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-snb2/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-snb4/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13245/shard-snb5/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6249/re-cml-u/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6249/re-cml-u/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-apl1/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-apl2/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-apl5/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-glk3/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-glk5/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-glk7/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-glk9/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-hsw5/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-hsw6/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-hsw7/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-iclb1/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-iclb1/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-kbl1/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-kbl3/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-kbl3/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6250/re-icl-u/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6250/re-icl-u/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-kbl7/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-skl7/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-skl9/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-snb4/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13249/shard-snb5/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-apl2/igt@gem_eio@context-create.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-apl3/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-apl4/igt@gem_eio@execbuf.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-apl5/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-apl8/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-glk3/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-glk4/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-glk5/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-glk6/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-glk7/igt@gem_eio@context-create.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-glk9/igt@gem_eio@execbuf.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-hsw2/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-hsw5/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-hsw8/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-kbl1/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-kbl2/igt@gem_eio@context-create.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-kbl3/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-kbl4/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-skl2/igt@gem_userptr_blits@sync-unmap-cycles.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-skl3/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-skl5/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-skl6/igt@kms_frontbuffer_tracking@fbc-modesetfrombusy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-skl7/igt@gem_eio@context-create.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-skl9/igt@gem_eio@execbuf.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6247/shard-snb7/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13063/shard-apl4/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13063/shard-apl6/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy-gup.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13063/shard-skl6/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
Comment 2 Chris Wilson 2019-06-13 12:24:48 UTC
That's a nasty cycle, and a plausible one, with potential user impact if the GPU hangs under memory pressure.
Comment 3 CI Bug Log 2019-06-13 12:58:33 UTC
A CI Bug Log filter associated with this bug has been updated:

{- All machines: Igt@* - dmesg-warn - WARNING: possible circular locking dependency detected -}
{+ All machines: Igt@* - dmesg-warn - WARNING: possible circular locking dependency detected +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6248/shard-apl7/igt@gem_eio@wait-wedge-1us.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6248/shard-iclb2/igt@gem_eio@in-flight-external.html
Comment 4 Chris Wilson 2019-06-13 14:27:19 UTC
Fwiw, this should be resolved (or replaced with an equally nasty cycle that we hopefully don't detect) by removing struct_mutex from the shrinker, as planned for the vm->mutex overhaul.
Comment 5 Chris Wilson 2019-06-17 07:12:55 UTC
*** Bug 110927 has been marked as a duplicate of this bug. ***
Comment 6 Chris Wilson 2019-06-23 13:12:43 UTC
https://patchwork.freedesktop.org/series/62594/
Comment 7 Jani Nikula 2019-06-26 14:44:36 UTC
IRC discussion on this:

 16:12            j4ni   ickle: so 33df8a7697a0 ("drm/i915: Prevent lock-cycles between GPU waits and GPU resets") adds extra lockdep annotation, which uncovers an actual existing bug?
 16:12           ickle   yes
 16:12            j4ni   and in that sense, the lockdep splat is expected
 16:12           ickle   it's a valid but unlikely warning
 16:16            j4ni   looking at https://bugs.freedesktop.org/show_bug.cgi?id=110913 and a random sample of the referenced logs, I can't find a lockdep splat that does *not* contain 
                         "i915.reset"
 16:17            j4ni   i.e. where are the other bugs?
 16:18           ickle   danvet is worried that CI might be misattributing bugs in pre-merge
 16:18            j4ni   right, so that hasn't actually happened yet
 16:19            j4ni   but alas it's bound to happen

We need to adjust the CI Bug Log filter to not conflate all lockdep warnings into one, and instead identify just the new warning added by the commit referenced above.
Comment 8 Chris Wilson 2019-06-26 18:30:21 UTC
commit de5147b8ce6d51f634661d7c531385371485cec6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 26 16:45:47 2019 +0100

    drm/i915: Add a wakeref getter for iff the wakeref is already active
    
    For use in the next patch, we want to acquire a wakeref without having
    to wake the device up -- i.e. only acquire the engine wakeref if the
    engine is already active.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-1-chris@chris-wilson.co.uk

commit 18398904ca9e3ddd180e2ecd45886e146b1d9d5b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 26 16:45:48 2019 +0100

    drm/i915: Only recover active engines
    
    If we issue a reset to a currently idle engine, leave it idle
    afterwards. This is useful to excise a linkage between reset and the
    shrinker. When waking the engine, we need to pin the default context
    image which we use for overwriting a guilty context -- if the engine is
    idle we do not need this pinned image! However, this pinning means that
    waking the engine acquires the FS_RECLAIM, and so may trigger the
    shrinker. The shrinker itself may need to wait upon the GPU to unbind
    an object and so may require services of reset; ergo we should avoid
    the engine wake up path.
    
    The danger in skipping the recovery for idle engines is that we leave the
    engine with no context defined, which may interfere with the operation of
    the power context on some older platforms. In practice, we should only
    be resetting an active GPU but it is something to look out for on Ironlake
    (if memory serves).
    
    Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Imre Deak <imre.deak@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-2-chris@chris-wilson.co.uk

commit 092be382a2602067766f190a113514d469162456 (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 26 16:45:49 2019 +0100

    drm/i915: Lift intel_engines_resume() to callers
    
    Since the reset path wants to recover the engines itself, it only wants
    to reinitialise the hardware using i915_gem_init_hw(). Pull the call to
    intel_engines_resume() to the module init/resume path so we can avoid it
    during reset.
    
    Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Imre Deak <imre.deak@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-3-chris@chris-wilson.co.uk
Comment 9 CI Bug Log 2019-07-02 11:02:02 UTC
A CI Bug Log filter associated with this bug has been updated:

{- All machines: Igt@* - dmesg-warn - WARNING: possible circular locking dependency detected -}
{+ All machines: all tests - dmesg-warn - dev_priv->gpu_error.wedge_mutex +}


  No new failures caught with the new filter

