Bug 110927 - [CI][SHARDS] igt@gem_exec_schedule@semaphore-resolve - dmesg-fail - WARNING: possible circular locking dependency detected
Summary: [CI][SHARDS] igt@gem_exec_schedule@semaphore-resolve - dmesg-fail - WARNING: ...
Status: RESOLVED DUPLICATE of bug 110913
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other
OS: All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-06-17 05:31 UTC by Martin Peres
Modified: 2019-07-02 11:07 UTC
CC List: 1 user

See Also:
i915 platform: ALL
i915 features: GEM/Other



Description Martin Peres 2019-06-17 05:31:59 UTC
The test went from plain failing (https://bugs.freedesktop.org/show_bug.cgi?id=110519) to failing with a lockdep WARNING in dmesg:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6272/shard-skl9/igt@gem_exec_schedule@semaphore-resolve.html

Starting subtest: semaphore-resolve
(gem_exec_schedule:2891) igt_aux-CRITICAL: Test assertion failure function sig_abort, file ../lib/igt_aux.c:502:
(gem_exec_schedule:2891) igt_aux-CRITICAL: Failed assertion: !"GPU hung"
Subtest semaphore-resolve failed.

<6> [759.522330] i915 0000:00:02.0: GPU HANG: ecode 9:1:0xfffffffe, in gem_exec_schedu [2891], hang on rcs0
<6> [759.522623] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6> [759.522645] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6> [759.522664] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6> [759.522681] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6> [759.522700] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<7> [759.532843] [drm:i915_reset_device [i915]] resetting chip
<5> [759.538823] i915 0000:00:02.0: Resetting chip for hang on rcs0
<4> [759.540201] 
<4> [759.540227] ======================================================
<4> [759.540259] WARNING: possible circular locking dependency detected
<4> [759.540297] 5.2.0-rc4-CI-CI_DRM_6272+ #1 Tainted: G     U           
<4> [759.540327] ------------------------------------------------------
<4> [759.540361] kworker/0:1/2700 is trying to acquire lock:
<4> [759.540392] 000000005112029f (wakeref#2/1){+.+.}, at: __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.540717] 
but task is already holding lock:
<4> [759.540748] 000000000de18043 (i915.reset){+.+.}, at: i915_reset+0x57/0x3d0 [i915]
<4> [759.541087] 
which lock already depends on the new lock.

<4> [759.541126] 
the existing dependency chain (in reverse order) is:
<4> [759.541160] 
-> #4 (i915.reset){+.+.}:
<4> [759.541551]        i915_request_wait+0x16f/0x940 [i915]
<4> [759.541922]        i915_gem_wait_for_idle+0xf5/0x570 [i915]
<4> [759.542275]        i915_gem_shrink+0x4b0/0x630 [i915]
<4> [759.542609]        i915_gem_shrink_all+0x2c/0x50 [i915]
<4> [759.542917]        i915_drop_caches_set+0x1de/0x270 [i915]
<4> [759.542958]        simple_attr_write+0xb0/0xd0
<4> [759.542996]        full_proxy_write+0x51/0x80
<4> [759.543031]        vfs_write+0xbd/0x1b0
<4> [759.543062]        ksys_write+0x8f/0xe0
<4> [759.543093]        do_syscall_64+0x55/0x1c0
<4> [759.543129]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.543159] 
-> #3 (&dev->struct_mutex/1){+.+.}:
<4> [759.543522]        i915_gem_shrinker_taints_mutex+0x52/0xe0 [i915]
<4> [759.543889]        i915_address_space_init+0x59/0x120 [i915]
<4> [759.544255]        i915_ggtt_init_hw+0x50/0x150 [i915]
<4> [759.544526]        i915_driver_load+0xebb/0x18b0 [i915]
<4> [759.544801]        i915_pci_probe+0x3f/0x1a0 [i915]
<4> [759.544836]        pci_device_probe+0x9e/0x120
<4> [759.544870]        really_probe+0xea/0x3c0
<4> [759.544901]        driver_probe_device+0x10b/0x120
<4> [759.544936]        device_driver_attach+0x4a/0x50
<4> [759.544970]        __driver_attach+0x97/0x130
<4> [759.545001]        bus_for_each_dev+0x74/0xc0
<4> [759.545032]        bus_add_driver+0x13f/0x210
<4> [759.545066]        driver_register+0x56/0xe0
<4> [759.545097]        do_one_initcall+0x58/0x300
<4> [759.545132]        do_init_module+0x56/0x1f6
<4> [759.545164]        load_module+0x24d1/0x2990
<4> [759.545196]        __se_sys_finit_module+0xd3/0xf0
<4> [759.545229]        do_syscall_64+0x55/0x1c0
<4> [759.545264]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.545293] 
-> #2 (fs_reclaim){+.+.}:
<4> [759.545344]        fs_reclaim_acquire.part.24+0x24/0x30
<4> [759.545381]        kmem_cache_alloc_trace+0x2a/0x290
<4> [759.545730]        i915_gem_object_get_pages_stolen+0xa2/0x130 [i915]
<4> [759.546084]        ____i915_gem_object_get_pages+0x1d/0xa0 [i915]
<4> [759.546436]        __i915_gem_object_get_pages+0x59/0xb0 [i915]
<4> [759.546788]        _i915_gem_object_create_stolen+0xd4/0x100 [i915]
<4> [759.547143]        i915_gem_object_create_stolen_for_preallocated+0xf0/0x550 [i915]
<4> [759.547548]        intel_alloc_initial_plane_obj.isra.119+0xc6/0x1d0 [i915]
<4> [759.547954]        intel_modeset_init+0x8b8/0x1a20 [i915]
<4> [759.548223]        i915_driver_load+0xdb1/0x18b0 [i915]
<4> [759.548497]        i915_pci_probe+0x3f/0x1a0 [i915]
<4> [759.548533]        pci_device_probe+0x9e/0x120
<4> [759.548567]        really_probe+0xea/0x3c0
<4> [759.548599]        driver_probe_device+0x10b/0x120
<4> [759.548639]        device_driver_attach+0x4a/0x50
<4> [759.548673]        __driver_attach+0x97/0x130
<4> [759.548706]        bus_for_each_dev+0x74/0xc0
<4> [759.548737]        bus_add_driver+0x13f/0x210
<4> [759.548771]        driver_register+0x56/0xe0
<4> [759.548803]        do_one_initcall+0x58/0x300
<4> [759.548836]        do_init_module+0x56/0x1f6
<4> [759.548869]        load_module+0x24d1/0x2990
<4> [759.548902]        __se_sys_finit_module+0xd3/0xf0
<4> [759.548936]        do_syscall_64+0x55/0x1c0
<4> [759.548970]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.549001] 
-> #1 (&obj->mm.lock){+.+.}:
<4> [759.549046]        __mutex_lock+0x8c/0x960
<4> [759.549391]        i915_gem_object_pin_map+0x2d/0x2a0 [i915]
<4> [759.549701]        __engine_unpark+0x42/0x80 [i915]
<4> [759.549992]        __intel_wakeref_get_first+0x40/0xa0 [i915]
<4> [759.550369]        i915_request_create+0x101/0x240 [i915]
<4> [759.550709]        i915_gem_do_execbuffer+0xb07/0x20f0 [i915]
<4> [759.551050]        i915_gem_execbuffer2_ioctl+0x11b/0x430 [i915]
<4> [759.551090]        drm_ioctl_kernel+0x83/0xf0
<4> [759.551123]        drm_ioctl+0x2f3/0x3b0
<4> [759.551153]        do_vfs_ioctl+0xa0/0x6e0
<4> [759.551184]        ksys_ioctl+0x35/0x60
<4> [759.551214]        __x64_sys_ioctl+0x11/0x20
<4> [759.551246]        do_syscall_64+0x55/0x1c0
<4> [759.551280]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.551309] 
-> #0 (wakeref#2/1){+.+.}:
<4> [759.551363]        lock_acquire+0xa6/0x1c0
<4> [759.551394]        __mutex_lock+0x8c/0x960
<4> [759.551679]        __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.551999]        reset_prepare_engine+0x9/0x30 [i915]
<4> [759.552317]        reset_prepare+0x29/0x40 [i915]
<4> [759.552636]        i915_reset+0x9f/0x3d0 [i915]
<4> [759.552953]        i915_reset_device+0xf2/0x170 [i915]
<4> [759.553272]        i915_handle_error+0x231/0x370 [i915]
<4> [759.553580]        i915_hangcheck_elapsed+0x41c/0x530 [i915]
<4> [759.553620]        process_one_work+0x245/0x610
<4> [759.553651]        worker_thread+0x37/0x380
<4> [759.553684]        kthread+0x119/0x130
<4> [759.553716]        ret_from_fork+0x3a/0x50
<4> [759.553743] 
other info that might help us debug this:

<4> [759.553784] Chain exists of:
  wakeref#2/1 --> &dev->struct_mutex/1 --> i915.reset

<4> [759.553852]  Possible unsafe locking scenario:

<4> [759.553884]        CPU0                    CPU1
<4> [759.553909]        ----                    ----
<4> [759.553934]   lock(i915.reset);
<4> [759.553959]                                lock(&dev->struct_mutex/1);
<4> [759.553996]                                lock(i915.reset);
<4> [759.554028]   lock(wakeref#2/1);
<4> [759.554059] 
 *** DEADLOCK ***

<4> [759.554095] 4 locks held by kworker/0:1/2700:
<4> [759.554120]  #0: 00000000c7971d7e ((wq_completion)events_long){+.+.}, at: process_one_work+0x1bf/0x610
<4> [759.554182]  #1: 000000003ad9146c ((work_completion)(&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: process_one_work+0x1bf/0x610
<4> [759.554252]  #2: 000000005f0d6d67 (&dev_priv->gpu_error.wedge_mutex){+.+.}, at: i915_reset_device+0xe4/0x170 [i915]
<4> [759.554599]  #3: 000000000de18043 (i915.reset){+.+.}, at: i915_reset+0x57/0x3d0 [i915]
<4> [759.554941] 
stack backtrace:
<4> [759.554979] CPU: 0 PID: 2700 Comm: kworker/0:1 Tainted: G     U            5.2.0-rc4-CI-CI_DRM_6272+ #1
<4> [759.555025] Hardware name: Google Caroline/Caroline, BIOS MrChromebox 08/27/2018
<4> [759.555345] Workqueue: events_long i915_hangcheck_elapsed [i915]
<4> [759.555381] Call Trace:
<4> [759.555413]  dump_stack+0x67/0x9b
<4> [759.555453]  print_circular_bug+0x1c8/0x2b0
<4> [759.555493]  __lock_acquire+0x1ce9/0x24c0
<4> [759.555537]  ? wake_up_klogd+0x4a/0x60
<4> [759.555570]  ? __bfs+0xe8/0x220
<4> [759.555608]  ? lock_acquire+0xa6/0x1c0
<4> [759.555643]  lock_acquire+0xa6/0x1c0
<4> [759.555931]  ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.555973]  __mutex_lock+0x8c/0x960
<4> [759.556257]  ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.556564]  ? intel_engine_stop_cs+0x1f/0xb0 [i915]
<4> [759.556853]  ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.556899]  ? lock_acquire+0xa6/0x1c0
<4> [759.557207]  ? intel_engine_cs_mock_selftests+0x20/0x20 [i915]
<4> [759.557497]  ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.557786]  __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.557828]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
<4> [759.558149]  reset_prepare_engine+0x9/0x30 [i915]
<4> [759.558470]  reset_prepare+0x29/0x40 [i915]
<4> [759.558789]  i915_reset+0x9f/0x3d0 [i915]
<4> [759.559113]  i915_reset_device+0xf2/0x170 [i915]
<4> [759.559439]  ? i915_gem_set_wedged+0x60/0x60 [i915]
<4> [759.559483]  ? queue_work_node+0x70/0x70
<4> [759.559763]  i915_handle_error+0x231/0x370 [i915]
<4> [759.559812]  ? string+0x40/0x50
<4> [759.560133]  i915_hangcheck_elapsed+0x41c/0x530 [i915]
<4> [759.560178]  ? __drm_printfn_info+0x20/0x20
<4> [759.560222]  ? __lock_acquire+0x530/0x24c0
<4> [759.560268]  ? debug_object_deactivate+0x137/0x160
<4> [759.560315]  ? lock_acquire+0xa6/0x1c0
<4> [759.560349]  ? process_one_work+0x1bf/0x610
<4> [759.560387]  process_one_work+0x245/0x610
<4> [759.560430]  worker_thread+0x37/0x380
<4> [759.560465]  ? process_one_work+0x610/0x610
<4> [759.560499]  kthread+0x119/0x130
<4> [759.560532]  ? kthread_park+0x80/0x80
<4> [759.560568]  ret_from_fork+0x3a/0x50
<7> [759.561198] [drm:i915_reset_request [i915]] client gem_exec_schedu[2891]: gained 1 ban score, now 1
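
To make the lockdep report above easier to follow, here is a minimal userspace sketch of the reported lock-order inversion. It uses pthreads rather than kernel mutexes, and the lock names (wakeref, struct_mutex, reset) are illustrative stand-ins taken from the report, not actual i915 code.

```c
/*
 * Illustrative only: the existing chain says the wakeref is (transitively)
 * taken before struct_mutex, which is taken before i915.reset (steps #1-#4),
 * while the hang-recovery path takes the wakeref while already holding
 * i915.reset (step #0).  That closes the cycle
 *   wakeref -> struct_mutex -> reset -> wakeref
 * so the two paths can deadlock if they interleave.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t wakeref      = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t struct_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t reset        = PTHREAD_MUTEX_INITIALIZER;

/* Mirrors the historical ordering from the dependency chain: the request
 * path ends up holding the wakeref and struct_mutex before waiting on the
 * reset lock. */
static void *request_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&wakeref);
	pthread_mutex_lock(&struct_mutex);
	sleep(1);                     /* widen the race window */
	pthread_mutex_lock(&reset);   /* blocks: reset held by the other thread */
	pthread_mutex_unlock(&reset);
	pthread_mutex_unlock(&struct_mutex);
	pthread_mutex_unlock(&wakeref);
	return NULL;
}

/* Mirrors the hang-recovery path: the reset lock is held while the engine
 * wakeref is acquired. */
static void *reset_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&reset);
	sleep(1);                     /* widen the race window */
	pthread_mutex_lock(&wakeref); /* blocks: wakeref held by the other thread */
	pthread_mutex_unlock(&wakeref);
	pthread_mutex_unlock(&reset);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, request_path, NULL);
	pthread_create(&b, NULL, reset_path, NULL);
	pthread_join(a, NULL);        /* never returns once both threads block */
	pthread_join(b, NULL);
	puts("no deadlock this run"); /* only reached if the race is missed */
	return 0;
}
```

Built with `cc -pthread`, this will usually hang, which is exactly the AB-BA-style cycle lockdep is warning about before it can actually happen.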
Comment 1 CI Bug Log 2019-06-17 05:33:12 UTC
The CI Bug Log issue associated with this bug has been updated.

### New filters associated

* All machines: igt@gem_exec_schedule@semaphore-resolve - dmesg-fail - WARNING: possible circular locking dependency detected
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6263/shard-kbl6/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3146/shard-apl8/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3147/shard-apl6/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6268/shard-glk4/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6269/shard-apl8/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6270/shard-apl5/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6270/shard-glk1/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6271/shard-glk1/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6272/shard-glk1/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6272/shard-kbl3/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6272/shard-skl9/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13285/shard-glk1/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13286/shard-glk5/igt@gem_exec_schedule@semaphore-resolve.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13286/shard-skl3/igt@gem_exec_schedule@semaphore-resolve.html
Comment 2 Chris Wilson 2019-06-17 07:12:55 UTC

*** This bug has been marked as a duplicate of bug 110913 ***
Comment 3 CI Bug Log 2019-07-02 11:07:58 UTC
The CI Bug Log issue associated with this bug has been archived.

New failures matching the above filters will no longer be associated with this bug.

