Bug 100099 - [BAT] [KBL] nvme driver is using a mutex inside an rcu callback causing igt@gem_exec_suspend@basic-s4-devices to fail with dmesg-warn
Summary: [BAT] [KBL] nvme driver is using a mutex inside an rcu callback causing igt@gem_exec_suspend@basic-s4-devices to fail with dmesg-warn
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: highest critical
Assignee: Intel GFX Bugs mailing list
QA Contact: Jairo Miramontes
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-07 12:02 UTC by Jari Tahvanainen
Modified: 2017-03-30 16:32 UTC
CC List: 1 user

See Also:
i915 platform: KBL
i915 features: GEM/Other


Attachments

Description Jari Tahvanainen 2017-03-07 12:02:21 UTC
Starting from CI_DRM_2252, igt@gem_exec_suspend@basic-s4-devices has had dmesg-warn as its end state.
See https://intel-gfx-ci.01.org/CI/CI_DRM_2296/fi-kbl-7500u/igt@gem_exec_suspend@basic-s4-devices.html for the latest output and links to dmesg files. Some data copied below:

Command:
/opt/igt/tests/gem_exec_suspend --run-subtest basic-S4-devices
Dmesg:
[  245.192104] Suspending console(s) (use no_console_suspend to debug)

[  245.294768] ======================================================
[  245.294769] [ INFO: possible circular locking dependency detected ]
[  245.294771] 4.11.0-rc1-CI-CI_DRM_2296+ #1 Not tainted
[  245.294772] -------------------------------------------------------
[  245.294773] kworker/u8:1/72 is trying to acquire lock:
[  245.294774]  (rcu_preempt_state.barrier_mutex){+.+...}, at: [<ffffffff81100731>] _rcu_barrier+0x31/0x160
[  245.294782] 
               but task is already holding lock:
[  245.294783]  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa00bbc48>] i915_gem_freeze+0x18/0x30 [i915]
[  245.294827] 
               which lock already depends on the new lock.

[  245.294828] 
               the existing dependency chain (in reverse order) is:
[  245.294829] 
               -> #5 (&dev->struct_mutex){+.+.+.}:
[  245.294836]        lock_acquire+0xc9/0x220
[  245.294842]        drm_gem_object_put_unlocked+0x40/0xc0
[  245.294845]        drm_gem_mmap+0xd1/0x100
[  245.294849]        mmap_region+0x39e/0x600
[  245.294851]        do_mmap+0x44c/0x500
[  245.294853]        vm_mmap_pgoff+0x9d/0xd0
[  245.294856]        SyS_mmap_pgoff+0x182/0x220
[  245.294858]        SyS_mmap+0x16/0x20
[  245.294862]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  245.294863] 
               -> #4 (&mm->mmap_sem){++++++}:
[  245.294868]        lock_acquire+0xc9/0x220
[  245.294871]        __might_fault+0x6b/0x90
[  245.294873]        filldir+0xb2/0x140
[  245.294876]        dcache_readdir+0xf4/0x170
[  245.294877]        iterate_dir+0x7a/0x170
[  245.294879]        SyS_getdents+0xa2/0x150
[  245.294881]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  245.294882] 
               -> #3 (&sb->s_type->i_mutex_key#3){+++++.}:
[  245.294888]        lock_acquire+0xc9/0x220
[  245.294890]        down_write+0x3f/0x70
[  245.294894]        start_creating+0x5a/0x100
[  245.294897]        debugfs_create_dir+0xc/0xd0
[  245.294900]        blk_mq_debugfs_register+0x1c/0x70
[  245.294904]        blk_mq_register_dev+0xbd/0x120
[  245.294908]        blk_register_queue+0x4c/0x150
[  245.294910]        device_add_disk+0x23b/0x4f0
[  245.294914]        nvme_validate_ns+0x1fa/0x260
[  245.294917]        nvme_scan_work+0x188/0x300
[  245.294921]        process_one_work+0x1f4/0x6d0
[  245.294924]        worker_thread+0x49/0x4a0
[  245.294927]        kthread+0x107/0x140
[  245.294929]        ret_from_fork+0x2e/0x40
[  245.294929] 
               -> #2 (all_q_mutex){+.+.+.}:
[  245.294935]        lock_acquire+0xc9/0x220
[  245.294938]        __mutex_lock+0x6e/0x990
[  245.294941]        mutex_lock_nested+0x16/0x20
[  245.294944]        blk_mq_queue_reinit_work+0x13/0x110
[  245.294947]        blk_mq_queue_reinit_dead+0x17/0x20
[  245.294950]        cpuhp_invoke_callback+0x9c/0x8a0
[  245.294953]        cpuhp_down_callbacks+0x3d/0x80
[  245.294956]        _cpu_down+0xad/0xe0
[  245.294959]        freeze_secondary_cpus+0xb1/0x3d0
[  245.294962]        suspend_devices_and_enter+0x4a0/0xca0
[  245.294964]        pm_suspend+0x59e/0xa50
[  245.294966]        state_store+0x7d/0xf0
[  245.294969]        kobj_attr_store+0xf/0x20
[  245.294972]        sysfs_kf_write+0x40/0x50
[  245.294974]        kernfs_fop_write+0x130/0x1b0
[  245.294977]        __vfs_write+0x23/0x120
[  245.294979]        vfs_write+0xc6/0x1f0
[  245.294982]        SyS_write+0x44/0xb0
[  245.294984]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  245.294985] 
               -> #1 (cpu_hotplug.lock){+.+.+.}:
[  245.294990]        lock_acquire+0xc9/0x220
[  245.294993]        __mutex_lock+0x6e/0x990
[  245.294996]        mutex_lock_nested+0x16/0x20
[  245.294999]        get_online_cpus+0x61/0x80
[  245.295002]        _rcu_barrier+0x9f/0x160
[  245.295004]        rcu_barrier+0x10/0x20
[  245.295008]        netdev_run_todo+0x5f/0x310
[  245.295011]        rtnl_unlock+0x9/0x10
[  245.295013]        default_device_exit_batch+0x133/0x150
[  245.295017]        ops_exit_list.isra.0+0x4d/0x60
[  245.295020]        cleanup_net+0x1d8/0x2c0
[  245.295023]        process_one_work+0x1f4/0x6d0
[  245.295026]        worker_thread+0x49/0x4a0
[  245.295028]        kthread+0x107/0x140
[  245.295030]        ret_from_fork+0x2e/0x40
[  245.295030] 
               -> #0 (rcu_preempt_state.barrier_mutex){+.+...}:
[  245.295036]        __lock_acquire+0x1b44/0x1b50
[  245.295039]        lock_acquire+0xc9/0x220
[  245.295042]        __mutex_lock+0x6e/0x990
[  245.295045]        mutex_lock_nested+0x16/0x20
[  245.295047]        _rcu_barrier+0x31/0x160
[  245.295049]        rcu_barrier+0x10/0x20
[  245.295089]        i915_gem_shrink_all+0x33/0x40 [i915]
[  245.295126]        i915_gem_freeze+0x20/0x30 [i915]
[  245.295152]        i915_pm_freeze+0x1d/0x20 [i915]
[  245.295155]        pci_pm_freeze+0x59/0xe0
[  245.295157]        dpm_run_callback+0x6f/0x330
[  245.295159]        __device_suspend+0xea/0x330
[  245.295161]        async_suspend+0x1a/0x90
[  245.295165]        async_run_entry_fn+0x34/0x160
[  245.295168]        process_one_work+0x1f4/0x6d0
[  245.295171]        worker_thread+0x49/0x4a0
[  245.295173]        kthread+0x107/0x140
[  245.295175]        ret_from_fork+0x2e/0x40
[  245.295176] 
               other info that might help us debug this:

[  245.295177] Chain exists of:
                 rcu_preempt_state.barrier_mutex --> &mm->mmap_sem --> &dev->struct_mutex

[  245.295181]  Possible unsafe locking scenario:

[  245.295182]        CPU0                    CPU1
[  245.295183]        ----                    ----
[  245.295183]   lock(&dev->struct_mutex);
[  245.295185]                                lock(&mm->mmap_sem);
[  245.295187]                                lock(&dev->struct_mutex);
[  245.295189]   lock(rcu_preempt_state.barrier_mutex);
[  245.295191] 
                *** DEADLOCK ***

[  245.295192] 4 locks held by kworker/u8:1/72:
[  245.295193]  #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff8109dcae>] process_one_work+0x16e/0x6d0
[  245.295199]  #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff8109dcae>] process_one_work+0x16e/0x6d0
[  245.295204]  #2:  (&dev->mutex){......}, at: [<ffffffff815d5e34>] __device_suspend+0xb4/0x330
[  245.295208]  #3:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa00bbc48>] i915_gem_freeze+0x18/0x30 [i915]
[  245.295247] 
               stack backtrace:
[  245.295250] CPU: 2 PID: 72 Comm: kworker/u8:1 Not tainted 4.11.0-rc1-CI-CI_DRM_2296+ #1
[  245.295251] Hardware name: GIGABYTE GB-BKi7A-7500/MFLP7AP-00, BIOS F1 07/27/2016
[  245.295256] Workqueue: events_unbound async_run_entry_fn
[  245.295258] Call Trace:
[  245.295262]  dump_stack+0x67/0x92
[  245.295266]  print_circular_bug+0x1e0/0x2e0
[  245.295269]  __lock_acquire+0x1b44/0x1b50
[  245.295273]  ? trace_hardirqs_on+0xd/0x10
[  245.295277]  lock_acquire+0xc9/0x220
[  245.295280]  ? _rcu_barrier+0x31/0x160
[  245.295282]  ? _rcu_barrier+0x31/0x160
[  245.295286]  __mutex_lock+0x6e/0x990
[  245.295288]  ? _rcu_barrier+0x31/0x160
[  245.295290]  ? _rcu_barrier+0x31/0x160
[  245.295293]  ? synchronize_rcu_expedited+0x35/0xb0
[  245.295296]  ? _raw_spin_unlock_irqrestore+0x52/0x60
[  245.295300]  mutex_lock_nested+0x16/0x20
[  245.295302]  _rcu_barrier+0x31/0x160
[  245.295305]  rcu_barrier+0x10/0x20
[  245.295343]  i915_gem_shrink_all+0x33/0x40 [i915]
[  245.295378]  i915_gem_freeze+0x20/0x30 [i915]
[  245.295404]  i915_pm_freeze+0x1d/0x20 [i915]
[  245.295406]  pci_pm_freeze+0x59/0xe0
[  245.295409]  dpm_run_callback+0x6f/0x330
[  245.295411]  ? pci_pm_poweroff+0xf0/0xf0
[  245.295413]  __device_suspend+0xea/0x330
[  245.295415]  async_suspend+0x1a/0x90
[  245.295419]  async_run_entry_fn+0x34/0x160
[  245.295422]  process_one_work+0x1f4/0x6d0
[  245.295425]  ? process_one_work+0x16e/0x6d0
[  245.295429]  worker_thread+0x49/0x4a0
[  245.295432]  kthread+0x107/0x140
[  245.295435]  ? process_one_work+0x6d0/0x6d0
[  245.295437]  ? kthread_create_on_node+0x40/0x40
[  245.295440]  ret_from_fork+0x2e/0x40
[  250.434263] usb usb1: root hub lost power or was reset
[  250.434265] usb usb2: root hub lost power or was reset
[  250.434719] usb usb3: root hub lost power or was reset
[  250.434720] usb usb4: root hub lost power or was reset
Comment 1 Chris Wilson 2017-03-07 12:08:30 UTC
A quick way to break this cycle (that may or may not be liked much by upstream) would be to add unique classes for each mm.
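For illustration, a minimal sketch of what "unique classes for each mm" could look like, assuming a hypothetical helper that re-classes each mm's mmap_sem with its own lockdep key (lockdep_set_class() is the real kernel API; the per-mm key and the call site are assumptions, not an actual patch):

#include <linux/lockdep.h>
#include <linux/mm_types.h>

/*
 * Sketch only: re-class one mm's mmap_sem so lockdep tracks its
 * dependencies per instance instead of against a single shared class.
 * A real change would store @key per mm (hypothetical field) and call
 * this from mm initialization.
 */
static inline void mm_set_unique_mmap_sem_class(struct mm_struct *mm,
                                                struct lock_class_key *key)
{
        lockdep_set_class(&mm->mmap_sem, key);
}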
Comment 2 Chris Wilson 2017-03-07 12:16:17 UTC
Also

[  245.294829] 
               -> #5 (&dev->struct_mutex){+.+.+.}:
[  245.294836]        lock_acquire+0xc9/0x220
[  245.294842]        drm_gem_object_put_unlocked+0x40/0xc0
[  245.294845]        drm_gem_mmap+0xd1/0x100
[  245.294849]        mmap_region+0x39e/0x600
[  245.294851]        do_mmap+0x44c/0x500
[  245.294853]        vm_mmap_pgoff+0x9d/0xd0
[  245.294856]        SyS_mmap_pgoff+0x182/0x220
[  245.294858]        SyS_mmap+0x16/0x20
[  245.294862]        entry_SYSCALL_64_fastpath+0x1c/0xb1

is a false chain. We still do take the struct_mutex underneath mm->mmap_sem via i915_gem_fault() though.
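For readers following the chain, a simplified, illustrative sketch (not the real i915 code) of why mmap_sem -> struct_mutex remains a genuine dependency via the fault path:

#include <linux/mutex.h>

/*
 * Illustrative only: the GTT fault handler (i915_gem_fault()) runs with
 * the faulting task's mm->mmap_sem already held and then takes
 * dev->struct_mutex, recording mmap_sem -> struct_mutex for lockdep.
 */
static inline void fault_path_sketch(struct mutex *struct_mutex)
{
        /* caller (the page-fault path) already holds current->mm->mmap_sem */
        mutex_lock(struct_mutex);
        /* ... pin the object and insert PTEs here ... */
        mutex_unlock(struct_mutex);
}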
Comment 3 Chris Wilson 2017-03-29 22:12:15 UTC
The code was rearranged (for another reason) so that we no longer call rcu_barrier() from the shrinker context, which is enough to break this cycle. I still consider the dependency chain within nvme not to be our bug, though.
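A rough sketch of that shape, assuming the barrier simply moves ahead of the struct_mutex-protected section (names are illustrative; the actual upstream change may differ):

#include <linux/mutex.h>
#include <linux/rcupdate.h>

/*
 * Sketch: wait for RCU-deferred frees before taking struct_mutex, so
 * rcu_barrier()'s internal locks are never acquired while struct_mutex
 * is held and the #0 -> #5 cycle above cannot form.
 */
static inline void freeze_path_sketch(struct mutex *struct_mutex)
{
        rcu_barrier();                  /* flush deferred frees first */

        mutex_lock(struct_mutex);
        /* ... shrink/evict objects, no rcu_barrier() in here ... */
        mutex_unlock(struct_mutex);
}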

