103830 – [CI] igt@gem_userptr_blits@dmabuf-unsync - dmesg-warn - WARNING: possible circular locking dependency detected

Status:	CLOSED FIXED

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	DRI git
Hardware:	Other All

Importance:	medium normal
Assignee:	Marta Löfstedt
QA Contact:	Intel GFX Bugs mailing list

Whiteboard:	ReadyForDev

Duplicates (1):	103730

Reported:	2017-11-21 07:13 UTC by Marta Löfstedt
Modified:	2017-11-23 00:16 UTC
CC List:	1 user

i915 platform:	HSW
i915 features:	GEM/Other

Description Marta Löfstedt 2017-11-21 07:13:15 UTC

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_3993/shard-hsw5/igt@gem_userptr_blits@dmabuf-unsync.html

[ 5718.368156] ======================================================
[ 5718.368159] WARNING: possible circular locking dependency detected
[ 5718.368163] 4.14.0-CI-CI_DRM_3364+ #1 Tainted: G     U  W      
[ 5718.368166] ------------------------------------------------------
[ 5718.368169] gem_userptr_bli/26317 is trying to acquire lock:
[ 5718.368172]  (&mm->mmap_sem){++++}, at: [<ffffffffa0203cbf>] i915_gem_userptr_init__mmu_notifier+0x1af/0x360 [i915]
[ 5718.368227] 
               but task is already holding lock:
[ 5718.368230]  (&dev->object_name_lock){+.+.}, at: [<ffffffff815d91e2>] drm_gem_prime_handle_to_fd+0xe2/0x1b0
[ 5718.368238] 
               which lock already depends on the new lock.

[ 5718.368242] 
               the existing dependency chain (in reverse order) is:
[ 5718.368246] 
               -> #2 (&dev->object_name_lock){+.+.}:
[ 5718.368253]        __mutex_lock+0x86/0x9b0
[ 5718.368256]        drm_gem_handle_create+0x24/0x40
[ 5718.368281]        igt_ctx_exec+0x611/0xe30 [i915]
[ 5718.368313]        __i915_subtests+0x34/0xc0 [i915]
[ 5718.368343]        __run_selftests+0x11c/0x1c0 [i915]
[ 5718.368374]        i915_live_selftests+0x31/0x60 [i915]
[ 5718.368396]        i915_pci_probe+0x45/0x90 [i915]
[ 5718.368400]        pci_device_probe+0xa1/0x130
[ 5718.368404]        driver_probe_device+0x293/0x440
[ 5718.368408]        __driver_attach+0xde/0xe0
[ 5718.368411]        bus_for_each_dev+0x5c/0x90
[ 5718.368414]        bus_add_driver+0x16d/0x260
[ 5718.368418]        driver_register+0x57/0xc0
[ 5718.368421]        do_one_initcall+0x3e/0x160
[ 5718.368425]        do_init_module+0x5b/0x1fa
[ 5718.368429]        load_module+0x2374/0x2dc0
[ 5718.368432]        SyS_finit_module+0xaa/0xe0
[ 5718.368436]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 5718.368439] 
               -> #1 (&dev->struct_mutex){+.+.}:
[ 5718.368445]        __mutex_lock+0x86/0x9b0
[ 5718.368471]        i915_gem_set_caching_ioctl+0xdc/0x290 [i915]
[ 5718.368499]        i915_gem_object_get_sg+0x396/0x3c0 [i915]
[ 5718.368503]        __do_fault+0x1a/0x70
[ 5718.368506]        __handle_mm_fault+0x7c4/0xdb0
[ 5718.368509]        handle_mm_fault+0x154/0x300
[ 5718.368512]        __do_page_fault+0x2d6/0x570
[ 5718.368515]        page_fault+0x22/0x30
[ 5718.368518] 
               -> #0 (&mm->mmap_sem){++++}:
[ 5718.368525]        lock_acquire+0xaf/0x200
[ 5718.368528]        down_write+0x38/0x70
[ 5718.368557]        i915_gem_userptr_init__mmu_notifier+0x1af/0x360 [i915]
[ 5718.368582]        i915_gem_prime_export+0x6e/0x90 [i915]
[ 5718.368586]        drm_gem_prime_handle_to_fd+0x186/0x1b0
[ 5718.368589]        drm_ioctl_kernel+0x65/0xb0
[ 5718.368592]        drm_ioctl+0x295/0x340
[ 5718.368595]        do_vfs_ioctl+0x8f/0x670
[ 5718.368599]        SyS_ioctl+0x3b/0x70
[ 5718.368602]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 5718.368604] 
               other info that might help us debug this:

[ 5718.368609] Chain exists of:
                 &mm->mmap_sem --> &dev->struct_mutex --> &dev->object_name_lock

[ 5718.368616]  Possible unsafe locking scenario:

[ 5718.368620]        CPU0                    CPU1
[ 5718.368622]        ----                    ----
[ 5718.368624]   lock(&dev->object_name_lock);
[ 5718.368627]                                lock(&dev->struct_mutex);
[ 5718.368631]                                lock(&dev->object_name_lock);
[ 5718.368635]   lock(&mm->mmap_sem);
[ 5718.368637] 
                *** DEADLOCK ***

[ 5718.368642] 2 locks held by gem_userptr_bli/26317:
[ 5718.368644]  #0:  (&prime_fpriv->lock){+.+.}, at: [<ffffffff815d9135>] drm_gem_prime_handle_to_fd+0x35/0x1b0
[ 5718.368651]  #1:  (&dev->object_name_lock){+.+.}, at: [<ffffffff815d91e2>] drm_gem_prime_handle_to_fd+0xe2/0x1b0
[ 5718.368658] 
               stack backtrace:
[ 5718.368662] CPU: 7 PID: 26317 Comm: gem_userptr_bli Tainted: G     U  W       4.14.0-CI-CI_DRM_3364+ #1
[ 5718.368666] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
[ 5718.368669] Call Trace:
[ 5718.368674]  dump_stack+0x5f/0x86
[ 5718.368678]  print_circular_bug.isra.18+0x1d0/0x2c0
[ 5718.368682]  __lock_acquire+0x19c3/0x1b60
[ 5718.368687]  ? lock_acquire+0xaf/0x200
[ 5718.368690]  lock_acquire+0xaf/0x200
[ 5718.368720]  ? i915_gem_userptr_init__mmu_notifier+0x1af/0x360 [i915]
[ 5718.368724]  down_write+0x38/0x70
[ 5718.368753]  ? i915_gem_userptr_init__mmu_notifier+0x1af/0x360 [i915]
[ 5718.368782]  i915_gem_userptr_init__mmu_notifier+0x1af/0x360 [i915]
[ 5718.368808]  i915_gem_prime_export+0x6e/0x90 [i915]
[ 5718.368812]  drm_gem_prime_handle_to_fd+0x186/0x1b0
[ 5718.368816]  ? drm_prime_remove_buf_handle_locked+0x90/0x90
[ 5718.368820]  drm_ioctl_kernel+0x65/0xb0
[ 5718.368823]  drm_ioctl+0x295/0x340
[ 5718.368826]  ? drm_prime_remove_buf_handle_locked+0x90/0x90
[ 5718.368830]  ? __handle_mm_fault+0x83a/0xdb0
[ 5718.368835]  do_vfs_ioctl+0x8f/0x670
[ 5718.368838]  ? entry_SYSCALL_64_fastpath+0x5/0xb1
[ 5718.368842]  ? trace_hardirqs_on_caller+0xde/0x1c0
[ 5718.368846]  SyS_ioctl+0x3b/0x70
[ 5718.368849]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 5718.368853] RIP: 0033:0x7f29b6cfb587
[ 5718.368855] RSP: 002b:00007ffeda5a69b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 5718.368860] RAX: ffffffffffffffda RBX: ffffc900008bfff0 RCX: 00007f29b6cfb587
[ 5718.368863] RDX: 00007ffeda5a69f4 RSI: 00000000c00c642d RDI: 0000000000000005
[ 5718.368866] RBP: 0000000000000001 R08: 0000000000001000 R09: 0000000000004010
[ 5718.368870] R10: 00007f29b6fbeb58 R11: 0000000000000246 R12: 0000000000000046
[ 5718.368873] R13: 00007ffeda5a7260 R14: 0000000000000000 R15: 0000000000000000

Comment 1 Marta Löfstedt 2017-11-21 07:14:14 UTC

Note there is also bug 103730, which could be related.

Comment 2 Chris Wilson 2017-11-21 09:57:26 UTC

False chain carried over from the selftest module reloads.

Comment 3 Chris Wilson 2017-11-21 23:02:36 UTC

commit f9eb63b98c91f4cfaddf54b769f971c77da10917
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 21 11:06:52 2017 +0000

    drm/i915/selftests: Avoid drm_gem_handle_create under struct_mutex
    
    Despite us reloading the module around every selftest, the lockclasses
    persist and the chains used in selftesting may then dictate how we are
    allowed to nest locks during runtime testing. As such we have to be just
    as careful, and in particular it turns out we are not allowed to nest
    dev->object_name_lock (drm_gem_handle_create) inside dev->struct_mutex.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103830
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Matthew Auld <matthew.auld@intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20171121110652.1107-1-chris@chris-wilson.co.uk
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
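
The reasoning in the commit message above can be illustrated with a toy model. This is a minimal sketch, not kernel code and not how lockdep is implemented: `LockOrderGraph` and `replay` are hypothetical names, and the model only assumes that a validator records "A held while taking B" as an edge A -> B and rejects any acquisition that would close a cycle. The three edges are taken from the dependency chain in the log; the graph is deliberately never reset, mirroring the statement that lock classes persist across module reloads.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order validator: an edge A -> B means B was taken while A was held."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reaches(self, src, dst):
        # Iterative depth-first search: is dst reachable from src?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record taking `new` while holding `held`.

        Returns False (and records nothing) if the new edge would close a cycle.
        """
        if self._reaches(new, held):
            return False
        self.edges[held].add(new)
        return True


def replay(include_selftest_edge):
    """Replay the chain from the log; return True if the final acquisition is accepted."""
    graph = LockOrderGraph()
    # Chain entry #1: page-fault path takes struct_mutex while mmap_sem is held.
    graph.acquire("mmap_sem", "struct_mutex")
    if include_selftest_edge:
        # Chain entry #2: the selftest calls drm_gem_handle_create
        # (object_name_lock) while holding struct_mutex.
        graph.acquire("struct_mutex", "object_name_lock")
    # Chain entry #0: PRIME export of a userptr object takes mmap_sem
    # while object_name_lock is held.
    return graph.acquire("object_name_lock", "mmap_sem")


if __name__ == "__main__":
    print("with selftest edge, export accepted:", replay(True))
    print("without selftest edge, export accepted:", replay(False))
```

In this model the export path is rejected only when the selftest edge is present, which matches the description of the report as a false chain: removing the `struct_mutex` -> `object_name_lock` nesting from the selftest removes the cycle without touching the runtime paths.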

Comment 4 Marta Löfstedt 2017-11-22 07:43:13 UTC

The fix is included in CI_DRM_3370; more runs are needed before closing.

Comment 5 Marta Löfstedt 2017-11-22 07:53:14 UTC

I am assigning this to myself so that I can monitor it and close it once verified.

Comment 6 Marta Löfstedt 2017-11-22 14:06:19 UTC

Both CI_DRM_3370 and CI_DRM_3371 are fine, so I will close this. Thanks, Chris, for fixing this!

Comment 7 Chris Wilson 2017-11-23 00:16:59 UTC

*** Bug 103730 has been marked as a duplicate of this bug. ***
