Bug 104693 - [BAT][SKL GUC only] igt@debugfs_test@read_all_entries - dmesg-warn - WARNING: possible circular locking dependency detected
Summary: [BAT][SKL GUC only] igt@debugfs_test@read_all_entries - dmesg-warn - WARNING:...
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: DRI git
Hardware: Other All
: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-19 07:05 UTC by Marta Löfstedt
Modified: 2018-03-02 16:06 UTC (History)
2 users (show)

See Also:
i915 platform: SKL
i915 features: firmware/guc


Attachments

Description Marta Löfstedt 2018-01-19 07:05:21 UTC
Starting at CI_DRM_3648 on the SKL GUC machine, lockdep:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3648/fi-skl-guc/igt@debugfs_test@read_all_entries.html

[   30.062893] ======================================================
[   30.062894] WARNING: possible circular locking dependency detected
[   30.062895] 4.15.0-rc8-CI-CI_DRM_3648+ #1 Tainted: G     U          
[   30.062896] ------------------------------------------------------
[   30.062897] debugfs_test/1268 is trying to acquire lock:
[   30.062898]  (&dev->struct_mutex){+.+.}, at: [<00000000e4213449>] i915_mutex_lock_interruptible+0x47/0x130 [i915]
[   30.062921] 
               but task is already holding lock:
[   30.062921]  (&mm->mmap_sem){++++}, at: [<00000000dd7adc93>] __do_page_fault+0x106/0x560
[   30.062924] 
               which lock already depends on the new lock.

[   30.062925] 
               the existing dependency chain (in reverse order) is:
[   30.062926] 
               -> #3 (&mm->mmap_sem){++++}:
[   30.062930]        _copy_to_user+0x1e/0x70
[   30.062932]        filldir+0x8c/0xf0
[   30.062934]        dcache_readdir+0xeb/0x160
[   30.062935]        iterate_dir+0xdc/0x140
[   30.062936]        SyS_getdents+0xa0/0x130
[   30.062938]        entry_SYSCALL_64_fastpath+0x22/0x8f
[   30.062939] 
               -> #2 (&sb->s_type->i_mutex_key#3){++++}:
[   30.062942]        start_creating+0x59/0x110
[   30.062944]        __debugfs_create_file+0x2e/0xe0
[   30.062946]        relay_create_buf_file+0x62/0x80
[   30.062947]        relay_late_setup_files+0x84/0x250
[   30.062967]        guc_log_late_setup+0x52/0x110 [i915]
[   30.062985]        i915_guc_log_register+0x3a/0x60 [i915]
[   30.062997]        i915_driver_load+0x7b6/0x1720 [i915]
[   30.063014]        i915_pci_probe+0x2e/0x90 [i915]
[   30.063017]        pci_device_probe+0x9c/0x120
[   30.063018]        driver_probe_device+0x2a3/0x480
[   30.063020]        __driver_attach+0xd9/0xe0
[   30.063021]        bus_for_each_dev+0x57/0x90
[   30.063022]        bus_add_driver+0x168/0x260
[   30.063023]        driver_register+0x52/0xc0
[   30.063024]        do_one_initcall+0x39/0x150
[   30.063025]        do_init_module+0x56/0x1ef
[   30.063026]        load_module+0x231c/0x2d70
[   30.063027]        SyS_finit_module+0xa5/0xe0
[   30.063028]        do_syscall_64+0x59/0x1a0
[   30.063030]        return_from_SYSCALL_64+0x0/0x75
[   30.063030] 
               -> #1 (relay_channels_mutex){+.+.}:
[   30.063034]        relay_open+0x12c/0x2b0
[   30.063051]        guc_log_runtime_create+0xa0/0x220 [i915]
[   30.063067]        intel_guc_log_create+0xec/0x1c0 [i915]
[   30.063083]        intel_guc_init+0x5d/0x100 [i915]
[   30.063100]        intel_uc_init+0x29/0xa0 [i915]
[   30.063116]        i915_gem_init+0x18a/0x540 [i915]
[   30.063128]        i915_driver_load+0xaa9/0x1720 [i915]
[   30.063140]        i915_pci_probe+0x2e/0x90 [i915]
[   30.063141]        pci_device_probe+0x9c/0x120
[   30.063143]        driver_probe_device+0x2a3/0x480
[   30.063144]        __driver_attach+0xd9/0xe0
[   30.063145]        bus_for_each_dev+0x57/0x90
[   30.063146]        bus_add_driver+0x168/0x260
[   30.063147]        driver_register+0x52/0xc0
[   30.063148]        do_one_initcall+0x39/0x150
[   30.063149]        do_init_module+0x56/0x1ef
[   30.063150]        load_module+0x231c/0x2d70
[   30.063151]        SyS_finit_module+0xa5/0xe0
[   30.063152]        do_syscall_64+0x59/0x1a0
[   30.063153]        return_from_SYSCALL_64+0x0/0x75
[   30.063154] 
               -> #0 (&dev->struct_mutex){+.+.}:
[   30.063156]        __mutex_lock+0x81/0x9b0
[   30.063172]        i915_mutex_lock_interruptible+0x47/0x130 [i915]
[   30.063187]        i915_gem_fault+0x201/0x790 [i915]
[   30.063190]        __do_fault+0x15/0x70
[   30.063191]        __handle_mm_fault+0x677/0xdc0
[   30.063193]        handle_mm_fault+0x14f/0x2f0
[   30.063194]        __do_page_fault+0x2d1/0x560
[   30.063195]        page_fault+0x4c/0x60
[   30.063196] 
               other info that might help us debug this:

[   30.063197] Chain exists of:
                 &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem

[   30.063200]  Possible unsafe locking scenario:

[   30.063201]        CPU0                    CPU1
[   30.063201]        ----                    ----
[   30.063202]   lock(&mm->mmap_sem);
[   30.063203]                                lock(&sb->s_type->i_mutex_key#3);
[   30.063205]                                lock(&mm->mmap_sem);
[   30.063206]   lock(&dev->struct_mutex);
[   30.063207] 
                *** DEADLOCK ***

[   30.063208] 1 lock held by debugfs_test/1268:
[   30.063209]  #0:  (&mm->mmap_sem){++++}, at: [<00000000dd7adc93>] __do_page_fault+0x106/0x560
[   30.063211] 
               stack backtrace:
[   30.063213] CPU: 4 PID: 1268 Comm: debugfs_test Tainted: G     U           4.15.0-rc8-CI-CI_DRM_3648+ #1
[   30.063214] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
[   30.063214] Call Trace:
[   30.063216]  dump_stack+0x5f/0x86
[   30.063219]  print_circular_bug.isra.18+0x1d0/0x2c0
[   30.063221]  __lock_acquire+0x14ae/0x1b60
[   30.063224]  ? lock_acquire+0xaf/0x200
[   30.063226]  lock_acquire+0xaf/0x200
[   30.063240]  ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
[   30.063242]  __mutex_lock+0x81/0x9b0
[   30.063256]  ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
[   30.063270]  ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
[   30.063285]  ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
[   30.063298]  i915_mutex_lock_interruptible+0x47/0x130 [i915]
[   30.063300]  ? __pm_runtime_resume+0x4f/0x80
[   30.063315]  i915_gem_fault+0x201/0x790 [i915]
[   30.063317]  __do_fault+0x15/0x70
[   30.063318]  ? _raw_spin_unlock+0x29/0x40
[   30.063320]  __handle_mm_fault+0x677/0xdc0
[   30.063323]  handle_mm_fault+0x14f/0x2f0
[   30.063324]  __do_page_fault+0x2d1/0x560
[   30.063326]  ? page_fault+0x36/0x60
[   30.063327]  page_fault+0x4c/0x60
[   30.063328] RIP: 0033:0x7f6675e9ae4f
[   30.063329] RSP: 002b:00007ffc3ed48868 EFLAGS: 00010283
Comment 1 Marta Löfstedt 2018-01-19 07:06:31 UTC
Note 2 GUC related patches was integrated to CI_DRM_3648


commit 35fe703c3161ac9e723c845d425814a02cfda140
Author: Michal Wajdeczko <michal.wajdeczko@intel.com>
Date:   Thu Jan 11 15:24:41 2018 +0000

    drm/i915/guc: Change values for i915_guc_log_control

commit 0ed8795353265259b3c891d0626fed5f257dc527
Author: Michal Wajdeczko <michal.wajdeczko@intel.com>
Date:   Thu Jan 11 15:24:40 2018 +0000

    drm/i915/guc: Redefine guc_log_level modparam values
Comment 2 Marta Löfstedt 2018-01-19 07:14:35 UTC
Also see bug 104694
Comment 3 Marta Löfstedt 2018-01-19 07:48:51 UTC
Also, see bug 104695
Comment 4 Chris Wilson 2018-01-19 21:23:32 UTC
commit bd724318b682587ad2f989ab8e0f7b3d4486ced5
Author: Michal Wajdeczko <michal.wajdeczko@intel.com>
Date:   Fri Jan 19 12:49:26 2018 +0000

    drm/i915/guc: Keep GuC log disabled by default
    
    It looks that GuC log functionality is not fully functional yet and
    causes issues when enabled by auto(-1) modparam on debug builds.
    For example, but not limited to:
    
    [   30.062893] ======================================================
    [   30.062894] WARNING: possible circular locking dependency detected
    [   30.062895] 4.15.0-rc8-CI-CI_DRM_3648+ #1 Tainted: G     U
    [   30.062896] ------------------------------------------------------
    [   30.062897] debugfs_test/1268 is trying to acquire lock:
    [   30.062898]  (&dev->struct_mutex){+.+.}, at: [<00000000e4213449>] i915_mutex_lock_interruptible+0x47/0x130 [i915]
    [   30.062921]
                   but task is already holding lock:
    [   30.062921]  (&mm->mmap_sem){++++}, at: [<00000000dd7adc93>] __do_page_fault+0x106/0x560
    [   30.062924]
                   which lock already depends on the new lock.
    
    References: 0ed87953532652 ("drm/i915/guc: Redefine guc_log_level modparam values")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104693
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104694
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104695
    Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Jani Saarinen <jani.saarinen@intel.com>
    Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
    Cc: Marta Lofstedt <marta.lofstedt@intel.com>
    Cc: Michal Winiarski <michal.winiarski@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180119124926.29844-1-michal.wajdeczko@intel.com
    Reviewed-by: Michal Winiarski <michal.winiarski@intel.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 5 Marta Löfstedt 2018-01-22 09:16:17 UTC
This fix is hiding the issue since CI_DRM_3662. I can't close the bug, since I know that the issue is still there. However, I will archive from cibuglog.
Comment 6 Sagar Kamble 2018-01-25 00:22:54 UTC
Hi Marta,

Fix has been merged for this today
https://patchwork.freedesktop.org/patch/200407/

CI BAT also confirmed resolution with GuC log enabled by default.
But change to enable GuC log by default is gated by https://bugs.freedesktop.org/show_bug.cgi?id=104694 and will be addressed next.

Can we mark this bug as CLOSED now.

Thanks,
Sagar


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.