On CI_DRM_3150 on cfl igt@gem_exec_suspend@basic-s3 gives lock dep error and after that link training failure: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3150/fi-cfl-s/igt@gem_exec_suspend@basic-s3.html Relation to bug 102505 where similar issues seen.
Results for igt@gem_exec_suspend@basic-s3 Result: dmesg-warn Out IGT-Version: 1.19-g2885b10f (x86_64) (Linux: 4.14.0-rc2-CI-CI_DRM_3150+ x86_64) rtcwake: wakeup from "mem" using /dev/rtc0 at Thu Sep 28 13:52:19 2017 Subtest basic-S3: SUCCESS (2.859s) Environment PIGLIT_SOURCE_DIR="/opt/igt/piglit" PIGLIT_PLATFORM="mixed_glx_egl" Command /opt/igt/libexec/intel-gpu-tools/gem_exec_suspend --run-subtest basic-S3 Dmesg [ 216.484624] Setting dangerous option reset - tainting kernel [ 217.297601] Suspending console(s) (use no_console_suspend to debug) [ 217.636166] sd 3:0:0:0: [sda] Synchronizing SCSI cache [ 217.640116] sd 3:0:0:0: [sda] Stopping disk [ 218.081622] IRQ fixup: irq 1 move in progress, old vector 49 [ 218.082967] ====================================================== [ 218.082967] WARNING: possible circular locking dependency detected [ 218.082968] 4.14.0-rc2-CI-CI_DRM_3150+ #1 Tainted: G U W [ 218.082969] ------------------------------------------------------ [ 218.082969] rtcwake/3282 is trying to acquire lock: [ 218.082970] ((complete)&st->done){+.+.}, at: [<ffffffff81909e0d>] wait_for_completion+0x1d/0x20 [ 218.082973] but task is already holding lock: [ 218.082974] (sparse_irq_lock){+.+.}, at: [<ffffffff810f22e7>] irq_lock_sparse+0x17/0x20 [ 218.082976] which lock already depends on the new lock. [ 218.082976] the existing dependency chain (in reverse order) is: [ 218.082976] -> #2 (sparse_irq_lock){+.+.}: [ 218.082979] __lock_acquire+0x1420/0x15e0 [ 218.082980] lock_acquire+0xb0/0x200 [ 218.082981] __mutex_lock+0x86/0x9b0 [ 218.082981] mutex_lock_nested+0x1b/0x20 [ 218.082982] irq_lock_sparse+0x17/0x20 [ 218.082983] irq_affinity_online_cpu+0x18/0xd0 [ 218.082984] cpuhp_invoke_callback+0xa3/0x840 [ 218.082985] cpuhp_up_callbacks+0x36/0xc0 [ 218.082986] cpuhp_thread_fun+0x142/0x150 [ 218.082987] smpboot_thread_fn+0x18a/0x280 [ 218.082988] kthread+0x152/0x190 [ 218.082989] ret_from_fork+0x27/0x40 [ 218.082989] -> #1 (cpuhp_state){+.+.}: [ 218.082991] cpuhp_thread_fun+0x54/0x150 [ 218.082991] smpboot_thread_fn+0x18a/0x280 [ 218.082992] kthread+0x152/0x190 [ 218.082993] ret_from_fork+0x27/0x40 [ 218.082993] 0xffffffffffffffff [ 218.082993] -> #0 ((complete)&st->done){+.+.}: [ 218.082995] check_prev_add+0x430/0x840 [ 218.082996] __lock_acquire+0x1420/0x15e0 [ 218.082997] lock_acquire+0xb0/0x200 [ 218.082997] wait_for_common+0x58/0x210 [ 218.082998] wait_for_completion+0x1d/0x20 [ 218.082999] takedown_cpu+0x89/0xf0 [ 218.082999] cpuhp_invoke_callback+0xa3/0x840 [ 218.083000] cpuhp_down_callbacks+0x42/0x80 [ 218.083001] _cpu_down+0xb9/0xf0 [ 218.083001] freeze_secondary_cpus+0xa3/0x390 [ 218.083002] suspend_devices_and_enter+0x2fd/0xce0 [ 218.083003] pm_suspend+0x4f0/0x9d0 [ 218.083003] state_store+0x82/0xf0 [ 218.083004] kobj_attr_store+0xf/0x20 [ 218.083006] sysfs_kf_write+0x45/0x60 [ 218.083007] kernfs_fop_write+0x124/0x1c0 [ 218.083007] __vfs_write+0x28/0x130 [ 218.083008] vfs_write+0xcb/0x1c0 [ 218.083009] SyS_write+0x49/0xb0 [ 218.083010] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 218.083010] other info that might help us debug this: [ 218.083010] Chain exists of: (complete)&st->done --> cpuhp_state --> sparse_irq_lock [ 218.083012] Possible unsafe locking scenario: [ 218.083012] CPU0 CPU1 [ 218.083012] ---- ---- [ 218.083012] lock(sparse_irq_lock); [ 218.083013] lock(cpuhp_state); [ 218.083013] lock(sparse_irq_lock); [ 218.083014] lock((complete)&st->done); [ 218.083015] *** DEADLOCK *** [ 218.083015] 8 locks held by rtcwake/3282: [ 218.083015] #0: (sb_writers#5){.+.+}, at: [<ffffffff81220701>] vfs_write+0x171/0x1c0 [ 218.083017] #1: (&of->mutex){+.+.}, at: [<ffffffff812a39d2>] kernfs_fop_write+0xf2/0x1c0 [ 218.083019] #2: (kn->count#214){.+.+}, at: [<ffffffff812a39db>] kernfs_fop_write+0xfb/0x1c0 [ 218.083021] #3: (pm_mutex){+.+.}, at: [<ffffffff810e60a9>] pm_suspend+0xa9/0x9d0 [ 218.083023] #4: (acpi_scan_lock){+.+.}, at: [<ffffffff8153b8b7>] acpi_scan_lock_acquire+0x17/0x20 [ 218.083025] #5: (cpu_add_remove_lock){+.+.}, at: [<ffffffff810811ce>] freeze_secondary_cpus+0x2e/0x390 [ 218.083027] #6: (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff810d676b>] percpu_down_write+0x2b/0x110 [ 218.083028] #7: (sparse_irq_lock){+.+.}, at: [<ffffffff810f22e7>] irq_lock_sparse+0x17/0x20 [ 218.083030] stack backtrace: [ 218.083031] CPU: 5 PID: 3282 Comm: rtcwake Tainted: G U W 4.14.0-rc2-CI-CI_DRM_3150+ #1 [ 218.083032] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 [ 218.083032] Call Trace: [ 218.083034] dump_stack+0x68/0x9f [ 218.083035] print_circular_bug+0x235/0x3c0 [ 218.083036] ? lockdep_init_map_crosslock+0x20/0x20 [ 218.083037] check_prev_add+0x430/0x840 [ 218.083039] __lock_acquire+0x1420/0x15e0 [ 218.083040] ? __lock_acquire+0x1420/0x15e0 [ 218.083041] ? lockdep_init_map_crosslock+0x20/0x20 [ 218.083042] lock_acquire+0xb0/0x200 [ 218.083042] ? wait_for_completion+0x1d/0x20 [ 218.083043] wait_for_common+0x58/0x210 [ 218.083044] ? wait_for_completion+0x1d/0x20 [ 218.083045] ? cpuhp_invoke_callback+0x840/0x840 [ 218.083046] ? stop_machine_cpuslocked+0xc1/0xd0 [ 218.083047] ? cpuhp_invoke_callback+0x840/0x840 [ 218.083048] wait_for_completion+0x1d/0x20 [ 218.083048] takedown_cpu+0x89/0xf0 [ 218.083049] ? cpuhp_complete_idle_dead+0x20/0x20 [ 218.083050] cpuhp_invoke_callback+0xa3/0x840 [ 218.083051] cpuhp_down_callbacks+0x42/0x80 [ 218.083052] _cpu_down+0xb9/0xf0 [ 218.083053] freeze_secondary_cpus+0xa3/0x390 [ 218.083054] suspend_devices_and_enter+0x2fd/0xce0 [ 218.083055] pm_suspend+0x4f0/0x9d0 [ 218.083056] state_store+0x82/0xf0 [ 218.083057] kobj_attr_store+0xf/0x20 [ 218.083058] sysfs_kf_write+0x45/0x60 [ 218.083059] kernfs_fop_write+0x124/0x1c0 [ 218.083060] __vfs_write+0x28/0x130 [ 218.083061] ? rcu_read_lock_sched_held+0x7a/0x90 [ 218.083062] ? rcu_sync_lockdep_assert+0x2f/0x60 [ 218.083063] ? __sb_start_write+0x108/0x200 [ 218.083063] vfs_write+0xcb/0x1c0 [ 218.083064] SyS_write+0x49/0xb0 [ 218.083065] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 218.083066] RIP: 0033:0x7f1827f6d8f0 [ 218.083067] RSP: 002b:00007ffca6397f18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 218.083067] RAX: ffffffffffffffda RBX: ffffffff81493023 RCX: 00007f1827f6d8f0 [ 218.083068] RDX: 0000000000000004 RSI: 000055b4720e4060 RDI: 0000000000000007 [ 218.083068] RBP: ffffc9000155bf88 R08: 000055b4720e1dc0 R09: 00007f1828441700 [ 218.083069] R10: 00007f1828236b58 R11: 0000000000000246 R12: 000055b4720e1ce0 [ 218.083069] R13: 0000000000000001 R14: 0000000000000004 R15: 0000000000000004 [ 218.083071] ? __this_cpu_preempt_check+0x13/0x20 [ 218.200997] IRQ 121: no longer affine to CPU8 [ 218.233686] IRQ 123: no longer affine to CPU10 [ 218.242592] IRQ 1: no longer affine to CPU11 [ 218.242599] IRQ 8: no longer affine to CPU11 [ 218.242602] IRQ 9: no longer affine to CPU11 [ 218.242610] IRQ 120: no longer affine to CPU11 [ 218.251026] cache: parent cpu1 should not be sleeping [ 218.252373] cache: parent cpu2 should not be sleeping [ 218.253665] cache: parent cpu3 should not be sleeping [ 218.255003] cache: parent cpu4 should not be sleeping [ 218.256243] cache: parent cpu5 should not be sleeping [ 218.257730] cache: parent cpu6 should not be sleeping [ 218.259156] cache: parent cpu7 should not be sleeping [ 218.260569] cache: parent cpu8 should not be sleeping [ 218.262029] cache: parent cpu9 should not be sleeping [ 218.263492] cache: parent cpu10 should not be sleeping [ 218.265008] cache: parent cpu11 should not be sleeping [ 218.446140] sd 3:0:0:0: [sda] Starting disk [ 218.668651] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training [ 218.684828] atkbd serio0: Failed to deactivate keyboard on isa0060/serio0 [ 219.060840] atkbd serio0: Failed to enable keyboard on isa0060/serio0 [ 219.354472] Setting dangerous option reset - tainting kernel [ 219.428162] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training
The lockdep is gone from: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3159/fi-cfl-s/igt@gem_exec_suspend@basic-s3.html i.e. first test with 4.14.0-rc3. However, the link training issue remain. [ 242.461748] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training [ 242.476990] atkbd serio0: Failed to deactivate keyboard on isa0060/serio0 [ 242.853071] atkbd serio0: Failed to enable keyboard on isa0060/serio0 [ 243.461670] Setting dangerous option reset - tainting kernel [ 243.546478] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training
*** This bug has been marked as a duplicate of bug 103186 ***
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.