https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4926/shard-iclb5/igt@kms_pipe_crc_basic@hang-read-crc-pipe-d.html <3> [938.216144] INFO: task kcompactd0:55 blocked for more than 61 seconds. <3> [938.216421] Tainted: G U 5.1.0-rc3-CI-CI_DRM_5866+ #1 <3> [938.216466] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <6> [938.216521] kcompactd0 D14040 55 2 0x80000000 <4> [938.216567] Call Trace: <4> [938.216593] ? __schedule+0x36d/0xb40 <4> [938.216624] ? wait_for_common+0x116/0x1f0 <4> [938.216654] schedule+0x2d/0x80 <4> [938.216682] schedule_timeout+0x236/0x4f0 <4> [938.216714] ? wait_for_common+0x116/0x1f0 <4> [938.216746] ? mark_held_locks+0x49/0x70 <4> [938.216775] ? _raw_spin_unlock_irq+0x24/0x50 <4> [938.216809] ? wait_for_common+0x116/0x1f0 <4> [938.216840] wait_for_common+0x13a/0x1f0 <4> [938.216872] ? wake_up_q+0x80/0x80 <4> [938.216904] __flush_work+0x22f/0x320 <4> [938.216934] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0 <4> [938.216977] ? drain_local_pages+0x30/0x30 <4> [938.217011] drain_all_pages+0x172/0x1d0 <4> [938.217044] kcompactd_do_work+0x28c/0x2d0 <4> [938.217084] ? kcompactd_do_work+0x2d0/0x2d0 <4> [938.217114] ? kcompactd+0x96/0x280 <4> [938.217174] kcompactd+0x96/0x280 <4> [938.217201] ? wait_woken+0xa0/0xa0 <4> [938.217236] ? kcompactd_do_work+0x2d0/0x2d0 <4> [938.217266] kthread+0x119/0x130 <4> [938.217292] ? kthread_park+0x80/0x80 <4> [938.217323] ret_from_fork+0x24/0x50 <4> [938.217375] Showing all locks held in the system: <4> [938.217431] 1 lock held by khungtaskd/52: <4> [938.217460] #0: 0000000024b6c7d5 (rcu_read_lock){....}, at: debug_show_all_locks+0xe/0x1a0 <4> [938.217518] 1 lock held by kcompactd0/55: <4> [938.217547] #0: 00000000ce6f3070 (pcpu_drain_mutex){+.+.}, at: drain_all_pages+0x27/0x1d0 <4> [938.217605] 1 lock held by in:imklog/565: <4> [938.217633] #0: 00000000b002c24a (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x3a/0x50 <4> [938.217689] 2 locks held by kworker/3:11/1689: <4> [938.217722] 4 locks held by dmesg/2126: <4> [938.217751] <4> [938.217765] ============================================= <4> [938.217814] NMI backtrace for cpu 2 <4> [938.217841] CPU: 2 PID: 52 Comm: khungtaskd Tainted: G U 5.1.0-rc3-CI-CI_DRM_5866+ #1 <4> [938.217900] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019 <4> [938.217980] Call Trace: <4> [938.218001] dump_stack+0x67/0x9b <4> [938.218027] nmi_cpu_backtrace+0x8a/0x90 <4> [938.218058] ? lapic_can_unplug_cpu+0x90/0x90 <4> [938.218088] nmi_trigger_cpumask_backtrace+0xb7/0xf0 <4> [938.218123] watchdog+0x376/0x640 <4> [938.218154] ? hungtask_pm_notify+0x40/0x40 <4> [938.218186] kthread+0x119/0x130 <4> [938.218213] ? kthread_park+0x80/0x80 <4> [938.218244] ret_from_fork+0x24/0x50 <6> [938.218299] Sending NMI from CPU 2 to CPUs 0-1,3-7: <4> [938.218356] NMI backtrace for cpu 0 skipped: idling at acpi_processor_ffh_cstate_enter+0x64/0xb0 <4> [938.218381] NMI backtrace for cpu 5 skipped: idling at acpi_processor_ffh_cstate_enter+0x64/0xb0 <4> [938.218383] NMI backtrace for cpu 1 skipped: idling at acpi_processor_ffh_cstate_enter+0x64/0xb0 <4> [938.218388] NMI backtrace for cpu 6 <4> [938.218389] CPU: 6 PID: 2126 Comm: dmesg Tainted: G U 5.1.0-rc3-CI-CI_DRM_5866+ #1 <4> [938.218390] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019 <4> [938.218391] RIP: 0010:get_partial_node.isra.30+0x2c5/0x460 <4> [938.218392] Code: 00 00 00 4c 89 95 40 ff ff ff 48 89 8d 48 ff ff ff 48 89 95 58 ff ff ff 4c 89 8d 78 ff ff ff e8 01 f5 ea ff f0 49 0f ba 2e 00 <4c> 8b 8d 78 ff ff ff 48 8b 95 58 ff ff ff 48 8b 8d 48 ff ff ff 4c <4> [938.218393] RSP: 0018:ffffc90000c63a70 EFLAGS: 00000096 <4> [938.218394] RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000000 <4> [938.218394] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000001 <4> [938.218395] RBP: ffffc90000c63b30 R08: 0000000094d43629 R09: ffff888455a9a788 <4> [938.218396] R10: ffffea001156a688 R11: 0000000000000000 R12: 0000000000000015 <4> [938.218396] R13: ffffea00048e1c80 R14: ffffea001156a680 R15: ffff88849a337bc0 <4> [938.218397] FS: 00007efe80d36240(0000) GS:ffff88849ff80000(0000) knlGS:0000000000000000 <4> [938.218397] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [938.218398] CR2: 00007f7089d22db0 CR3: 00000004843f8006 CR4: 0000000000760ee0 <4> [938.218398] PKRU: 55555554 <4> [938.218399] Call Trace: <4> [938.218399] ? ___slab_alloc.constprop.35+0x1aa/0x380 <4> [938.218400] ___slab_alloc.constprop.35+0x1aa/0x380 <4> [938.218400] ? jbd2__journal_start+0x79/0x270 <4> [938.218401] ? find_get_entry+0x198/0x2f0 <4> [938.218401] ? jbd2__journal_start+0x79/0x270 <4> [938.218402] ? __slab_alloc.isra.28.constprop.34+0x3d/0x70 <4> [938.218402] __slab_alloc.isra.28.constprop.34+0x3d/0x70 <4> [938.218403] ? jbd2__journal_start+0x79/0x270 <4> [938.218403] kmem_cache_alloc+0x21c/0x280 <4> [938.218404] jbd2__journal_start+0x79/0x270 <4> [938.218404] ext4_da_write_begin+0x122/0x430 <4> [938.218405] generic_perform_write+0xb4/0x1c0 <4> [938.218405] __generic_file_write_iter+0xfd/0x190 <4> [938.218406] ext4_file_write_iter+0x149/0x3d0 <4> [938.218406] new_sync_write+0x104/0x160 <4> [938.218407] vfs_write+0xbd/0x1b0 <4> [938.218407] ksys_write+0x50/0xc0 <4> [938.218408] do_syscall_64+0x55/0x190 <4> [938.218408] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [938.218409] RIP: 0033:0x7efe80215154 <4> [938.218409] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 b1 07 2e 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 41 54 55 49 89 d4 53 48 89 f5 <4> [938.218410] RSP: 002b:00007ffee59912b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 <4> [938.218411] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007efe80215154 <4> [938.218412] RDX: 0000000000000001 RSI: 000055ed54133118 RDI: 0000000000000001 <4> [938.218412] RBP: 000055ed54133118 R08: 0000000000000001 R09: 0000000000000000 <4> [938.218413] R10: 0000000000000005 R11: 0000000000000246 R12: 00007efe804f1760 <4> [938.218413] R13: 0000000000000001 R14: 00007efe804ec760 R15: 0000000000000001 <4> [938.218415] NMI backtrace for cpu 7 <4> [938.218417] CPU: 7 PID: 565 Comm: in:imklog Tainted: G U 5.1.0-rc3-CI-CI_DRM_5866+ #1 <4> [938.218418] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019 <4> [938.218418] RIP: 0010:__lock_acquire+0x43a/0x1590 <4> [938.218419] Code: c5 06 41 31 ef 44 89 f8 45 29 fd c1 c0 08 41 31 c5 8b 44 24 18 41 01 c7 44 29 e8 41 89 c6 44 89 e8 45 01 fd c1 c0 10 41 31 c6 <44> 89 e8 44 89 f5 45 29 f7 45 01 f5 c1 cd 0d 44 31 fd 41 89 ef 29 <4> [938.218420] RSP: 0018:ffffc900005c7d20 EFLAGS: 00000006 <4> [938.218422] RAX: 000000000f47069e RBX: ffff88847c4208e0 RCX: 0000000000000000 <4> [938.218423] RDX: 000000009db434f6 RSI: ffff88847c4208b8 RDI: ffff88847c420040 <4> [938.218423] RBP: 00000000e42210c2 R08: 0000000000000001 R09: 0000000000000000 <4> [938.218424] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88847c420040 <4> [938.218425] R13: 0000000092768246 R14: 000000007e9c4a9a R15: 000000008bd872ff <4> [938.218426] FS: 00007f3067000700(0000) GS:ffff88849ffc0000(0000) knlGS:0000000000000000 <4> [938.218426] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [938.218427] CR2: 00007f4c99495628 CR3: 000000047c4c0003 CR4: 0000000000760ee0 <4> [938.218427] PKRU: 55555554 <4> [938.218428] Call Trace: <4> [938.218429] lock_acquire+0xa6/0x1c0 <4> [938.218429] ? do_syslog+0x41d/0x870 <4> [938.218430] _raw_spin_lock+0x2a/0x40 <4> [938.218431] ? do_syslog+0x41d/0x870 <4> [938.218431] do_syslog+0x41d/0x870 <4> [938.218432] ? wait_woken+0xa0/0xa0 <4> [938.218432] kmsg_read+0x39/0x50 <4> [938.218433] proc_reg_read+0x34/0x60 <4> [938.218434] vfs_read+0x9e/0x150 <4> [938.218434] ksys_read+0x50/0xc0 <4> [938.218435] do_syscall_64+0x55/0x190 <4> [938.218436] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [938.218436] RIP: 0033:0x7f3069eb2384 <4> [938.218437] Code: 84 00 00 00 00 00 41 54 55 49 89 d4 53 48 89 f5 89 fb 48 83 ec 10 e8 8b fc ff ff 4c 89 e2 41 89 c0 48 89 ee 89 df 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 38 44 89 c7 48 89 44 24 08 e8 c7 fc ff ff 48 <4> [938.218438] RSP: 002b:00007f3066fdf4c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 <4> [938.218440] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f3069eb2384 <4> [938.218441] RDX: 0000000000001fa0 RSI: 00007f3066fdfd00 RDI: 0000000000000005 <4> [938.218442] RBP: 00007f3066fdfd00 R08: 0000000000000000 R09: a3d70a3d70a3d70b <4> [938.218443] R10: 2ce33e6c02ce33e7 R11: 0000000000000246 R12: 0000000000001fa0 <4> [938.218443] R13: 0000000000001fa0 R14: 0000000000001f9f R15: 00007f3066fdfd2b <4> [938.218445] NMI backtrace for cpu 4 <4> [938.218447] CPU: 4 PID: 2282 Comm: kms_pipe_crc_ba Tainted: G U 5.1.0-rc3-CI-CI_DRM_5866+ #1 <4> [938.218449] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019 <4> [938.218450] RIP: 0010:dw_readl+0x36/0x50 <4> [938.218452] Code: f6 f6 c2 02 48 8d 04 31 74 1e 66 8b 38 48 8d 44 31 02 66 8b 00 0f b7 cf c1 e0 10 09 c8 83 e2 01 89 c1 0f c9 0f 45 c1 c3 8b 00 <89> c1 83 e2 01 0f c9 0f 45 c1 c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 <4> [938.218453] RSP: 0000:ffff88849ff03e88 EFLAGS: 00000046 <4> [938.218456] RAX: 00000000ffffffff RBX: 0000000000000040 RCX: ffffc9000028d000 <4> [938.218458] RDX: 0000000000000000 RSI: 0000000000000054 RDI: ffff8884999622c0 <4> [938.218459] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 <4> [938.218461] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffeff <4> [938.218462] R13: ffff88849ff03f2c R14: 0000000000000000 R15: ffff8884999622c0 <4> [938.218464] FS: 00007fa03ac3f980(0000) GS:ffff88849ff00000(0000) knlGS:0000000000000000 <4> [938.218465] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [938.218466] CR2: 00007fa03321c83c CR3: 000000048cb62002 CR4: 0000000000760ee0 <4> [938.218468] PKRU: 55555554 <4> [938.218469] Call Trace: <4> [938.218470] <IRQ> <4> [938.218471] i2c_dw_isr+0x215/0x6d0 <4> [938.218473] __handle_irq_event_percpu+0x41/0x2d0 <4> [938.218474] ? handle_irq_event+0x27/0x50 <4> [938.218475] handle_irq_event_percpu+0x2b/0x70 <4> [938.218476] handle_irq_event+0x2f/0x50 <4> [938.218478] handle_fasteoi_irq+0x8e/0x150 <4> [938.218479] handle_irq+0x67/0x160 <4> [938.218480] do_IRQ+0x5e/0x130 <4> [938.218481] common_interrupt+0xf/0xf <4> [938.218483] </IRQ> <4> [938.218484] RIP: 0033:0x7fa03aa61713 <4> [938.218486] Code: 38 4c 8d 3c 87 4c 89 64 24 70 eb 11 0f 1f 00 49 83 c7 04 41 83 e5 01 0f 85 63 07 00 00 41 8b 07 49 89 c5 48 33 04 24 48 d1 e8 <75> e3 4c 89 fb 48 2b 5c 24 38 48 c1 fb 02 41 89 dc 89 db 48 8d 04 <4> [938.218487] RSP: 002b:00007ffc3d1c88a0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd <4> [938.218490] RAX: 0000000000000000 RBX: 00007fa03ac55400 RCX: 00000000881ad35e <4> [938.218492] RDX: 00000000000007a8 RSI: 0000000000000805 RDI: 00007fa03320eff4 <4> [938.218493] RBP: 00007fa033213130 R08: 00007fa033213130 R09: 00007fa03ac80428 <4> [938.218495] R10: 00007fa03ac55400 R11: 0000000000000000 R12: 00007fa03ac495a8 <4> [938.218496] R13: 00000000881ad35e R14: 0000000000000000 R15: 00007fa033212e8c <4> [938.218498] NMI backtrace for cpu 3 <4> [938.218499] CPU: 3 PID: 1689 Comm: kworker/3:11 Tainted: G U 5.1.0-rc3-CI-CI_DRM_5866+ #1 <4> [938.218501] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019 <4> [938.218502] Workqueue: events delayed_fput <4> [938.218505] RIP: 0010:mark_lock+0x3c/0x650 <4> [938.218507] Code: 00 00 d3 e3 48 89 f5 41 89 d4 48 83 ec 08 0f b7 46 20 66 25 ff 1f 75 2a 8b 05 98 ca 5b 02 85 c0 74 7a 31 c0 48 85 58 50 74 35 <41> be 01 00 00 00 48 83 c4 08 44 89 f0 5b 5d 41 5c 41 5d 41 5e 41 <4> [938.218508] RSP: 0018:ffffc900009fb9c0 EFLAGS: 00000002 <4> [938.218511] RAX: ffffffff82c34e00 RBX: 0000000000000040 RCX: 0000000000000006 <4> [938.218512] RDX: 0000000000000006 RSI: ffff8884897488b8 RDI: ffff888489748040 <4> [938.218514] RBP: ffff8884897488b8 R08: 00000000c30f7ceb R09: 0000000000000000 <4> [938.218515] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000006 <4> [938.218516] R13: ffff8884897488b8 R14: 0000000000000000 R15: ffff888489748040 <4> [938.218518] FS: 0000000000000000(0000) GS:ffff88849fec0000(0000) knlGS:0000000000000000 <4> [938.218519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [938.218521] CR2: 00007f708b1243c0 CR3: 0000000497aae006 CR4: 0000000000760ee0 <4> [938.218522] PKRU: 55555554 <4> [938.218523] Call Trace: <4> [938.218524] mark_held_locks+0x49/0x70 <4> [938.218526] ? _raw_spin_unlock_irqrestore+0x4c/0x60 <4> [938.218527] lockdep_hardirqs_on+0x10d/0x1b0 <4> [938.218528] _raw_spin_unlock_irqrestore+0x4c/0x60 <4> [938.218530] debug_check_no_obj_freed+0x132/0x210 <4> [938.218531] free_pcp_prepare+0x77/0x160 <4> [938.218532] free_unref_page_list+0x69/0x250 <4> [938.218534] release_pages+0x2d3/0x390 <4> [938.218535] __pagevec_release+0x1f/0x30 <4> [938.218536] shmem_undo_range+0x3ad/0x890 <4> [938.218537] shmem_truncate_range+0x11/0x30 <4> [938.218539] shmem_evict_inode+0xb8/0x1a0 <4> [938.218540] ? dput+0x20/0x2c0 <4> [938.218541] evict+0xcb/0x190 <4> [938.218542] __dentry_kill+0xca/0x190 <4> [938.218544] dentry_kill+0x4b/0x1b0
This seems to be related to the known BIOS issue. Let's monitor that!
The CI Bug Log issue associated to this bug has been updated. ### New filters associated * ICL: all tests - incomplete - NMI backtrace for cpu \d skipped: idling at acpi_processor_ffh_cstate_enter - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4926/shard-iclb5/igt@kms_pipe_crc_basic@hang-read-crc-pipe-d.html
Let's monitor the failure rate now that the following commit has landed: https://cgit.freedesktop.org/drm/drm-intel/commit/?h=topic/core-for-CI&id=b573fba52f339dc4fadef7282af4a9413fd6173d CL HACK: Disable ACPI idle drivertopic/core-for-CI There were few system hung observed while running i915_pm_rpm igt test. FDO https://bugs.freedesktop.org/show_bug.cgi?id=108840 Root cause is believed to due to page fault in ACPI idle driver. (FDO comment 18). It has been suggested by Daniel Vetter to disable ACPI idle driver for Core-for-CI, only for ICL. This hacky patch is only for ICL processor and for Core-for-CI branch. v2: Fixed compilation errors raised by lkp. commit message improvement. Cc: martin.peres@intel.com Cc: daniel.vetter@intel.com Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1554791361-27684-1-git-send-email-anshuman.gupta@intel.com
Not seen after change in topic for CI. Monitor still..
(In reply to Martin Peres from comment #3) > Let's monitor the failure rate now that the following commit has landed: > > https://cgit.freedesktop.org/drm/drm-intel/commit/?h=topic/core-for- > CI&id=b573fba52f339dc4fadef7282af4a9413fd6173d > > CL HACK: Disable ACPI idle drivertopic/core-for-CI > There were few system hung observed while running i915_pm_rpm igt test. > FDO https://bugs.freedesktop.org/show_bug.cgi?id=108840 > Root cause is believed to due to page fault in ACPI idle driver. > (FDO comment 18). > It has been suggested by Daniel Vetter to disable ACPI idle > driver for Core-for-CI, only for ICL. > > This hacky patch is only for ICL processor and for Core-for-CI branch. > > v2: Fixed compilation errors raised by lkp. > commit message improvement. > > Cc: martin.peres@intel.com > Cc: daniel.vetter@intel.com > > Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Link: > https://patchwork.freedesktop.org/patch/msgid/1554791361-27684-1-git-send- > email-anshuman.gupta@intel.com Ugh. I think this means we've effectively disabled all C-states in CI. Last I looked intel_idle icl support wasn't upstream yet.
(In reply to Ville Syrjala from comment #5) > (In reply to Martin Peres from comment #3) > > Let's monitor the failure rate now that the following commit has landed: > > > > https://cgit.freedesktop.org/drm/drm-intel/commit/?h=topic/core-for- > > CI&id=b573fba52f339dc4fadef7282af4a9413fd6173d > > > > CL HACK: Disable ACPI idle drivertopic/core-for-CI > > There were few system hung observed while running i915_pm_rpm igt test. > > FDO https://bugs.freedesktop.org/show_bug.cgi?id=108840 > > Root cause is believed to due to page fault in ACPI idle driver. > > (FDO comment 18). > > It has been suggested by Daniel Vetter to disable ACPI idle > > driver for Core-for-CI, only for ICL. > > > > This hacky patch is only for ICL processor and for Core-for-CI branch. > > > > v2: Fixed compilation errors raised by lkp. > > commit message improvement. > > > > Cc: martin.peres@intel.com > > Cc: daniel.vetter@intel.com > > > > Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Link: > > https://patchwork.freedesktop.org/patch/msgid/1554791361-27684-1-git-send- > > email-anshuman.gupta@intel.com > > Ugh. I think this means we've effectively disabled all C-states in CI. Last > I looked intel_idle icl support wasn't upstream yet. Yeah, no C-states visible on icl-u2. So low power watermarks and whatnot not getting tested at all currently.
(In reply to Ville Syrjala from comment #6) > (In reply to Ville Syrjala from comment #5) > > (In reply to Martin Peres from comment #3) > > > Let's monitor the failure rate now that the following commit has landed: > > > > > > https://cgit.freedesktop.org/drm/drm-intel/commit/?h=topic/core-for- > > > CI&id=b573fba52f339dc4fadef7282af4a9413fd6173d > > > > > > CL HACK: Disable ACPI idle drivertopic/core-for-CI > > > There were few system hung observed while running i915_pm_rpm igt test. > > > FDO https://bugs.freedesktop.org/show_bug.cgi?id=108840 > > > Root cause is believed to due to page fault in ACPI idle driver. > > > (FDO comment 18). > > > It has been suggested by Daniel Vetter to disable ACPI idle > > > driver for Core-for-CI, only for ICL. > > > > > > This hacky patch is only for ICL processor and for Core-for-CI branch. > > > > > > v2: Fixed compilation errors raised by lkp. > > > commit message improvement. > > > > > > Cc: martin.peres@intel.com > > > Cc: daniel.vetter@intel.com > > > > > > Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > Link: > > > https://patchwork.freedesktop.org/patch/msgid/1554791361-27684-1-git-send- > > > email-anshuman.gupta@intel.com > > > > Ugh. I think this means we've effectively disabled all C-states in CI. Last > > I looked intel_idle icl support wasn't upstream yet. > > Yeah, no C-states visible on icl-u2. So low power watermarks and whatnot not > getting tested at all currently. Changing the priority to highest because of the testing gap that will potentially lead to flickering screens.
Should we actually consider revert of ICL HACK: Disable ACPI idle drivertopic/core-for-CI?
According to Martin no. So now discussing with Core team to look at fixing ACPI idle.
Patch has been taken away, waiting devs to confirm next steps here.
This issue has been seen only once also. I propose to lower priority on this.
This issue seems due to kernel task kcompactd0 is blocked for more then 61 seconds. There is cicular lock dpepedency <4>[ 938.494641] ====================================================== <4>[ 938.494641] WARNING: possible circular locking dependency detected <4>[ 938.494641] 5.1.0-rc3-CI-CI_DRM_5866+ #1 Tainted: G U <4>[ 938.494642] ------------------------------------------------------ <4>[ 938.494642] kworker/3:11/1689 is trying to acquire lock: <4>[ 938.494642] 000000008b3d0788 ((console_sem).lock){-.-.}, at: down_trylock+0xa/0x30 <4>[ 938.494643] <4>[ 938.494644] but task is already holding lock: <4>[ 938.494644] 000000002522a786 (&rq->lock){-.-.}, at: try_to_wake_up+0x1e2/0x5f0 <4>[ 938.494645] <4>[ 938.494645] which lock already depends on the new lock. <4>[ 938.494646] <4>[ 938.494646] <4>[ 938.494646] the existing dependency chain (in reverse order) is: And this test hang-read-crc-pipe checks crc after doing a hang and gpu reset (not sure what is it). Can some gem expert provide their comments whether the hang and gpu reset done by this test is causing to block kcompactd0 for more then 61 seconds.
Update: For now, this issue has occurred twice CI_DRM_6145_full (1 week, 1 day old), current CI_DRM run is 6189. IGT_4926_full (2 months old), current IGT run is 5037.
Francesco and team have not been able to reproduce issue. Dropping priority.
And not seen now 1,5 weeks either. Has seen only 2 times altogether.
Same status, seen twice and not for the last 3 weeks.
I did not manage to reproduce the issue. This bug has a reproduction rate of once every (a bit less than) 2 months... shall we lower the priority?
Agreed, setting to medium and let's see when it happens again.
The CI Bug Log issue associated to this bug has been updated. ### New filters associated * ELK: igt@perf_pmu@cpu-hotplug - dmesg-warn - INFO: task kworker/u8:20:1058 blocked for more than 61 seconds - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_376/fi-elk-e7500/igt@perf_pmu@cpu-hotplug.html
Last happened two months ago. Given the reproduction rate of once every couple of months, we can't say this is gone, but at least setting priority to low.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/256.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.