Summary: | [BAT BYT] igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b/c fails suspend autoresume | ||
---|---|---|---|
Product: | DRI | Reporter: | Tvrtko Ursulin <tvrtko.ursulin> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED DUPLICATE | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | highest | CC: | intel-gfx-bugs, rjw |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | BYT | i915 features: | power/suspend-resume |
Description
Tvrtko Ursulin
2016-03-02 13:44:53 UTC
dmesg? I think igt might be lying about this not being an i915.ko bug. In both cases sshd and java refusing to freeze: [ 502.780159] kms_pipe_crc_basic: starting subtest suspend-read-crc-pipe-B [ 502.987607] PM: Syncing filesystems ... done. [ 502.991666] Freezing user space processes ... [ 523.004394] Freezing of tasks failed after 20.002 seconds (2 tasks refusing to freeze, wq_busy=0): [ 523.004418] sshd D ffff88006ccf3d50 0 6208 5428 0x00000004 [ 523.004428] ffff88006ccf3d50 ffff88006ccda580 0000000000000000 ffff88007360a580 [ 523.004436] ffff88006ccda580 ffff88006ccf3d48 ffff88006ccf3d38 ffff88006ccf4000 [ 523.004444] 0000000000000246 ffff880036b99070 ffff88006ccda580 00000000ffffffff [ 523.004451] Call Trace: [ 523.004462] [<ffffffff817bd83a>] schedule+0x3a/0x90 [ 523.004467] [<ffffffff817bdaa3>] schedule_preempt_disabled+0x13/0x20 [ 523.004472] [<ffffffff817be6f8>] mutex_lock_nested+0x158/0x3b0 [ 523.004477] [<ffffffff811db9e9>] ? pipe_write+0x39/0x470 [ 523.004482] [<ffffffff811db9e9>] pipe_write+0x39/0x470 [ 523.004487] [<ffffffff811d26d5>] __vfs_write+0xa5/0xe0 [ 523.004491] [<ffffffff811d33b4>] vfs_write+0xa4/0x190 [ 523.004496] [<ffffffff811f1b2a>] ? __fget_light+0x6a/0x90 [ 523.004500] [<ffffffff811d4254>] SyS_write+0x44/0xb0 [ 523.004506] [<ffffffff817c2e9b>] entry_SYSCALL_64_fastpath+0x16/0x73 [ 523.004512] java R running task 0 6249 6233 0x00000004 [ 523.004520] 0000000000000020 0000000000000002 ffff88007360a580 ffff88007360a5d0 [ 523.004528] 0000000000000001 0000000000000002 ffff88007360add0 ffff88007360a580 [ 523.004535] 0000000000000005 0000000000000081 ffff880073617b80 0000000000000292 [ 523.004543] Call Trace: [ 523.004549] [<ffffffff81191e79>] ? handle_mm_fault+0x609/0x1b40 [ 523.004555] [<ffffffff810cd09b>] ? lock_acquire+0xdb/0x1f0 [ 523.004560] [<ffffffff810c798d>] ? lock_is_held+0x2d/0x50 [ 523.004565] [<ffffffff817c2226>] ? _raw_spin_lock+0x36/0x50 [ 523.004570] [<ffffffff81191956>] ? handle_mm_fault+0xe6/0x1b40 [ 523.004574] [<ffffffff811918b8>] ? handle_mm_fault+0x48/0x1b40 [ 523.004578] [<ffffffff810cd09b>] ? lock_acquire+0xdb/0x1f0 [ 523.004583] [<ffffffff810ca1f1>] ? mark_held_locks+0x71/0x90 [ 523.004589] [<ffffffff810482f6>] ? __do_page_fault+0x176/0x480 [ 523.004595] [<ffffffff81000ff7>] ? trace_hardirqs_on_thunk+0x17/0x19 [ 523.004600] [<ffffffff8104860c>] ? do_page_fault+0xc/0x10 [ 523.004604] [<ffffffff817c4bf2>] ? page_fault+0x22/0x30 [ 523.004611] [<ffffffff8140fdfc>] ? copy_page_to_iter+0x10c/0x420 [ 523.004616] [<ffffffff811dbeee>] ? pipe_read+0xce/0x340 [ 523.004620] [<ffffffff811d2602>] ? __vfs_read+0xa2/0xd0 [ 523.004625] [<ffffffff811d3261>] ? vfs_read+0x81/0x130 [ 523.004629] [<ffffffff811d41a4>] ? SyS_read+0x44/0xb0 [ 523.004634] [<ffffffff817c2e9b>] ? entry_SYSCALL_64_fastpath+0x16/0x73 [ 523.004646] Restarting tasks ... done. [ 523.027674] kms_pipe_crc_basic: exiting, ret=99 [ 451.530492] kms_pipe_crc_basic: starting subtest suspend-read-crc-pipe-C [ 451.671555] PM: Syncing filesystems ... done. [ 451.676505] Freezing user space processes ... [ 471.693064] Freezing of tasks failed after 20.006 seconds (2 tasks refusing to freeze, wq_busy=0): [ 471.693087] sshd D ffff88006ccf3d50 0 6208 5428 0x00000004 [ 471.693097] ffff88006ccf3d50 ffff88006ccda580 0000000000000000 ffff88007360a580 [ 471.693105] ffff88006ccda580 ffff88006ccf3d48 ffff88006ccf3d38 ffff88006ccf4000 [ 471.693112] 0000000000000246 ffff880036b99070 ffff88006ccda580 00000000ffffffff [ 471.693120] Call Trace: [ 471.693131] [<ffffffff817bd83a>] schedule+0x3a/0x90 [ 471.693136] [<ffffffff817bdaa3>] schedule_preempt_disabled+0x13/0x20 [ 471.693141] [<ffffffff817be6f8>] mutex_lock_nested+0x158/0x3b0 [ 471.693146] [<ffffffff811db9e9>] ? pipe_write+0x39/0x470 [ 471.693150] [<ffffffff811db9e9>] pipe_write+0x39/0x470 [ 471.693155] [<ffffffff811d26d5>] __vfs_write+0xa5/0xe0 [ 471.693160] [<ffffffff811d33b4>] vfs_write+0xa4/0x190 [ 471.693164] [<ffffffff811f1b2a>] ? __fget_light+0x6a/0x90 [ 471.693169] [<ffffffff811d4254>] SyS_write+0x44/0xb0 [ 471.693174] [<ffffffff817c2e9b>] entry_SYSCALL_64_fastpath+0x16/0x73 [ 471.693180] java R running task 0 6249 6233 0x00000004 [ 471.693188] ffffffff817c39ef 0000000000000020 0000000000000002 ffff880073617ad8 [ 471.693196] ffffffff810ca334 0000000000000282 0000000000000001 ffff88007360add0 [ 471.693204] ffff88007360a580 0000000000000000 0000000000000081 0000000000000001 [ 471.693211] Call Trace: [ 471.693216] [<ffffffff817c39ef>] ? retint_kernel+0x2d/0x2d [ 471.693222] [<ffffffff810ca334>] ? trace_hardirqs_on_caller+0x124/0x1b0 [ 471.693228] [<ffffffff81191e79>] ? handle_mm_fault+0x609/0x1b40 [ 471.693234] [<ffffffff810e0725>] ? rcu_read_lock_held+0x55/0x60 [ 471.693239] [<ffffffff81191e79>] ? handle_mm_fault+0x609/0x1b40 [ 471.693243] [<ffffffff811918b8>] ? handle_mm_fault+0x48/0x1b40 [ 471.693247] [<ffffffff810cd09b>] ? lock_acquire+0xdb/0x1f0 [ 471.693252] [<ffffffff810ca1f1>] ? mark_held_locks+0x71/0x90 [ 471.693258] [<ffffffff810482f6>] ? __do_page_fault+0x176/0x480 [ 471.693263] [<ffffffff8104860c>] ? do_page_fault+0xc/0x10 [ 471.693267] [<ffffffff817c39ef>] ? retint_kernel+0x2d/0x2d [ 471.693274] [<ffffffff8140fdfc>] ? copy_page_to_iter+0x10c/0x420 [ 471.693279] [<ffffffff811dbeee>] ? pipe_read+0xce/0x340 [ 471.693283] [<ffffffff811d2602>] ? __vfs_read+0xa2/0xd0 [ 471.693287] [<ffffffff811d3261>] ? vfs_read+0x81/0x130 [ 471.693292] [<ffffffff811d41a4>] ? SyS_read+0x44/0xb0 [ 471.693296] [<ffffffff817c2e9b>] ? entry_SYSCALL_64_fastpath+0x16/0x73 [ 471.693308] Restarting tasks ... done. So the sshd side is just __pipe_lock() not being interruptible, but that leaves us with the question why did java get stuck handing the pagefault in the pipe reader? I don't suppose there is any way to get a reliable stack trace? Your guess is as good as mine. Bug in the suspend process? Something in memory management is half-in-process-of-suspend so java trying to read from stdin into its, presumably bog standard memory buffer, hangs? Could it be reading into something much more advanced than a ordinary memory buffer from stdin? My imagination fails me. Time to bring in memory management heavy weights? Or a PM expert, looks like we got one in house. Rafael, could you please glance over these traces - how does it look to you? Tvrtko, is there any reason this should be highest/critical? It doesn't look like our bug to me. Think I agree, I just filed it as such since it was found in, and was affecting CI and I thought highest/critical was the BKM was BAT/CI issues. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.