Bug 94370

Summary: [BAT BYT] igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b/c fails suspend autoresume
Product: DRI Reporter: Tvrtko Ursulin <tvrtko.ursulin>
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: highest CC: intel-gfx-bugs, rjw
Version: DRI git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: BYT i915 features: power/suspend-resume

Description Tvrtko Ursulin 2016-03-02 13:44:53 UTC
/archive/results/CI_IGT_test/Patchwork_1513/

Results for igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b
Overview

Result: fail

Back to summary
Details
Detail 	Value
Returncode 	99
Time 	0:00:20.402004
Stdout 	

rtcwake: wakeup from "mem" using /dev/rtc0 at Wed Mar  2 12:44:58 2016
rtcwake: wakeup from "mem" using /dev/rtc0 at Wed Mar  2 12:44:58 2016
IGT-Version: 1.14-ga3de0bc (x86_64) (Linux: 4.5.0-rc6-gfxbench+ x86_64)
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [igt_system_suspend_autoresume+0x9f]
  #2 [__real_main232+0x381]
  #3 [main+0x23]
  #4 [__libc_start_main+0xf0]
  #5 [_start+0x29]
Subtest suspend-read-crc-pipe-B: FAIL (20.226s)

Stderr 	

rtcwake: write error
(kms_pipe_crc_basic:7065) igt-aux-CRITICAL: Test assertion failure function igt_system_suspend_autoresume, file igt_aux.c:381:
(kms_pipe_crc_basic:7065) igt-aux-CRITICAL: Failed assertion: ret == 0
(kms_pipe_crc_basic:7065) igt-aux-CRITICAL: This failure means that something is wrong with the rtcwake tool or how your distro is set up. This is not a i915.ko or i-g-t bug.
Subtest suspend-read-crc-pipe-B failed.
**** DEBUG ****
(kms_pipe_crc_basic:7065) igt-core-DEBUG: Test requirement passed: !igt_run_in_simulation()
(kms_pipe_crc_basic:7065) igt-aux-DEBUG: Test requirement passed: !(system("rtcwake -n -s 30 -m mem") != 0)
(kms_pipe_crc_basic:7065) igt-aux-CRITICAL: Test assertion failure function igt_system_suspend_autoresume, file igt_aux.c:381:
(kms_pipe_crc_basic:7065) igt-aux-CRITICAL: Failed assertion: ret == 0
(kms_pipe_crc_basic:7065) igt-aux-CRITICAL: This failure means that something is wrong with the rtcwake tool or how your distro is set up. This is not a i915.ko or i-g-t bug.
****  END  ****

Environment 	

PIGLIT_SOURCE_DIR="/opt/igt/piglit" PIGLIT_PLATFORM="mixed_glx_egl"

Command 	/opt/igt/tests/kms_pipe_crc_basic --run-subtest suspend-read-crc-pipe-B
dmesg
Comment 1 Chris Wilson 2016-03-02 13:56:46 UTC
dmesg? I think igt might be lying about this not being an i915.ko bug.
Comment 2 Tvrtko Ursulin 2016-03-02 14:00:53 UTC
In both cases sshd and java refusing to freeze:

[  502.780159] kms_pipe_crc_basic: starting subtest suspend-read-crc-pipe-B
[  502.987607] PM: Syncing filesystems ... done.
[  502.991666] Freezing user space processes ... 
[  523.004394] Freezing of tasks failed after 20.002 seconds (2 tasks refusing to freeze, wq_busy=0):
[  523.004418] sshd            D ffff88006ccf3d50     0  6208   5428 0x00000004
[  523.004428]  ffff88006ccf3d50 ffff88006ccda580 0000000000000000 ffff88007360a580
[  523.004436]  ffff88006ccda580 ffff88006ccf3d48 ffff88006ccf3d38 ffff88006ccf4000
[  523.004444]  0000000000000246 ffff880036b99070 ffff88006ccda580 00000000ffffffff
[  523.004451] Call Trace:
[  523.004462]  [<ffffffff817bd83a>] schedule+0x3a/0x90
[  523.004467]  [<ffffffff817bdaa3>] schedule_preempt_disabled+0x13/0x20
[  523.004472]  [<ffffffff817be6f8>] mutex_lock_nested+0x158/0x3b0
[  523.004477]  [<ffffffff811db9e9>] ? pipe_write+0x39/0x470
[  523.004482]  [<ffffffff811db9e9>] pipe_write+0x39/0x470
[  523.004487]  [<ffffffff811d26d5>] __vfs_write+0xa5/0xe0
[  523.004491]  [<ffffffff811d33b4>] vfs_write+0xa4/0x190
[  523.004496]  [<ffffffff811f1b2a>] ? __fget_light+0x6a/0x90
[  523.004500]  [<ffffffff811d4254>] SyS_write+0x44/0xb0
[  523.004506]  [<ffffffff817c2e9b>] entry_SYSCALL_64_fastpath+0x16/0x73
[  523.004512] java            R  running task        0  6249   6233 0x00000004
[  523.004520]  0000000000000020 0000000000000002 ffff88007360a580 ffff88007360a5d0
[  523.004528]  0000000000000001 0000000000000002 ffff88007360add0 ffff88007360a580
[  523.004535]  0000000000000005 0000000000000081 ffff880073617b80 0000000000000292
[  523.004543] Call Trace:
[  523.004549]  [<ffffffff81191e79>] ? handle_mm_fault+0x609/0x1b40
[  523.004555]  [<ffffffff810cd09b>] ? lock_acquire+0xdb/0x1f0
[  523.004560]  [<ffffffff810c798d>] ? lock_is_held+0x2d/0x50
[  523.004565]  [<ffffffff817c2226>] ? _raw_spin_lock+0x36/0x50
[  523.004570]  [<ffffffff81191956>] ? handle_mm_fault+0xe6/0x1b40
[  523.004574]  [<ffffffff811918b8>] ? handle_mm_fault+0x48/0x1b40
[  523.004578]  [<ffffffff810cd09b>] ? lock_acquire+0xdb/0x1f0
[  523.004583]  [<ffffffff810ca1f1>] ? mark_held_locks+0x71/0x90
[  523.004589]  [<ffffffff810482f6>] ? __do_page_fault+0x176/0x480
[  523.004595]  [<ffffffff81000ff7>] ? trace_hardirqs_on_thunk+0x17/0x19
[  523.004600]  [<ffffffff8104860c>] ? do_page_fault+0xc/0x10
[  523.004604]  [<ffffffff817c4bf2>] ? page_fault+0x22/0x30
[  523.004611]  [<ffffffff8140fdfc>] ? copy_page_to_iter+0x10c/0x420
[  523.004616]  [<ffffffff811dbeee>] ? pipe_read+0xce/0x340
[  523.004620]  [<ffffffff811d2602>] ? __vfs_read+0xa2/0xd0
[  523.004625]  [<ffffffff811d3261>] ? vfs_read+0x81/0x130
[  523.004629]  [<ffffffff811d41a4>] ? SyS_read+0x44/0xb0
[  523.004634]  [<ffffffff817c2e9b>] ? entry_SYSCALL_64_fastpath+0x16/0x73

[  523.004646] Restarting tasks ... done.
[  523.027674] kms_pipe_crc_basic: exiting, ret=99


[  451.530492] kms_pipe_crc_basic: starting subtest suspend-read-crc-pipe-C
[  451.671555] PM: Syncing filesystems ... done.
[  451.676505] Freezing user space processes ... 
[  471.693064] Freezing of tasks failed after 20.006 seconds (2 tasks refusing to freeze, wq_busy=0):
[  471.693087] sshd            D ffff88006ccf3d50     0  6208   5428 0x00000004
[  471.693097]  ffff88006ccf3d50 ffff88006ccda580 0000000000000000 ffff88007360a580
[  471.693105]  ffff88006ccda580 ffff88006ccf3d48 ffff88006ccf3d38 ffff88006ccf4000
[  471.693112]  0000000000000246 ffff880036b99070 ffff88006ccda580 00000000ffffffff
[  471.693120] Call Trace:
[  471.693131]  [<ffffffff817bd83a>] schedule+0x3a/0x90
[  471.693136]  [<ffffffff817bdaa3>] schedule_preempt_disabled+0x13/0x20
[  471.693141]  [<ffffffff817be6f8>] mutex_lock_nested+0x158/0x3b0
[  471.693146]  [<ffffffff811db9e9>] ? pipe_write+0x39/0x470
[  471.693150]  [<ffffffff811db9e9>] pipe_write+0x39/0x470
[  471.693155]  [<ffffffff811d26d5>] __vfs_write+0xa5/0xe0
[  471.693160]  [<ffffffff811d33b4>] vfs_write+0xa4/0x190
[  471.693164]  [<ffffffff811f1b2a>] ? __fget_light+0x6a/0x90
[  471.693169]  [<ffffffff811d4254>] SyS_write+0x44/0xb0
[  471.693174]  [<ffffffff817c2e9b>] entry_SYSCALL_64_fastpath+0x16/0x73
[  471.693180] java            R  running task        0  6249   6233 0x00000004
[  471.693188]  ffffffff817c39ef 0000000000000020 0000000000000002 ffff880073617ad8
[  471.693196]  ffffffff810ca334 0000000000000282 0000000000000001 ffff88007360add0
[  471.693204]  ffff88007360a580 0000000000000000 0000000000000081 0000000000000001
[  471.693211] Call Trace:
[  471.693216]  [<ffffffff817c39ef>] ? retint_kernel+0x2d/0x2d
[  471.693222]  [<ffffffff810ca334>] ? trace_hardirqs_on_caller+0x124/0x1b0
[  471.693228]  [<ffffffff81191e79>] ? handle_mm_fault+0x609/0x1b40
[  471.693234]  [<ffffffff810e0725>] ? rcu_read_lock_held+0x55/0x60
[  471.693239]  [<ffffffff81191e79>] ? handle_mm_fault+0x609/0x1b40
[  471.693243]  [<ffffffff811918b8>] ? handle_mm_fault+0x48/0x1b40
[  471.693247]  [<ffffffff810cd09b>] ? lock_acquire+0xdb/0x1f0
[  471.693252]  [<ffffffff810ca1f1>] ? mark_held_locks+0x71/0x90
[  471.693258]  [<ffffffff810482f6>] ? __do_page_fault+0x176/0x480
[  471.693263]  [<ffffffff8104860c>] ? do_page_fault+0xc/0x10
[  471.693267]  [<ffffffff817c39ef>] ? retint_kernel+0x2d/0x2d
[  471.693274]  [<ffffffff8140fdfc>] ? copy_page_to_iter+0x10c/0x420
[  471.693279]  [<ffffffff811dbeee>] ? pipe_read+0xce/0x340
[  471.693283]  [<ffffffff811d2602>] ? __vfs_read+0xa2/0xd0
[  471.693287]  [<ffffffff811d3261>] ? vfs_read+0x81/0x130
[  471.693292]  [<ffffffff811d41a4>] ? SyS_read+0x44/0xb0
[  471.693296]  [<ffffffff817c2e9b>] ? entry_SYSCALL_64_fastpath+0x16/0x73

[  471.693308] Restarting tasks ... done.
Comment 3 Chris Wilson 2016-03-02 14:35:45 UTC
So the sshd side is just __pipe_lock() not being interruptible, but that leaves us with the question why did java get stuck handing the pagefault in the pipe reader? I don't suppose there is any way to get a reliable stack trace?
Comment 4 Tvrtko Ursulin 2016-03-02 16:48:35 UTC
Your guess is as good as mine. Bug in the suspend process? Something in memory management is half-in-process-of-suspend so java trying to read from stdin into its, presumably bog standard memory buffer, hangs? Could it be reading into something much more advanced than a ordinary memory buffer from stdin? My imagination fails me. Time to bring in memory management heavy weights?
Comment 5 Tvrtko Ursulin 2016-03-02 16:53:45 UTC
Or a PM expert, looks like we got one in house. Rafael, could you please glance over these traces - how does it look to you?
Comment 6 Jani Nikula 2016-04-22 11:51:59 UTC
Tvrtko, is there any reason this should be highest/critical? It doesn't look like our bug to me.
Comment 7 Tvrtko Ursulin 2016-04-22 11:57:48 UTC
Think I agree, I just filed it as such since it was found in, and was affecting CI and I thought highest/critical was the BKM was BAT/CI issues.
Comment 8 yann 2016-05-24 14:30:41 UTC

*** This bug has been marked as a duplicate of bug 86365 ***

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.