On CI on Joule bxt-t5700 gem_exec_suspend@basic-s4-devices causes dmesg warning
https://intel-gfx-ci.01.org/CI/CI_DRM_2306/fi-bxt-t5700/igt@gem_exec_suspend@basic-s4-devices.html

Dmesg
[ 468.379071] Suspending console(s) (use no_console_suspend to debug)
[ 473.772679] usb usb1: root hub lost power or was reset
[ 473.772881] usb usb2: root hub lost power or was reset
[ 474.370142] ------------[ cut here ]------------
[ 474.370183] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x225/0x230
[ 474.370197] Modules linked in: ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core mei_me snd_pcm mei mmc_block i915 sdhci_pci sdhci mmc_core prime_numbers i2c_hid pinctrl_broxton pinctrl_intel
[ 474.370409] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc1-CI-CI_DRM_2306+ #1
[ 474.370412] Hardware name: Intel Corp. Broxton M/SDS, BIOS GTPPA16A.X64.0143.B30.1608112014 08/11/2016
[ 474.370414] Call Trace:
[ 474.370417] <IRQ>
[ 474.370425] dump_stack+0x67/0x92
[ 474.370432] __warn+0xc6/0xe0
[ 474.370437] warn_slowpath_fmt+0x4a/0x50
[ 474.370445] dev_watchdog+0x225/0x230
[ 474.370449] ? qdisc_rcu_free+0x40/0x40
[ 474.370452] ? qdisc_rcu_free+0x40/0x40
[ 474.370456] call_timer_fn+0x92/0x380
[ 474.370459] ? process_timeout+0x10/0x10
[ 474.370463] ? qdisc_rcu_free+0x40/0x40
[ 474.370467] expire_timers+0x150/0x1f0
[ 474.370472] run_timer_softirq+0x7c/0x160
[ 474.370480] __do_softirq+0x116/0x4c0
[ 474.370486] irq_exit+0xa9/0xc0
[ 474.370491] smp_apic_timer_interrupt+0x38/0x50
[ 474.370496] apic_timer_interrupt+0x90/0xa0
[ 474.370502] RIP: 0010:cpuidle_enter_state+0x135/0x380
[ 474.370505] RSP: 0018:ffffc90000087e88 EFLAGS: 00000216 ORIG_RAX: ffffffffffffff10
[ 474.370510] RAX: ffff88017a878040 RBX: 00000000080f0a0c RCX: 0000000000000001
[ 474.370512] RDX: 0000000000000000 RSI: ffffffff81ca163e RDI: ffffffff81c7ce58
[ 474.370515] RBP: ffffc90000087ec0 R08: ffff88017fd16f84 R09: 0000000000000018
[ 474.370517] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000007
[ 474.370520] R13: ffff88017fd236e0 R14: ffffffff81ec5798 R15: 0000006e6a99133f
[ 474.370522] </IRQ>
[ 474.370532] ? cpuidle_enter_state+0x131/0x380
[ 474.370538] cpuidle_enter+0x12/0x20
[ 474.370542] call_cpuidle+0x1e/0x40
[ 474.370545] do_idle+0x17e/0x1f0
[ 474.370549] cpu_startup_entry+0x18/0x20
[ 474.370553] start_secondary+0x102/0x120
[ 474.370559] start_cpu+0x14/0x14
[ 474.370568] ---[ end trace e3af8012fdbe43a9 ]---
[ 480.516639] xhci_hcd 0000:00:15.0: WARN: unexpected TRB Type 4
Dmesg: https://intel-gfx-ci.01.org/CI/CI_DRM_2306/fi-bxt-t5700/dmesg-during.log
[ 474.370191] NETDEV WATCHDOG: enx000acd2892fb (ax88179_178a): transmit queue 0 timed out
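For anyone triaging these logs by hand, here is a minimal sketch of how the NETDEV WATCHDOG line could be picked out of a saved dmesg log. The function name `find_watchdog_timeouts` and the regular expression are illustrative assumptions based only on the message format quoted above; this is not part of the CI tooling.

```python
import re

# Assumed pattern, derived only from the NETDEV WATCHDOG line quoted in this bug.
WATCHDOG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+"
    r"\((?P<driver>[^)]+)\):\s+transmit queue\s+(?P<queue>\d+)\s+timed out"
)


def find_watchdog_timeouts(log_text):
    """Return one dict per NETDEV WATCHDOG transmit-timeout line in log_text."""
    hits = []
    for m in WATCHDOG_RE.finditer(log_text):
        hits.append({
            "timestamp": float(m.group("ts")),
            "iface": m.group("iface"),
            "driver": m.group("driver"),
            "queue": int(m.group("queue")),
        })
    return hits


# The line from this report, used as sample input.
sample = (
    "[ 474.370191] NETDEV WATCHDOG: enx000acd2892fb (ax88179_178a): "
    "transmit queue 0 timed out"
)
print(find_watchdog_timeouts(sample))
```

Grouping the hits by the `driver` field would show whether a given dmesg-warn comes from the USB network dongle or from something else.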
Still seen: https://intel-gfx-ci.01.org/CI/CI_DRM_2382/fi-bxt-t5700/igt@gem_exec_suspend@basic-s4-devices.html
Raising the priority, because it reduces our code coverage. Failure rate 16/123 run(s) (13%)
*** Bug 100428 has been marked as a duplicate of this bug. ***
Also seen on fi-snb-2600 and fi-kbl-7560u.
Updated failing statistics:
- bxt-t5700: Failure rate 21/184 run(s) (11%)
- fi-kbl-7560u: Failure rate 16/41 run(s) (39%)
- fi-snb-2600: Failure rate 2/22 run(s) (9%)
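As a quick sanity check of the percentages above, a small sketch that recomputes them from the failure and run counts. The helper name `failure_rate_percent` is hypothetical, and rounding to the nearest whole percent is an assumption about how the CI figures were produced.

```python
def failure_rate_percent(failures, runs):
    """Failure rate as a whole-number percentage (assumed: rounded to nearest)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return round(100 * failures / runs)


# Counts copied from the statistics in this comment.
stats = {
    "bxt-t5700": (21, 184),
    "fi-kbl-7560u": (16, 41),
    "fi-snb-2600": (2, 22),
}

for machine, (failures, runs) in stats.items():
    rate = failure_rate_percent(failures, runs)
    print(f"{machine}: Failure rate {failures}/{runs} run(s) ({rate}%)")
```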
KBL is failing for a different reason in s4:
[ 272.313155] [drm:intel_sbi_read [i915]] *ERROR* error during SBI read of reg 2a00
[ 272.313182] [drm:intel_sbi_write [i915]] *ERROR* error during SBI write of 0 to reg 2a00
Now a different reason on KBL 7500u for igt@gem_exec_suspend@basic-s4-devices:
https://intel-gfx-ci.01.org/CI/CI_DRM_2569/fi-kbl-7500u/igt@gem_exec_suspend@basic-s4-devices.html
[ 242.936931] [drm:intel_dp_aux_ch [i915]] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
[ 242.936950] [drm:intel_dp_aux_ch [i915]] *ERROR* dp_aux_ch not done status 0xac1003ff
Jani - please create a new bug for this new failure. Let's not mix several things in one bug.
Yep, will do.
(In reply to Jani Saarinen from comment #11)
> Yep, will do.

Will be followed on https://bugs.freedesktop.org/show_bug.cgi?id=100904
Adding tag into "Whiteboard" field - ReadyForDev
The bug is still active
*Status is correct
*Platform is included
*Feature is included
*Priority and Severity correctly set
Tomi replaced the last ax88179_178a USB-net dongle in CI yesterday, so this particular warning *should* be fixed (of course the "real" fix would be for the ax88179_178a driver to handle power management properly, but that's out of our hands).

The "error during SBI read" is a different issue and should be reported separately (I can only see that one in the logs for BDW GVT-D though, not KBL?). SBI_DBUFF0 (0x2a00) seems to be specific to LynxPoint though, so it shouldn't be possible on anything other than Haswell & Broadwell.

@Marten: Do you have a link to logs where the SBI error occurred on KBL?

Tentatively marking this one as fixed.
Still an issue on SNB: https://patchwork.freedesktop.org/series/24635/
Test gem_exec_suspend:
Subgroup basic-s4-devices: pass -> DMESG-WARN (fi-snb-2600) fdo#100125

Maybe KBL is now fixed.
This still very much is a problem on more than one platform:
- https://intel-gfx-ci.01.org/CI/CI_DRM_2644/fi-kbl-7560u/igt@gem_exec_suspend@basic-s4-devices.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2627/fi-snb-2600/igt@gem_exec_suspend@basic-s4-devices.html
Still issues seen: https://intel-gfx-ci.01.org/CI/igt@gem_exec_suspend@basic-s4-devices.html
Still a problem on KBL and SKL (SNB to be followed):
fi-kbl-r: 30 minutes / 0 runs ago, with result 'dmesg-warn'
fi-kbl-7560u: 3 hours / 1 run ago, with result 'dmesg-warn'
fi-skl-6600u: 1 day / 5 runs ago, with result 'dmesg-warn'
fi-snb-2600: 2017-06-21, with result 'dmesg-warn'

Removing BXT from the platforms.
This bug is not going anywhere, so I moved it here: https://bugzilla.kernel.org/show_bug.cgi?id=196399