Bug 106612 - [CI] igt@*(s3|suspend) - dmesg-warn - Delta way too big!
Summary: [CI] igt@*(s3|suspend) - dmesg-warn - Delta way too big!
Status: RESOLVED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: XOrg git
Hardware: Other All
: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
: 105762 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-05-22 07:47 UTC by Martin Peres
Modified: 2019-04-23 13:06 UTC (History)
3 users (show)

See Also:
i915 platform: CFL, ICL, PNV
i915 features: power/Other


Attachments

Description Martin Peres 2018-05-22 07:47:23 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_34/fi-pnv-d510/igt@gem_exec_suspend@basic-s3.html

[  201.321575] ------------[ cut here ]------------
[  201.321587] Delta way too big! 18446743859196209652 ts=18446744047386161939 write stamp = 188189952287
               If you just came from a suspend/resume,
               please switch to the trace global clock:
                 echo global > /sys/kernel/debug/tracing/trace_clock
               or add trace_clock=global to the kernel command line
[  201.321621] WARNING: CPU: 3 PID: 117 at kernel/trace/ring_buffer.c:2768 rb_handle_timestamp.isra.30+0x6c/0x70
[  201.321626] Modules linked in: vgem i915 coretemp snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel lpc_ich snd_hda_codec snd_hwdep snd_hda_core snd_pcm r8169 mii prime_numbers
[  201.321697] CPU: 3 PID: 117 Comm: kworker/u8:4 Tainted: G     U            4.17.0-rc4-g06e15c0f055b-drmtip_34+ #1
[  201.321703] Hardware name:  /D510MO, BIOS MOPNV10J.86A.0311.2010.0802.2346 08/02/2010
[  201.321713] Workqueue: events_unbound async_run_entry_fn
[  201.321726] RIP: 0010:rb_handle_timestamp.isra.30+0x6c/0x70
[  201.321733] RSP: 0018:ffffbb7c8033bb70 EFLAGS: 00010286
[  201.321742] RAX: 0000000000000000 RBX: ffffbb7c8033bb88 RCX: 0000000000000003
[  201.321748] RDX: 0000000080000003 RSI: 0000000000000001 RDI: 00000000ffffffff
[  201.321754] RBP: ffffa082fd337b10 R08: 0000000000000000 R09: 0000000000000000
[  201.321760] R10: ffffbb7c8033bb80 R11: ffffffff852444b8 R12: ffffa082fd018358
[  201.321766] R13: ffffa082fd337b30 R14: 00000000000003e8 R15: 0000000000000005
[  201.321773] FS:  0000000000000000(0000) GS:ffffa082fed80000(0000) knlGS:0000000000000000
[  201.321780] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  201.321785] CR2: 0000000000000000 CR3: 00000000371b2000 CR4: 00000000000006e0
[  201.321790] Call Trace:
[  201.321806]  ring_buffer_lock_reserve+0x210/0x3f0
[  201.322011]  ? i915_request_await_object+0x370/0x370 [i915]
[  201.322011]  trace_vbprintk+0xb4/0x220
[  201.322011]  __trace_bprintk+0x57/0x80
[  201.322011]  __i915_request_add+0x4f/0x7c0 [i915]
[  201.322011]  ? i915_request_alloc+0x2da/0x740 [i915]
[  201.322011]  i915_gem_switch_to_kernel_context+0x122/0x1a0 [i915]
[  201.322011]  ? pcie_port_bus_match+0x50/0x50
[  201.322011]  i915_gem_resume+0x9f/0xc0 [i915]
[  201.322011]  i915_pm_resume+0x7c/0x130 [i915]
[  201.322011]  dpm_run_callback+0x5d/0x2f0
[  201.322011]  device_resume+0xb3/0x1e0
[  201.322011]  ? dpm_watchdog_set+0x60/0x60
[  201.322011]  async_resume+0x14/0x40
[  201.322011]  async_run_entry_fn+0x34/0x160
[  201.322011]  process_one_work+0x229/0x6a0
[  201.322011]  worker_thread+0x35/0x380
[  201.322011]  ? process_one_work+0x6a0/0x6a0
[  201.322011]  kthread+0x119/0x130
[  201.322011]  ? _kthread_create_on_node+0x60/0x60
[  201.323250]  ret_from_fork+0x24/0x50
[  201.323250] Code: 24 48 8b 73 08 85 c0 48 8b 13 48 c7 c0 30 97 07 85 49 c7 c0 6e 55 08 85 4c 0f 44 c0 48 8b 0f 48 c7 c7 f0 97 07 85 e8 74 18 f3 ff <0f> 0b eb a8 41 54 55 48 63 f6 53 48 8b 87 a0 00 00 00 48 8b 1c 
[  201.323513] irq event stamp: 3250
[  201.323513] hardirqs last  enabled at (3249): [<ffffffff849487c4>] _raw_spin_unlock_irq+0x24/0x50
[  201.323513] hardirqs last disabled at (3250): [<ffffffff8494126a>] __schedule+0xaa/0xbe0
[  201.323513] softirqs last  enabled at (2922): [<ffffffff84c0032b>] __do_softirq+0x32b/0x4e1
[  201.323513] softirqs last disabled at (2915): [<ffffffff8408f6f4>] irq_exit+0xa4/0xb0
[  201.323513] WARNING: CPU: 3 PID: 117 at kernel/trace/ring_buffer.c:2768 rb_handle_timestamp.isra.30+0x6c/0x70
[  201.323513] ---[ end trace 8d91a263e8535748 ]---
Comment 1 Chris Wilson 2018-05-25 11:56:16 UTC
*** Bug 105762 has been marked as a duplicate of this bug. ***
Comment 2 Martin Peres 2018-09-07 17:05:21 UTC
It still happens on CFL: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_105/fi-cfl-s3/igt@kms_fbcon_fbt@fbc-suspend.html

It doesn't happen that often on PNV though... Not seen since drmtip_82.
Comment 3 Martin Peres 2018-10-12 14:40:44 UTC
Still happening: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_125/fi-cfl-s3/igt@kms_psr@suspend.html

<4> [416.251551] ------------[ cut here ]------------
<4> [416.251557] Delta way too big! 18446743620799766369 ts=18446744055896263471 write stamp = 435096497102\x0aIf you just came from a suspend/resume,\x0aplease switch to the trace global clock:\x0a  echo global > /sys/kernel/debug/tracing/trace_clock\x0aor add trace_clock=global to the kernel command line
<4> [416.251575] WARNING: CPU: 9 PID: 2469 at kernel/trace/ring_buffer.c:2771 rb_handle_timestamp.isra.30+0x6c/0x70
<4> [416.251578] Modules linked in: vgem i915 cdc_ether usbnet r8152 mii x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e mei_me mei prime_numbers
<4> [416.251630] CPU: 9 PID: 2469 Comm: kworker/u24:7 Tainted: G     U            4.19.0-rc7-g6c3870cc0454-drmtip_125+ #1
<4> [416.251633] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B19.1802080131 02/08/2018
<4> [416.251640] Workqueue: events_unbound async_run_entry_fn
<4> [416.251649] RIP: 0010:rb_handle_timestamp.isra.30+0x6c/0x70
<4> [416.251654] Code: 48 8b 73 08 85 c0 48 8b 13 48 c7 c0 d8 49 07 b5 49 c7 c0 f0 0a 08 b5 4c 0f 44 c0 48 8b 0f 48 c7 c7 98 4a 07 b5 e8 54 f5 f2 ff <0f> 0b eb a8 41 54 55 48 63 f6 53 48 8b 87 a0 00 00 00 48 8b 1c f0
<4> [416.251658] RSP: 0000:ffffbc2940877cd8 EFLAGS: 00010282
<4> [416.251664] RAX: 0000000000000000 RBX: ffffbc2940877cf0 RCX: 0000000000000002
<4> [416.251668] RDX: 0000000080000002 RSI: ffffffffb50f76e2 RDI: 00000000ffffffff
<4> [416.251672] RBP: ffffa231dbdbbb50 R08: 00000000f3233359 R09: 0000000000000000
<4> [416.251675] R10: ffffbc2940877ce8 R11: 0000000000000000 R12: ffffa231dbc18358
<4> [416.251679] R13: ffffa231dbdbbb70 R14: 00000000000003e8 R15: 0000000000000000
<4> [416.251683] FS:  0000000000000000(0000) GS:ffffa231ddc40000(0000) knlGS:0000000000000000
<4> [416.251687] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [416.251690] CR2: 0000000000000000 CR3: 0000000495210001 CR4: 00000000003606e0
<4> [416.251694] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [416.251698] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [416.251700] Call Trace:
<4> [416.251710]  ring_buffer_lock_reserve+0x210/0x3f0
<4> [416.251785]  ? intel_engines_reset_default_submission+0x30/0x30 [i915]
<4> [416.251793]  __trace_bputs+0x5b/0x130
<4> [416.251801]  ? __mutex_unlock_slowpath+0x46/0x2b0
<4> [416.251873]  intel_engines_sanitize+0x26/0x50 [i915]
<4> [416.251922]  i915_pm_resume_early+0x9c/0x110 [i915]
<4> [416.251971]  ? vlv_resume_prepare+0x670/0x670 [i915]
<4> [416.251979]  dpm_run_callback+0x64/0x280
<4> [416.251990]  device_resume_early+0xa6/0xe0
<4> [416.251999]  async_resume_early+0x14/0x40
<4> [416.252006]  async_run_entry_fn+0x34/0x160
<4> [416.252014]  process_one_work+0x245/0x610
<4> [416.252028]  worker_thread+0x37/0x380
<4> [416.252037]  ? process_one_work+0x610/0x610
<4> [416.252043]  kthread+0x119/0x130
<4> [416.252049]  ? kthread_park+0x80/0x80
<4> [416.252058]  ret_from_fork+0x3a/0x50
<4> [416.252067] irq event stamp: 9730
<4> [416.252067] hardirqs last  enabled at (9729): [<ffffffffb40fd8de>] vprintk_emit+0x2ee/0x310
<4> [416.252067] hardirqs last disabled at (9730): [<ffffffffb4001930>] trace_hardirqs_off_thunk+0x1a/0x1c
<4> [416.252067] softirqs last  enabled at (9558): [<ffffffffb4c0031d>] __do_softirq+0x31d/0x483
<4> [416.252067] softirqs last disabled at (9551): [<ffffffffb4090279>] irq_exit+0xa9/0xc0
<4> [416.252067] WARNING: CPU: 9 PID: 2469 at kernel/trace/ring_buffer.c:2771 rb_handle_timestamp.isra.30+0x6c/0x70
<4> [416.252067] ---[ end trace 4ffd148d29b62864 ]---
Comment 4 Martin Peres 2018-10-24 10:57:03 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_132/fi-icl-u/igt@drv_suspend@forcewake.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_132/fi-icl-u/igt@gem_eio@in-flight-suspend.html

<4> [350.625864] ------------[ cut here ]------------
<4> [350.625868] Delta way too big! 18446743711799753644 ts=18446744052547616611 write stamp = 340747862967\x0aIf you just came from a suspend/resume,\x0aplease switch to the trace global clock:\x0a  echo global > /sys/kernel/debug/tracing/trace_clock\x0aor add trace_clock=global to the kernel command line
<4> [350.625880] WARNING: CPU: 6 PID: 2430 at kernel/trace/ring_buffer.c:2771 rb_handle_timestamp.isra.30+0x6c/0x70
<4> [350.625886] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal ax88179_178a usbnet coretemp mii snd_hda_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm prime_numbers
<4> [350.625936] CPU: 6 PID: 2430 Comm: kworker/u16:45 Tainted: G     U  W         4.19.0-rc8-gae98bc614f3a-drmtip_132+ #1
<4> [350.625939] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2392.A04.1809260455 09/26/2018
<4> [350.625944] Workqueue: events_unbound async_run_entry_fn
<4> [350.625950] RIP: 0010:rb_handle_timestamp.isra.30+0x6c/0x70
<4> [350.625953] Code: 48 8b 73 08 85 c0 48 8b 13 48 c7 c0 58 4c 07 83 49 c7 c0 70 0d 08 83 4c 0f 44 c0 48 8b 0f 48 c7 c7 18 4d 07 83 e8 c4 f4 f2 ff <0f> 0b eb a8 41 54 55 48 63 f6 53 48 8b 87 a0 00 00 00 48 8b 1c f0
<4> [350.625956] RSP: 0000:ffffa10380bd7cd8 EFLAGS: 00010282
<4> [350.625961] RAX: 0000000000000000 RBX: ffffa10380bd7cf0 RCX: 0000000000000002
<4> [350.625964] RDX: 0000000080000002 RSI: ffffffff830f7992 RDI: 00000000ffffffff
<4> [350.625967] RBP: ffff932cae25a0c0 R08: 00000000f868858b R09: 0000000000000000
<4> [350.625969] R10: ffffa10380bd7ce8 R11: 0000000000000000 R12: ffff932cae018358
<4> [350.625972] R13: ffff932cae25a0e0 R14: 00000000000003e8 R15: 0000000000000000
<4> [350.625975] FS:  0000000000000000(0000) GS:ffff932caff80000(0000) knlGS:0000000000000000
<4> [350.625978] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [350.625981] CR2: 0000000000000000 CR3: 000000036e210001 CR4: 0000000000760ee0
<4> [350.625984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [350.625987] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [350.625989] PKRU: 00000000
<4> [350.625991] Call Trace:
<4> [350.625997]  ring_buffer_lock_reserve+0x210/0x3f0
<4> [350.626050]  ? intel_engines_reset_default_submission+0x30/0x30 [i915]
<4> [350.626056]  __trace_bputs+0x5b/0x130
<4> [350.626061]  ? __mutex_unlock_slowpath+0x46/0x2b0
<4> [350.626112]  intel_engines_sanitize+0x26/0x50 [i915]
<4> [350.626150]  i915_pm_resume_early+0x9c/0x110 [i915]
<4> [350.626189]  ? vlv_resume_prepare+0x670/0x670 [i915]
<4> [350.626195]  dpm_run_callback+0x64/0x280
<4> [350.626202]  device_resume_early+0xa6/0xe0
<4> [350.626209]  async_resume_early+0x14/0x40
<4> [350.626213]  async_run_entry_fn+0x34/0x160
<4> [350.626219]  process_one_work+0x245/0x610
<4> [350.626230]  worker_thread+0x37/0x380
<4> [350.626237]  ? process_one_work+0x610/0x610
<4> [350.626241]  kthread+0x119/0x130
<4> [350.626245]  ? kthread_park+0x80/0x80
<4> [350.626251]  ret_from_fork+0x3a/0x50
<4> [350.626265] irq event stamp: 620
<4> [350.626270] hardirqs last  enabled at (619): [<ffffffff820fd8ee>] vprintk_emit+0x2ee/0x310
<4> [350.626274] hardirqs last disabled at (620): [<ffffffff82001930>] trace_hardirqs_off_thunk+0x1a/0x1c
<4> [350.626278] softirqs last  enabled at (582): [<ffffffff82c0031d>] __do_softirq+0x31d/0x483
<4> [350.626282] softirqs last disabled at (575): [<ffffffff82090279>] irq_exit+0xa9/0xc0
<4> [350.626287] WARNING: CPU: 6 PID: 2430 at kernel/trace/ring_buffer.c:2771 rb_handle_timestamp.isra.30+0x6c/0x70
<4> [350.626289] ---[ end trace f410470ab7747c30 ]---
Comment 5 CI Bug Log 2019-01-04 15:30:59 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CFL ICL: igt@runner@aborted - fail - Previous test: (gem_workarounds \(suspend-resume\)|gem_ctx_isolation \(vecs0-s3\)|i915_suspend \(fence-restore-untiled\)|.) -}
{+ CFL ICL: igt@runner@aborted - fail - many reasons +}

 No new failures caught with the new filter
Comment 6 CI Bug Log 2019-01-25 14:04:22 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CFL ICL: igt@runner@aborted - fail - many reasons -}
{+ CFL ICL: igt@runner@aborted - fail - many reasons +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_196/fi-icl-u3/igt@runner@aborted.html
Comment 7 CI Bug Log 2019-02-07 14:40:02 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CFL ICL: igt@runner@aborted - fail - many reasons -}
{+ CFL ICL: igt@runner@aborted - fail - many reasons +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_210/fi-icl-u3/igt@runner@aborted.html
Comment 8 CI Bug Log 2019-02-08 15:16:44 UTC
A CI Bug Log filter associated to this bug has been updated:

{- PNV CFL ICL: igt@*(s3|suspend)* - dmesg-warn / dmesg-fail - Delta way too big! -}
{+ PNV CFL ICL: igt@*(s3|suspend)* - dmesg-warn / dmesg-fail - Delta way too big! +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_212/fi-icl-u3/igt@kms_flip@2x-flip-vs-suspend-interruptible.html
Comment 9 CI Bug Log 2019-02-08 15:17:42 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CFL ICL: igt@runner@aborted - fail - many reasons -}
{+ CFL ICL: igt@runner@aborted - fail - many reasons +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_212/fi-icl-u3/igt@runner@aborted.html
Comment 10 CI Bug Log 2019-02-08 15:21:52 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CFL ICL: igt@runner@aborted - fail - many reasons -}
{+ CFL ICL: igt@runner@aborted - fail - many reasons +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_212/fi-icl-u3/igt@runner@aborted.html
Comment 11 CI Bug Log 2019-02-08 15:28:38 UTC
A CI Bug Log filter associated to this bug has been updated:

{- PNV CFL ICL: igt@*(s3|suspend)* - dmesg-warn / dmesg-fail - Delta way too big! -}
{+ PNV CFL ICL: igt@*(s3|suspend)* - dmesg-warn / dmesg-fail - Delta way too big! +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_212/fi-icl-u3/igt@kms_psr@psr2_suspend.html
Comment 12 CI Bug Log 2019-02-08 15:28:59 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CFL ICL: igt@runner@aborted - fail - many reasons -}
{+ CFL ICL: igt@runner@aborted - fail - many reasons +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_212/fi-icl-u3/igt@runner@aborted.html
Comment 13 Imre Deak 2019-03-11 15:59:35 UTC
The following happens on ICL as well (besides PNV and CFL):

If you just came from a suspend/resume,
please switch to the trace global clock:
echo global > /sys/kernel/debug/tracing/trace_clock
or add trace_clock=global to the kernel command line

In bug 105762 Chris mentioned that this shouldn't happen on new hardware. Should the ICL cases be reported upstream then?
Comment 14 Chris Wilson 2019-03-11 16:03:53 UTC
Different issue; it's an unstable tsc clock source that is unexpected. The delta too big from ftrace across suspend is expected by ftrace at least.
Comment 15 Imre Deak 2019-03-11 16:16:16 UTC
(In reply to Chris Wilson from comment #14)
> Different issue; it's an unstable tsc clock source that is unexpected. The
> delta too big from ftrace across suspend is expected by ftrace at least.

Ah, ok, so here (on all of PNV, CFL, ICL) we should just do what dmesg tells us.
Comment 16 CI Bug Log 2019-04-23 13:06:26 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.
Comment 17 Martin Peres 2019-04-23 13:06:58 UTC
This seems fixed anyway! No failures after 460 runs, when it used to fail roughly every other run...


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.