Bug 109536 - [CI][DRMTIP] igt@i915_suspend@debugfs-reader - dmesg-warn - Unexpected send: action=0x30
Summary: [CI][DRMTIP] igt@i915_suspend@debugfs-reader - dmesg-warn - Unexpected send: ...
Status: REOPENED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: DRI git
Hardware: Other All
: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
: 109537 (view as bug list)
Depends on:
Blocks:
 
Reported: 2019-02-03 12:41 UTC by Martin Peres
Modified: 2019-03-06 17:27 UTC (History)
1 user (show)

See Also:
i915 platform: KBL
i915 features:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Peres 2019-02-03 12:41:38 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_207/fi-kbl-guc/igt@i915_suspend@debugfs-reader.html

<4> [46.791873] ------------[ cut here ]------------
<4> [46.791881] Unexpected send: action=0x30
<4> [46.791917] WARNING: CPU: 6 PID: 397 at drivers/gpu/drm/i915/intel_guc.c:373 intel_guc_send_nop+0xe/0x20 [i915]
<4> [46.791918] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal snd_hda_intel coretemp snd_hda_codec snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul ghash_clmulni_intel snd_pcm e1000e mei_me mei i2c_i801 prime_numbers
<4> [46.791927] CPU: 6 PID: 397 Comm: kworker/u17:0 Tainted: G     U            5.0.0-rc4-gfd25766ba02f-drmtip_207+ #1
<4> [46.791928] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
<4> [46.791950] Workqueue: i915-guc_log capture_logs_work [i915]
<4> [46.791972] RIP: 0010:intel_guc_send_nop+0xe/0x20 [i915]
<4> [46.791974] Code: 01 00 00 00 be c8 c4 00 00 e9 8e 16 95 e9 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 8b 36 48 c7 c7 82 0c 39 c0 e8 a2 22 e0 e8 <0f> 0b b8 ed ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 48 c7 c7 f8
<4> [46.791975] RSP: 0018:ffffa390003dfdb8 EFLAGS: 00010286
<4> [46.791976] RAX: 0000000000000000 RBX: 00000000a0600001 RCX: 0000000000000000
<4> [46.791977] RDX: 0000000000000007 RSI: ffffffffaa0878eb RDI: 00000000ffffffff
<4> [46.791979] RBP: 0000000000013000 R08: 000000000d47896a R09: 0000000000000000
<4> [46.791980] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000002000
<4> [46.791981] R13: ffffa390008d1060 R14: ffff93f0cee11380 R15: ffff93f0cee10000
<4> [46.791982] FS:  0000000000000000(0000) GS:ffff93f0eeb80000(0000) knlGS:0000000000000000
<4> [46.791983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [46.791984] CR2: 00007f64864fb788 CR3: 00000000b0214002 CR4: 00000000003606e0
<4> [46.791985] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [46.791986] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [46.791987] Call Trace:
<4> [46.792009]  guc_log_capture_logs+0x2d9/0x460 [i915]
<4> [46.792016]  process_one_work+0x245/0x610
<4> [46.792020]  worker_thread+0x1d0/0x380
<4> [46.792022]  ? process_one_work+0x610/0x610
<4> [46.792024]  kthread+0x119/0x130
<4> [46.792026]  ? kthread_park+0x80/0x80
<4> [46.792029]  ret_from_fork+0x3a/0x50
<4> [46.792034] irq event stamp: 98
<4> [46.792037] hardirqs last  enabled at (97): [<ffffffffa9126f0e>] console_unlock+0x3fe/0x600
<4> [46.792038] hardirqs last disabled at (98): [<ffffffffa90019b0>] trace_hardirqs_off_thunk+0x1a/0x1c
<4> [46.792040] softirqs last  enabled at (84): [<ffffffffa9c0033a>] __do_softirq+0x33a/0x4b9
<4> [46.792042] softirqs last disabled at (73): [<ffffffffa90b9a41>] irq_exit+0xd1/0xe0
<4> [46.792063] WARNING: CPU: 6 PID: 397 at drivers/gpu/drm/i915/intel_guc.c:373 intel_guc_send_nop+0xe/0x20 [i915]
<4> [46.792064] ---[ end trace 7c1cf8091b0b22a9 ]---
Comment 1 Chris Wilson 2019-02-03 16:36:17 UTC
*** Bug 109537 has been marked as a duplicate of this bug. ***
Comment 2 Chris Wilson 2019-02-04 08:54:10 UTC
The guc is in a reset state (from an earlier failure leaving the driver wedged) and not able to respond. Either this should not be a warn, or the logging should be more intelligent about when guc is disabled.
Comment 3 Chris Wilson 2019-02-26 10:45:50 UTC
Worth a shot to see if this is fixed by

commit 07c100b187332101220baf7446b4f09296d7c59b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Feb 21 16:38:33 2019 +0000

    drm/i915/guc: Flush the residual log capture irq on disabling
    
    As we disable the log capture events, flush any residual interrupt
    before we flush and disable the worker.
    
    v2: Mika pointed out that it wasn't the worker re-queueing itself, but a
    rogue irq.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109716
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190221163833.21393-1-chris@chris-wilson.co.uk
Comment 4 Martin Peres 2019-03-06 17:27:45 UTC
(In reply to Chris Wilson from comment #3)
> Worth a shot to see if this is fixed by
> 
> commit 07c100b187332101220baf7446b4f09296d7c59b
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Thu Feb 21 16:38:33 2019 +0000
> 
>     drm/i915/guc: Flush the residual log capture irq on disabling
>     
>     As we disable the log capture events, flush any residual interrupt
>     before we flush and disable the worker.
>     
>     v2: Mika pointed out that it wasn't the worker re-queueing itself, but a
>     rogue irq.
>     
>     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109716
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>     Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>     Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>     Link:
> https://patchwork.freedesktop.org/patch/msgid/20190221163833.21393-1-
> chris@chris-wilson.co.uk

Seems like it might have reduced the reproduction rate, but unfortunately, this is still happening: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_234/fi-kbl-guc/igt@i915_suspend@debugfs-reader.html


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.