Bug 106977 - [CI] igt@gem_exec_schedule@preempt-* - dmesg-warn - WARN_ON_ONCE((1) && !(preempt_count() != 0))
Summary: [CI] igt@gem_exec_schedule@preempt-* - dmesg-warn - WARN_ON_ONCE((1) && !(pre...
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: XOrg git
Hardware: Other All
: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-06-20 13:28 UTC by Martin Peres
Modified: 2018-06-24 15:59 UTC (History)
1 user (show)

See Also:
i915 platform: KBL
i915 features: firmware/guc


Attachments

Description Martin Peres 2018-06-20 13:28:12 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_67/fi-kbl-guc/igt@gem_exec_schedule@preempt-queue-contexts-chain-bsd.html

[   59.939080] ------------[ cut here ]------------
[   59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0))
[   59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915]
[   59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers
[   59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G     U  W         4.18.0-rc1-g93475d62c730-drmtip_67+ #1
[   59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
[   59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915]
[   59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f 
[   59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286
[   59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001
[   59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff
[   59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000
[   59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700
[   59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8
[   59.939260] FS:  00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000
[   59.939261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0
[   59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.939265] Call Trace:
[   59.939288]  ? intel_engine_is_idle+0x64/0x160 [i915]
[   59.939323]  ? intel_engine_dump+0x638/0x890 [i915]
[   59.939327]  ? seq_printf+0x49/0x70
[   59.939353]  ? i915_engine_info+0xc8/0x100 [i915]
[   59.939356]  ? drm_get_color_range_name+0x20/0x20
[   59.939361]  ? seq_read+0xf1/0x470
[   59.939365]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[   59.939370]  ? full_proxy_read+0x51/0x80
[   59.939389]  ? __vfs_read+0x31/0x170
[   59.939395]  ? do_sys_open+0x13b/0x240
[   59.939398]  ? rcu_read_lock_sched_held+0x6f/0x80
[   59.939401]  ? vfs_read+0x9e/0x140
[   59.939404]  ? ksys_read+0x50/0xc0
[   59.939409]  ? do_syscall_64+0x55/0x190
[   59.939412]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   59.939420] irq event stamp: 552834
[   59.939422] hardirqs last  enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600
[   59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100
[   59.939451] softirqs last  enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915]
[   59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915]
[   59.939490] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915]
[   59.939491] ---[ end trace 80fbfe045cb87f7c ]---
Comment 1 Chris Wilson 2018-06-20 13:51:57 UTC
So that is wait_for_atomic() complaining about the _WAIT_FOR_ATOMIC_CHECK() failing, i..e that we used wait_for_atomic() from an ordinary process context.

So normally the tasklet would be called from softirq context but in this case, it's a call by intel_engine_is_idle() doing the direct check for fast idling. Ok, seems easy to fix...
Comment 2 Chris Wilson 2018-06-20 21:27:03 UTC
commit 26eb4cd6c7c7793ee76c73b66621f63daff51953
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 20 14:59:29 2018 +0100

    drm/i915: Disable bh around call to tasklet
    
    The guc submission backends expects to only be run from (at least)
    softirq context, but during our intel_engine_is_idle() check we would
    call into the tasklet to make sure it was flushed. As this could occur
    from process context, occasionally we would be caught out using a
    wait_for_atomic() not from an atomic context:
    
    [   59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0))
    [   59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915]
    [   59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers
    [   59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G     U  W         4.18.0-rc1-g93475d62c730-drmtip_67+ #1
    [   59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
    [   59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915]
    [   59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f
    [   59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286
    [   59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001
    [   59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff
    [   59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000
    [   59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700
    [   59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8
    [   59.939260] FS:  00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000
    [   59.939261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0
    [   59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [   59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [   59.939265] Call Trace:
    [   59.939288]  ? intel_engine_is_idle+0x64/0x160 [i915]
    [   59.939323]  ? intel_engine_dump+0x638/0x890 [i915]
    [   59.939327]  ? seq_printf+0x49/0x70
    [   59.939353]  ? i915_engine_info+0xc8/0x100 [i915]
    [   59.939356]  ? drm_get_color_range_name+0x20/0x20
    [   59.939361]  ? seq_read+0xf1/0x470
    [   59.939365]  ? trace_hardirqs_on_caller+0xe0/0x1b0
    [   59.939370]  ? full_proxy_read+0x51/0x80
    [   59.939389]  ? __vfs_read+0x31/0x170
    [   59.939395]  ? do_sys_open+0x13b/0x240
    [   59.939398]  ? rcu_read_lock_sched_held+0x6f/0x80
    [   59.939401]  ? vfs_read+0x9e/0x140
    [   59.939404]  ? ksys_read+0x50/0xc0
    [   59.939409]  ? do_syscall_64+0x55/0x190
    [   59.939412]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [   59.939420] irq event stamp: 552834
    [   59.939422] hardirqs last  enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600
    [   59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100
    [   59.939451] softirqs last  enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915]
    [   59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915]
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977
    Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
    Cc: Michał Winiarski <michal.winiarski@intel.com>
    Cc: Michel Thierry <michel.thierry@intel.com>
    Reviewed-by: Michel Thierry <michel.thierry@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180620135929.23956-1-chris@chris-wilson.co.uk
Comment 3 Martin Peres 2018-06-20 21:36:55 UTC
Thank you Chris! You are always impressive, to say the least :)


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.