Bug 111514 - [CI][DRMTIP]igt@i915_selftest@live_gem_contexts - incomplete - GEM_BUG_ON(!i915_request_completed(*execlists->active) && !reset_in_progress(execlists))
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 111519
Depends on:
Blocks:
 
Reported: 2019-08-29 06:51 UTC by Lakshmi
Modified: 2019-09-10 10:09 UTC (History)

See Also:
i915 platform: CFL
i915 features: GEM/Other


Description Lakshmi 2019-08-29 06:51:19 UTC
[372.911925] process_csb:1565 GEM_BUG_ON(!i915_request_completed(*execlists->active) && !reset_in_progress(execlists))
<4> [372.911951] ------------[ cut here ]------------
<2> [372.911953] kernel BUG at drivers/gpu/drm/i915/gt/intel_lrc.c:1565!
<4> [372.911987] invalid opcode: 0000 [#1] PREEMPT SMP PTI
<4> [372.911989] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U            5.3.0-rc6-CI-CI_DRM_6800+ #1
<4> [372.911990] Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.10 12/28/2017
<4> [372.912048] RIP: 0010:process_csb+0x99f/0xbc0 [i915]
<4> [372.912052] Code: e0 49 f3 e0 48 8b 35 e8 1d 24 00 49 c7 c0 c8 93 36 a0 b9 1d 06 00 00 48 c7 c2 20 26 34 a0 48 c7 c7 da 8f 1f a0 e8 61 2d fa e0 <0f> 0b 49 89 c5 e9 9e f8 ff ff 48 c7 c1 f0 92 36 a0 ba 7a 02 00 00
<4> [372.912054] RSP: 0018:ffffc90000160ea0 EFLAGS: 00010286
<4> [372.912056] RAX: 0000000000000018 RBX: ffff8881c903c2a8 RCX: 0000000000000000
<4> [372.912057] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000624
<4> [372.912057] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000624
<4> [372.912058] R10: 00000000ac52aab2 R11: ffff888264cf0558 R12: 0000000000000002
<4> [372.912059] R13: 0000000000000000 R14: 0000000000000004 R15: ffff8881645bd040
<4> [372.912060] FS:  0000000000000000(0000) GS:ffff888266700000(0000) knlGS:0000000000000000
<4> [372.912061] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [372.912062] CR2: 00007f48526b7d58 CR3: 0000000005210005 CR4: 00000000003606e0
<4> [372.912064] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [372.912066] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [372.912068] Call Trace:
<4> [372.912071]  <IRQ>
<4> [372.912110]  execlists_submission_tasklet+0xc/0x50 [i915]
<4> [372.912114]  tasklet_action_common.isra.5+0x47/0xb0
<4> [372.912116]  __do_softirq+0xd8/0x4ae
<4> [372.912118]  irq_exit+0xa9/0xc0
<4> [372.912120]  do_IRQ+0xb8/0x160
<4> [372.912122]  common_interrupt+0xf/0xf
<4> [372.912123]  </IRQ>
<4> [372.912125] RIP: 0010:cpuidle_enter_state+0xae/0x450
<4> [372.912127] Code: 44 00 00 31 ff e8 72 31 91 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 78 03 00 00 31 ff e8 5b 41 98 ff e8 c6 53 9c ff fb 45 85 ed <0f> 88 c9 02 00 00 4c 2b 24 24 48 ba cf f7 53 e3 a5 9b c4 20 49 63
<4> [372.912128] RSP: 0018:ffffc900000b7e80 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffde
<4> [372.912129] RAX: ffff8882656dd040 RBX: ffffffff822a0760 RCX: 0000000000000002
<4> [372.912130] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffffff821366f9
<4> [372.912131] RBP: ffffe8ffff900c90 R08: 0000000000000002 R09: 0000000000000000
<4> [372.912131] R10: 0000000000000000 R11: 0000000000000000 R12: 00000056d34847e9
<4> [372.912132] R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000006
<4> [372.912136]  cpuidle_enter+0x24/0x40
<4> [372.912137]  do_idle+0x1f3/0x260
<4> [372.912139]  cpu_startup_entry+0x14/0x20
Comment 1 CI Bug Log 2019-08-29 06:52:37 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* CFL: igt@i915_selftest@live_gem_contexts - incomplete - GEM_BUG_ON(!i915_request_completed(*execlists->active) && !reset_in_progress(execlists))
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6800/fi-cfl-guc/igt@i915_selftest@live_gem_contexts.html
Comment 2 Lakshmi 2019-08-29 06:53:20 UTC
(In reply to Lakshmi from comment #0)

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6800/fi-cfl-guc/igt@i915_selftest@live_gem_contexts.html
Comment 3 Chris Wilson 2019-08-29 12:58:37 UTC
*** Bug 111519 has been marked as a duplicate of this bug. ***
Comment 4 Chris Wilson 2019-09-10 10:09:36 UTC
Perhaps a little optimistic, but it looks very promising,

commit fa9a09f15065650d97e3b1336d11e4ad9672b759 (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Sep 10 09:02:08 2019 +0100

    drm/i915/execlists: Clear STOP_RING bit on reset
    
    During reset, we try to ensure no forward progress of the CS prior to
    the reset by setting the STOP_RING bit in RING_MI_MODE. Since gen9, this
    register is context saved and so we end up in the odd situation where we
    save the STOP_RING bit and so try to stop the engine again immediately
    upon resume. This is quite unexpected and causes us to complain about an
    early CS completion event!
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111514
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190910080208.4223-1-chris@chris-wilson.co.uk
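
For readers unfamiliar with the mechanism described in the commit message, here is a minimal standalone sketch of the idea, not the actual i915 change. It assumes RING_MI_MODE is a masked register (the upper 16 bits select which of the lower 16 bits a write updates) and that the saved context image stores (register offset, value) pairs. The STOP_RING bit position and the names masked_write and clear_saved_stop_ring are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position of STOP_RING in RING_MI_MODE (illustrative only). */
#define STOP_RING (1u << 8)

/*
 * Model of a masked register write: the upper 16 bits of val select which
 * of the lower 16 bits are updated; unselected bits keep their old value.
 */
static uint32_t masked_write(uint32_t reg, uint32_t val)
{
	uint32_t mask = val >> 16;

	return (reg & ~mask & 0xffffu) | (val & mask);
}

/*
 * Hypothetical helper: regs[] models a saved context image laid out as
 * (register offset, value) pairs, and mi_mode_idx is the index of the
 * RING_MI_MODE offset entry. The saved value has STOP_RING cleared and its
 * mask bit set, so that restoring the image actively writes the bit back
 * to zero instead of stopping the engine again.
 */
static void clear_saved_stop_ring(uint32_t *regs, size_t nregs,
				  size_t mi_mode_idx)
{
	assert(mi_mode_idx + 1 < nregs);

	regs[mi_mode_idx + 1] &= ~STOP_RING;
	regs[mi_mode_idx + 1] |= STOP_RING << 16;
}
```

Setting the mask bit matters in this model: under the masked-write assumption, a restored value with no mask bit set would leave whatever STOP_RING state the hardware already holds untouched.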

