Bug 109634 - [CI][DRMTIP] igt@gem_eio@reset-stress - incomplete - GEM_BUG_ON(!i915_request_completed(rq))
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-14 17:13 UTC by Lakshmi
Modified: 2019-03-06 15:44 UTC

See Also:
i915 platform: ICL
i915 features: GEM/Other


Description Lakshmi 2019-02-14 17:13:16 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_218/fi-icl-u2/igt@gem_eio@reset-stress.html

 [132.315841] ------------[ cut here ]------------
<2> [132.315843] kernel BUG at drivers/gpu/drm/i915/intel_lrc.c:1103!
<4> [132.315862] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4> [132.315863] irq event stamp: 9823051
<4> [132.315868] hardirqs last  enabled at (9823051): [<ffffffff93236fdd>] __slab_alloc.isra.27.constprop.33+0x4d/0x70
<4> [132.315870] hardirqs last disabled at (9823050): [<ffffffff93236fa9>] __slab_alloc.isra.27.constprop.33+0x19/0x70
<4> [132.315873] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G     U            5.0.0-rc6-gc853cc6da521-drmtip_218+ #1
<4> [132.315885] softirqs last  enabled at (9822704): [<ffffffff93c0033a>] __do_softirq+0x33a/0x4b9
<4> [132.315892] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3071.A00.1902120336 02/12/2019
<4> [132.315903] softirqs last disabled at (9822697): [<ffffffff930b9a91>] irq_exit+0xd1/0xe0
<4> [132.315946] RIP: 0010:process_csb+0x640/0x9a0 [i915]
<4> [132.315947] Code: 7f 77 9f d2 48 8b 35 47 1a 1b 00 49 c7 c0 b6 84 82 c0 b9 4f 04 00 00 48 c7 c2 e0 be 80 c0 48 c7 c7 0b 18 73 c0 e8 30 0c a6 d2 <0f> 0b 48 c7 c1 10 71 84 c0 ba 29 04 00 00 48 c7 c6 e0 be 80 c0 48
<4> [132.315992] RSP: 0018:ffff99591ffc3e18 EFLAGS: 00010086
<4> [132.315999] RAX: 000000000000000b RBX: 00000000000002d8 RCX: 0000000000000000
<4> [132.316008] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffff99591d3bf4e8
<4> [132.316017] RBP: ffff99591ffc3e88 R08: 0000000000006665 R09: ffff99591cc00000
<4> [132.316025] R10: ffff99591ffc3da8 R11: ffff99591d3bf4e8 R12: 00000000000002d6
<4> [132.316032] R13: ffff9958ff1c42a8 R14: 0000000000000001 R15: ffff99590b680040
<4> [132.316038] FS:  0000000000000000(0000) GS:ffff99591ffc0000(0000) knlGS:0000000000000000
<4> [132.316045] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [132.316051] CR2: 00005645169159e0 CR3: 0000000492c18001 CR4: 0000000000760ee0
<4> [132.316057] PKRU: 55555554
<4> [132.316060] Call Trace:
<4> [132.316063]  <IRQ>
<4> [132.316099]  __execlists_submission_tasklet+0x2c/0xe10 [i915]
<4> [132.316145]  execlists_submission_tasklet+0x4c/0x60 [i915]
<4> [132.316153]  tasklet_action_common.isra.5+0x47/0xb0
<4> [132.316159]  __do_softirq+0xd8/0x4b9
<4> [132.316164]  ? _raw_spin_unlock+0x29/0x40
<4> [132.316169]  irq_exit+0xd1/0xe0
<4> [132.316173]  do_IRQ+0x9a/0x130
<4> [132.316178]  common_interrupt+0xf/0xf
<4> [132.316182]  </IRQ>
<4> [132.316186] RIP: 0010:cpuidle_enter_state+0xae/0x450
<4> [132.316191] Code: 44 00 00 31 ff e8 42 d6 92 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 78 03 00 00 31 ff e8 2b 80 99 ff e8 06 61 9d ff fb 45 85 ed <0f> 88 c9 02 00 00 4c 2b 24 24 48 ba cf f7 53 e3 a5 9b c4 20 49 63
<4> [132.316206] RSP: 0018:ffffa8a0c0143ea0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
<4> [132.316213] RAX: ffff99591d290040 RBX: ffffffff942a07c0 RCX: 0000000000000000
<4> [132.316219] RDX: 0000000000000046 RSI: ffffffff940ffb7a RDI: ffffffff940aad8f
<4> [132.316225] RBP: ffff995919df9fe8 R08: 0000000000000002 R09: 0000000000000000
Comment 1 CI Bug Log 2019-02-14 17:15:55 UTC
The CI Bug Log issue associated with this bug has been updated.

### New filters associated

* ICL: igt@gem_eio@reset-stress - incomplete - GEM_BUG_ON(!i915_request_completed(rq))
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_218/fi-icl-u2/igt@gem_eio@reset-stress.html
Comment 2 Chris Wilson 2019-02-15 13:38:23 UTC
With any luck, this should be covered by:

commit 9a3b19a16dc28ab717cf1663d09ffee0715b735a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 13 23:20:47 2019 +0000

    drm/i915: Only try to park engines after a failed reset
    
    Currently we try to stop the engine by programming the ring registers to
    be disabled before we perform the reset. Sometimes, we see the context
    image also have invalid ring registers, which one presumes may be
    actually caused by us doing so. Lets risk not doing programming the
    ring to zero on the first attempt to avoid preserving that corruption
    into the context image, leaving the w/a in place for subsequent
    reset attempts.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190213232047.8486-1-chris@chris-wilson.co.uk
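
The retry behavior described in that commit message can be sketched roughly as follows. This is a toy model and not the i915 code: `struct fake_engine`, `park_engine()`, `try_reset()` and `reset_with_retry()` are hypothetical names invented for illustration, and the "hardware" is just a set of counters. It only shows the ordering the message describes, namely that the first reset attempt leaves the ring registers alone and the parking workaround is applied only on later attempts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an engine; it only counts what was done to it. */
struct fake_engine {
    int park_calls;      /* times the ring registers were programmed to zero */
    int reset_calls;     /* times a reset was requested */
    int resets_until_ok; /* the reset "succeeds" on this attempt (1-based) */
};

/* Hypothetical: models programming the ring registers to be disabled. */
static void park_engine(struct fake_engine *e)
{
    e->park_calls++;
}

/* Hypothetical: models one reset request; true means the reset took effect. */
static bool try_reset(struct fake_engine *e)
{
    e->reset_calls++;
    return e->reset_calls >= e->resets_until_ok;
}

/*
 * Attempt a reset up to max_attempts times. The first attempt does not
 * park the engine; every attempt after a failed one parks it first.
 */
static bool reset_with_retry(struct fake_engine *e, int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (attempt > 0)
            park_engine(e); /* workaround kept for retries only */
        if (try_reset(e))
            return true;
    }
    return false;
}
```

In this model, an engine whose first reset succeeds is never parked, while one that needs three attempts is parked twice (before the second and third attempts).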
Comment 3 CI Bug Log 2019-03-06 15:44:26 UTC
The CI Bug Log issue associated with this bug has been archived.

New failures matching the above filters will no longer be associated with this bug.

