Bug 111639 - [CI][RESUME] igt@gem_vm_create@isolation - dmesg-warn - GEM_BUG_ON(!intel_context_is_pinned(ce))
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: high, not set
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-09-11 05:35 UTC by Martin Peres
Modified: 2019-09-18 07:29 UTC
CC List: 1 user

See Also:
i915 platform: TGL
i915 features: GEM/Other


Attachments

Description Martin Peres 2019-09-11 05:35:02 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt%40gem_vm_create%40isolation.html

<3> [62.231256] __execlists_reset:2398 GEM_BUG_ON(!intel_context_is_pinned(ce))
<4> [62.231438] ------------[ cut here ]------------
<2> [62.231444] kernel BUG at drivers/gpu/drm/i915/gt/intel_lrc.c:2398!
<4> [62.231500] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4> [62.231513] CPU: 0 PID: 1010 Comm: gem_vm_create Tainted: G     U            5.3.0-rc7-gedc92c969391-drmtip_364+ #1
<4> [62.231533] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.2321.A01.1908052106 08/05/2019
<4> [62.231640] RIP: 0010:__execlists_reset+0x768/0xb50 [i915]
<4> [62.231655] Code: a7 6c fd dc 48 8b 35 ff 00 24 00 49 c7 c0 d9 29 30 c0 b9 5e 09 00 00 48 c7 c2 90 26 2a c0 48 c7 c7 f3 ab 15 c0 e8 38 50 04 dd <0f> 0b 48 c7 c1 d0 93 2c c0 ba 7a 02 00 00 48 c7 c6 50 26 2a c0 48
<4> [62.231683] RSP: 0018:ffffabd78054bce0 EFLAGS: 00010082
<4> [62.231695] RAX: 000000000000000e RBX: ffff9fad95738008 RCX: 0000000000000000
<4> [62.231708] RDX: 0000000000000001 RSI: 0000000000000008 RDI: 0000000000000cb0
<4> [62.231725] RBP: ffffabd78054bd40 R08: 0000000000000000 R09: 0000000000000cb0
<4> [62.231738] R10: 00000000ab13c364 R11: ffff9fad9e897a38 R12: ffff9fad70b379c0
<4> [62.231751] R13: ffff9fad8c10be70 R14: ffff9fad70b379c0 R15: ffff9fad8db23140
<4> [62.231790] FS:  00007f41465ea300(0000) GS:ffff9fada0400000(0000) knlGS:0000000000000000
<4> [62.231801] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [62.231809] CR2: 00007fff731baca8 CR3: 0000000462a2a002 CR4: 0000000000760ef0
<4> [62.231819] PKRU: 55555554
<4> [62.231825] Call Trace:
<4> [62.231883]  execlists_cancel_requests+0x57/0x3d0 [i915]
<4> [62.231940]  __intel_gt_set_wedged.part.9+0xb2/0x180 [i915]
<4> [62.231952]  ? __drm_printfn_info+0x20/0x20
<4> [62.232005]  intel_gt_set_wedged+0x64/0x70 [i915]
<4> [62.232055]  i915_drop_caches_set+0x151/0x2f0 [i915]
<4> [62.232067]  simple_attr_write+0xb0/0xd0
<4> [62.232077]  full_proxy_write+0x51/0x80
<4> [62.232087]  vfs_write+0xbd/0x1d0
<4> [62.232094]  ksys_write+0x8f/0xe0
<4> [62.232103]  do_syscall_64+0x55/0x1c0
<4> [62.232112]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [62.232120] RIP: 0033:0x7f4145d69281
<4> [62.232127] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [62.232149] RSP: 002b:00007fff731bda58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [62.232161] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4145d69281
<4> [62.232172] RDX: 0000000000000005 RSI: 00007fff731bdab0 RDI: 0000000000000007
<4> [62.232182] RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000
<4> [62.232192] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff731bdab0
<4> [62.232202] R13: 0000000000000007 R14: 00007fff731bdab0 R15: 00007f4145d53d80
<4> [62.232216] Modules linked in: vgem mei_hdcp i915 ax88179_178a x86_pkg_temp_thermal usbnet coretemp mii crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mei_me mei prime_numbers
Comment 1 CI Bug Log 2019-09-11 05:37:20 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* TGL: igt@gem_vm_create@isolation - dmesg-warn - GEM_BUG_ON(!intel_context_is_pinned(ce))
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_363/fi-tgl-u/igt@gem_vm_create@isolation.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_vm_create@isolation.html

* TGL: igt@runner@aborted - fail - Previous test: gem_vm_create (isolation)
  (No new failures associated)
Comment 2 Chris Wilson 2019-09-11 07:51:32 UTC
Hmm, it looks like we dropped a pin on the rcs0->kernel_context. Bad, very bad.
Comment 4 Chris Wilson 2019-09-18 07:29:36 UTC
(In reply to Sudeep Dutt from comment #3)
> Test is passing @
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_371/fi-tgl-u/igt%40gem_vm_create%40isolation.html

It's not a causal link; the bug is unrelated to the test. It just happened to show up as we cleaned up the tgl hangs. My presumption is that we hit an error path and tried to clean up a context twice (the most fragile link there is the active pin).
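
For illustration only, the presumed failure mode can be sketched as a toy pin-count model. This is not i915 code: `ToyContext`, `cleanup_twice` and `toy_reset` are hypothetical names invented for this sketch, and it assumes the context holds one long-lived pin plus one temporary pin. It shows how an error path that releases the same pin twice can consume the long-lived pin, so that a later reset finds the context unpinned, much like the GEM_BUG_ON(!intel_context_is_pinned(ce)) in the trace above.

```python
class ToyContext:
    """Hypothetical stand-in for a pinned context (not the i915 intel_context)."""

    def __init__(self, name):
        self.name = name
        self.pin_count = 0

    def pin(self):
        self.pin_count += 1

    def unpin(self):
        if self.pin_count == 0:
            raise ValueError(f"{self.name}: unpin without a matching pin")
        self.pin_count -= 1

    def is_pinned(self):
        return self.pin_count > 0


def cleanup_twice(ce):
    """Assumed buggy error path: the same temporary pin is released twice."""
    ce.unpin()  # legitimate release of the temporary pin
    ce.unpin()  # duplicate release, which silently eats the long-lived pin


def toy_reset(ce):
    """Mimics the shape of the failing check: reset expects a pinned context."""
    if not ce.is_pinned():
        raise AssertionError(f"{ce.name} is not pinned")
    return "reset ok"


if __name__ == "__main__":
    kernel_ce = ToyContext("toy kernel context")
    kernel_ce.pin()  # long-lived pin, assumed to be held for the engine's lifetime
    kernel_ce.pin()  # temporary pin taken while the context is active
    cleanup_twice(kernel_ce)
    try:
        toy_reset(kernel_ce)
    except AssertionError as err:
        print("reset check failed:", err)
```

In this model the duplicate release itself goes unnoticed, because the count is still positive at that point; the problem only surfaces later, at an unrelated reset, which would be consistent with the bug showing up in a test that is not its cause.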

