Bug 108343 - [CI][DRMTIP] igt@kms_busy@extended-pageflip-hang-newfb-render-f - incomplete - GEM_BUG_ON(!intel_engine_is_idle(engine))
Summary: [CI][DRMTIP] igt@kms_busy@extended-pageflip-hang-newfb-render-f - incomplete ...
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 108367
Depends on:
Blocks:
 
Reported: 2018-10-12 14:32 UTC by Martin Peres
Modified: 2018-12-04 20:46 UTC (History)

See Also:
i915 platform: BDW, ICL
i915 features:


Description Martin Peres 2018-10-12 14:32:05 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_125/fi-icl-u2/igt@kms_busy@extended-pageflip-hang-newfb-render-f.html

<3> [64.034928] reset_all_global_seqno:147 GEM_BUG_ON(!intel_engine_is_idle(engine))
<4> [64.035044] ------------[ cut here ]------------
<2> [64.035046] kernel BUG at drivers/gpu/drm/i915/i915_request.c:147!
<4> [64.035054] invalid opcode: 0000 [#1] PREEMPT SMP PTI
<4> [64.035058] CPU: 3 PID: 949 Comm: kms_busy Tainted: G     U  W         4.19.0-rc7-g6c3870cc0454-drmtip_125+ #1
<4> [64.035061] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.2352.A01.1808281852 08/28/2018
<4> [64.035096] RIP: 0010:reset_all_global_seqno.part.5+0x1c5/0x260 [i915]
<4> [64.035099] Code: 7a 99 d3 c8 48 8b 35 7a ff 1b 00 49 c7 c0 cf d2 4c c0 b9 93 00 00 00 48 c7 c2 e0 38 4b c0 48 c7 c7 a0 44 3c c0 e8 9b 28 da c8 <0f> 0b 48 c7 c1 c0 73 4e c0 ba 94 00 00 00 48 c7 c6 e0 38 4b c0 48
<4> [64.035101] RSP: 0018:ffff9e15c09fbd78 EFLAGS: 00010286
<4> [64.035104] RAX: 000000000000000f RBX: ffff8a96d1aca158 RCX: 0000000000000000
<4> [64.035106] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff8a96eea40ff8
<4> [64.035108] RBP: ffff8a96d57477c8 R08: 000000000000df37 R09: ffff8a96eebf8000
<4> [64.035110] R10: 0000000000000000 R11: ffff8a96eea40ff8 R12: 0000000000000000
<4> [64.035112] R13: ffff8a96d5740000 R14: ffff8a96d57477e8 R15: ffffffffc03c435c
<4> [64.035114] FS:  00007fc5b105b980(0000) GS:ffff8a96f0780000(0000) knlGS:0000000000000000
<4> [64.035117] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [64.035119] CR2: 000055e88b412cb8 CR3: 00000004a86ba006 CR4: 0000000000760ee0
<4> [64.035121] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [64.035122] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [64.035124] PKRU: 55555554
<4> [64.035126] Call Trace:
<4> [64.035154]  i915_next_seqno_set+0x33/0x60 [i915]
<4> [64.035160]  simple_attr_write+0xb0/0xd0
<4> [64.035165]  full_proxy_write+0x51/0x80
<4> [64.035169]  __vfs_write+0x31/0x180
<4> [64.035172]  ? rcu_lockdep_current_cpu_online+0x8f/0xd0
<4> [64.035175]  ? rcu_read_lock_sched_held+0x6f/0x80
<4> [64.035178]  ? rcu_sync_lockdep_assert+0x29/0x50
<4> [64.035180]  ? __sb_start_write+0x152/0x1f0
<4> [64.035183]  ? __sb_start_write+0x168/0x1f0
<4> [64.035186]  vfs_write+0xbd/0x1b0
<4> [64.035189]  ksys_write+0x50/0xc0
<4> [64.035193]  do_syscall_64+0x55/0x190
<4> [64.035197]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [64.035200] RIP: 0033:0x7fc5b07db281
<4> [64.035202] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [64.035204] RSP: 002b:00007ffeab63d1d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [64.035207] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc5b07db281
<4> [64.035209] RDX: 0000000000000001 RSI: 00007fc5b0c5769a RDI: 0000000000000009
<4> [64.035211] RBP: 00007ffeab63d200 R08: 0000000000000000 R09: 0000000000000022
<4> [64.035213] R10: 0000000000000000 R11: 0000000000000246 R12: 000055e88b1da930
<4> [64.035215] R13: 00007ffeab63df20 R14: 0000000000000000 R15: 0000000000000000
<4> [64.035219] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep btusb btrtl snd_hda_core btbcm btintel cdc_ether e1000e usbnet snd_pcm mii bluetooth ecdh_generic prime_numbers
<0> [64.035250] Dumping ftrace buffer:
<0> [64.035252] ---------------------------------
Comment 1 Chris Wilson 2018-10-15 11:11:00 UTC
*** Bug 108367 has been marked as a duplicate of this bug. ***
Comment 2 Chris Wilson 2018-10-15 11:12:10 UTC
Notably, in both reports it is an idle vecs that explodes. Very, very suspicious hw behaviour.
Comment 3 Chris Wilson 2018-10-15 19:35:23 UTC
Possibly?

commit 9d3eb2c33f03432a25a6a3ab3177f839f25cbaf5 (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Oct 15 12:58:56 2018 +0100

    drm/i915: Hold rpm wakeref for debugfs/i915_drop_caches_set
    
    Since we peek into HW state and poke around, it behoves us to acquire a
    runtime pm wakeref beforehand.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=108343
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108364
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20181015115856.18590-1-chris@chris-wilson.co.uk
Comment 4 Lakshmi 2018-10-23 13:13:43 UTC
Update: This issue was last seen in drmtip_125 (1 week, 4 days / 117 runs ago) and has occurred only once. We need to wait 2 more weeks before closing this bug.
Comment 5 Martin Peres 2018-11-12 09:40:49 UTC
Still seen in BAT...

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5117/fi-icl-u2/igt@gem_exec_reloc@basic-write-read.html

<3> [115.788834] reset_all_global_seqno:149 GEM_BUG_ON(!intel_engine_is_idle(engine))
<4> [115.788983] ------------[ cut here ]------------
<2> [115.788986] kernel BUG at drivers/gpu/drm/i915/i915_request.c:149!
<4> [115.789012] invalid opcode: 0000 [#1] PREEMPT SMP PTI
<4> [115.789021] CPU: 2 PID: 2181 Comm: gem_exec_reloc Tainted: G     U  W         4.20.0-rc1-CI-CI_DRM_5117+ #1
<4> [115.789030] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.2402.AD3.1810170014 10/17/2018
<4> [115.789083] RIP: 0010:reset_all_global_seqno.part.5+0x1d3/0x220 [i915]
<4> [115.789091] Code: ec a6 e2 e0 48 8b 35 64 45 1c 00 49 c7 c0 e4 d1 3d a0 b9 95 00 00 00 48 c7 c2 80 35 3c a0 48 c7 c7 9e 1a 2d a0 e8 3d 2d e9 e0 <0f> 0b 48 c7 c1 20 75 3f a0 ba 96 00 00 00 48 c7 c6 80 35 3c a0 48
<4> [115.789119] RSP: 0018:ffffc90002733d70 EFLAGS: 00010282
<4> [115.789129] RAX: 000000000000000f RBX: ffff880494f62158 RCX: 0000000000000000
<4> [115.789140] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff8804ae2574e8
<4> [115.789153] RBP: ffff8804946877c0 R08: 000000000008a268 R09: ffff8804ae396000
<4> [115.789162] R10: 0000000000000000 R11: ffff8804ae2574e8 R12: 0000000000000000
<4> [115.789172] R13: ffff880494680000 R14: ffff8804946877e0 R15: ffffffffa02d193e
<4> [115.789183] FS:  00007f7408c10980(0000) GS:ffff8804aff00000(0000) knlGS:0000000000000000
<4> [115.789195] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [115.789204] CR2: 00007fffa6305e58 CR3: 0000000494d50005 CR4: 0000000000760ee0
<4> [115.789212] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [115.789219] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [115.789226] PKRU: 55555554
<4> [115.789230] Call Trace:
<4> [115.789274]  i915_drop_caches_set+0x261/0x270 [i915]
<4> [115.789286]  simple_attr_write+0xb0/0xd0
<4> [115.789297]  full_proxy_write+0x52/0x90
<4> [115.789307]  __vfs_write+0x31/0x180
<4> [115.789315]  ? rcu_read_lock_sched_held+0x6f/0x80
<4> [115.789321]  ? rcu_sync_lockdep_assert+0x29/0x50
<4> [115.789328]  ? __sb_start_write+0x152/0x1f0
<4> [115.789334]  ? __sb_start_write+0x163/0x1f0
<4> [115.789340]  vfs_write+0xbd/0x1b0
<4> [115.789346]  ksys_write+0x50/0xc0
<4> [115.789353]  do_syscall_64+0x55/0x190
<4> [115.789360]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [115.789367] RIP: 0033:0x7f74085a0281
<4> [115.789372] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [115.789389] RSP: 002b:00007fffa6309178 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [115.789397] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f74085a0281
<4> [115.789404] RDX: 0000000000000005 RSI: 00007fffa6309200 RDI: 0000000000000007
<4> [115.789411] RBP: 00007fffa63091a0 R08: 0000000000000000 R09: 0000000000000000
<4> [115.789418] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7408589718
<4> [115.789425] R13: 0000000000000003 R14: 00007f740858e628 R15: 00007f740858ad80
<4> [115.789435] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul btusb ghash_clmulni_intel snd_hda_intel btrtl snd_hda_codec btbcm btintel snd_hwdep snd_hda_core snd_pcm bluetooth e1000e cdc_ether usbnet mii ecdh_generic prime_numbers
Comment 6 Martin Peres 2018-11-13 14:43:28 UTC
Also seen on BDW: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_139/fi-bdw-gvtdvm/igt@gem_exec_whisper@normal.html

<3> [55.538979] reset_all_global_seqno:149 GEM_BUG_ON(!intel_engine_is_idle(engine))
<4> [55.539092] ------------[ cut here ]------------
<2> [55.539095] kernel BUG at drivers/gpu/drm/i915/i915_request.c:149!
<4> [55.539115] invalid opcode: 0000 [#1] PREEMPT SMP PTI
<4> [55.539122] CPU: 0 PID: 986 Comm: gem_exec_whispe Tainted: G     U            4.20.0-rc1-gb3838255012c-drmtip_139+ #1
<4> [55.539132] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
<4> [55.539184] RIP: 0010:reset_all_global_seqno.part.5+0x1d3/0x220 [i915]
<4> [55.539192] Code: ec e1 c8 ea 48 8b 35 4c 46 1c 00 49 c7 c0 e4 e1 57 c0 b9 95 00 00 00 48 c7 c2 80 45 56 c0 48 c7 c7 de 29 47 c0 e8 4d 68 cf ea <0f> 0b 48 c7 c1 18 85 59 c0 ba 96 00 00 00 48 c7 c6 80 45 56 c0 48
<4> [55.539208] RSP: 0018:ffff9acb80a1fa60 EFLAGS: 00010286
<4> [55.539214] RAX: 000000000000000f RBX: ffff97146f460008 RCX: 0000000000000000
<4> [55.539221] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff97147d427a78
<4> [55.539229] RBP: ffff97146f4077a0 R08: 00000000004d6045 R09: ffff97147d42c000
<4> [55.539236] R10: 0000000000000000 R11: ffff97147d427a78 R12: 0000000000000000
<4> [55.539243] R13: ffff97146f400000 R14: ffff97146f4077d8 R15: ffffffffc047287e
<4> [55.539251] FS:  00007f9894a49980(0000) GS:ffff97147da00000(0000) knlGS:0000000000000000
<4> [55.539259] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [55.539265] CR2: 000055a482f11018 CR3: 0000000079a38001 CR4: 00000000003606f0
<4> [55.539274] Call Trace:
<4> [55.539316]  i915_request_alloc+0x4a6/0x7e0 [i915]
<4> [55.539355]  i915_gem_do_execbuffer+0x71a/0x1580 [i915]
<4> [55.539364]  ? deactivate_slab.isra.26+0x74b/0x7a0
<4> [55.539374]  ? ___slab_alloc.constprop.34+0x21c/0x380
<4> [55.539381]  ? ___slab_alloc.constprop.34+0x21c/0x380
<4> [55.539416]  ? i915_gem_execbuffer2_ioctl+0xc4/0x3f0 [i915]
<4> [55.539426]  ? lock_acquire+0xa6/0x1c0
<4> [55.539433]  ? __might_fault+0x38/0x90
<4> [55.539468]  ? i915_gem_execbuffer_ioctl+0x300/0x300 [i915]
<4> [55.539502]  i915_gem_execbuffer2_ioctl+0x21b/0x3f0 [i915]
<4> [55.539538]  ? i915_gem_execbuffer_ioctl+0x300/0x300 [i915]
<4> [55.539547]  drm_ioctl_kernel+0x81/0xf0
<4> [55.539554]  drm_ioctl+0x2de/0x390
<4> [55.539586]  ? i915_gem_execbuffer_ioctl+0x300/0x300 [i915]
<4> [55.539595]  ? _raw_spin_unlock_irq+0x24/0x50
<4> [55.539602]  ? lockdep_hardirqs_on+0xe0/0x1b0
<4> [55.539609]  do_vfs_ioctl+0xa0/0x6e0
<4> [55.539616]  ? __schedule+0x36c/0xb50
<4> [55.539622]  ksys_ioctl+0x35/0x60
<4> [55.539628]  __x64_sys_ioctl+0x11/0x20
<4> [55.539634]  do_syscall_64+0x55/0x190
<4> [55.539640]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [55.539646] RIP: 0033:0x7f98940ef5d7
<4> [55.539651] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4> [55.539668] RSP: 002b:00007ffefad6ea98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4> [55.539676] RAX: ffffffffffffffda RBX: 00007ffefad85080 RCX: 00007f98940ef5d7
<4> [55.539684] RDX: 00007ffefad6ec90 RSI: 0000000040406469 RDI: 0000000000000005
<4> [55.539691] RBP: 00007ffefad6ec90 R08: 00007f98943c4230 R09: 00007f98943c4240
<4> [55.539698] R10: 00000000ffffffe2 R11: 0000000000000246 R12: 0000000040406469
<4> [55.539705] R13: 0000000000000005 R14: 0000000000000000 R15: 0000000000000000
<4> [55.539715] Modules linked in: i915 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000 prime_numbers i2c_piix4
Comment 7 Martin Peres 2018-11-13 14:46:43 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_139/fi-icl-u2/igt@gem_eio@in-flight-internal-1us.html

<4> [361.723249] ------------[ cut here ]------------
<2> [361.723252] kernel BUG at drivers/gpu/drm/i915/i915_request.c:149!
<4> [361.723275] invalid opcode: 0000 [#1] PREEMPT SMP PTI
<4> [361.723283] CPU: 1 PID: 1120 Comm: gem_eio Tainted: G     U  W         4.20.0-rc1-gb3838255012c-drmtip_139+ #1
<4> [361.723297] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.2402.AD3.1810170014 10/17/2018
<4> [361.723350] RIP: 0010:reset_all_global_seqno.part.5+0x1d3/0x220 [i915]
<4> [361.723358] Code: ec c1 e4 ca 48 8b 35 4c 46 1c 00 49 c7 c0 e4 01 3c c0 b9 95 00 00 00 48 c7 c2 80 65 3a c0 48 c7 c7 de 49 2b c0 e8 4d 48 eb ca <0f> 0b 48 c7 c1 18 a5 3d c0 ba 96 00 00 00 48 c7 c6 80 65 3a c0 48
<4> [361.723375] RSP: 0018:ffffb897802abd78 EFLAGS: 00010286
<4> [361.723382] RAX: 000000000000000f RBX: ffff948154e82158 RCX: 0000000000000000
<4> [361.723389] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff94816e257a38
<4> [361.723397] RBP: ffff948154e977b8 R08: 000000000001436c R09: ffff94816e37c000
<4> [361.723404] R10: 0000000000000000 R11: ffff94816e257a38 R12: 0000000000000000
<4> [361.723411] R13: ffff948154e90000 R14: ffff948154e977d8 R15: ffffffffc02b487e
<4> [361.723419] FS:  00007f19f6c1c980(0000) GS:ffff94816fe80000(0000) knlGS:0000000000000000
<4> [361.723428] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [361.723435] CR2: 00007ffcb09a0ff8 CR3: 00000004a7974003 CR4: 0000000000760ee0
<4> [361.723442] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [361.723450] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [361.723457] PKRU: 55555554
<4> [361.723461] Call Trace:
<4> [361.723492]  i915_next_seqno_set+0x33/0x60 [i915]
<4> [361.723502]  simple_attr_write+0xb0/0xd0
<4> [361.723510]  full_proxy_write+0x52/0x90
<4> [361.723517]  __vfs_write+0x31/0x180
<4> [361.723524]  ? rcu_read_lock_sched_held+0x6f/0x80
<4> [361.723530]  ? rcu_sync_lockdep_assert+0x29/0x50
<4> [361.723537]  ? __sb_start_write+0x152/0x1f0
<4> [361.723543]  ? __sb_start_write+0x163/0x1f0
<4> [361.723550]  vfs_write+0xbd/0x1b0
<4> [361.723556]  ksys_write+0x50/0xc0
<4> [361.723563]  do_syscall_64+0x55/0x190
<4> [361.723570]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [361.723577] RIP: 0033:0x7f19f6193281
<4> [361.723582] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [361.723600] RSP: 002b:00007ffcb09a29b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [361.723608] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f19f6193281
<4> [361.723616] RDX: 0000000000000001 RSI: 00007f19f6817d7a RDI: 0000000000000009
<4> [361.723623] RBP: 00007ffcb09a29e0 R08: 0000000000000000 R09: 0000000000000022
<4> [361.723631] R10: 0000000000000000 R11: 0000000000000246 R12: 000055d692f45c50
<4> [361.723638] R13: 00007ffcb09a32e0 R14: 0000000000000000 R15: 0000000000000000
<4> [361.723648] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec btusb btrtl btbcm btintel snd_hwdep e1000e snd_hda_core bluetooth snd_pcm cdc_ether usbnet mii ecdh_generic prime_numbers
Comment 8 Chris Wilson 2018-12-04 20:46:10 UTC
Bug was unfortunately hijacked by a separate issue.
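
The oopses pasted in this report all hit the same GEM_BUG_ON in reset_all_global_seqno, but they reach it through different callers (i915_next_seqno_set, i915_drop_caches_set, i915_request_alloc). As a rough aid for telling such reports apart, the following is a minimal sketch that lists the first reliable frame after each "Call Trace:" line in a saved dmesg excerpt. The function name first_callers and the regular expressions are illustrative assumptions, not part of any CI tooling, and the line format is assumed to match the logs quoted above.

```python
import re

# Assumed line shape, taken from the logs quoted in this report, e.g.
# "<4> [64.035154]  i915_next_seqno_set+0x33/0x60 [i915]"
PREFIX = r"^<\d>\s*\[\s*\d+\.\d+\]\s+"
FRAME_RE = re.compile(PREFIX + r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
TRACE_START_RE = re.compile(PREFIX + r"Call Trace:")


def first_callers(log_text):
    """Return the first non-'?' frame following each 'Call Trace:' line.

    Frames prefixed with '?' are unreliable stack entries and are skipped.
    """
    callers = []
    awaiting_frame = False
    for line in log_text.splitlines():
        if TRACE_START_RE.match(line):
            awaiting_frame = True
            continue
        if awaiting_frame:
            match = FRAME_RE.match(line)
            if match and match.group(1) is None:
                callers.append(match.group(2))
                awaiting_frame = False
    return callers


if __name__ == "__main__":
    sample = "\n".join([
        "<4> [64.035126] Call Trace:",
        "<4> [64.035154]  i915_next_seqno_set+0x33/0x60 [i915]",
        "<4> [64.035160]  simple_attr_write+0xb0/0xd0",
        "<4> [55.539274] Call Trace:",
        "<4> [55.539316]  i915_request_alloc+0x4a6/0x7e0 [i915]",
    ])
    for caller in first_callers(sample):
        print(caller)
```

Running it over the excerpts above would show at a glance which reports enter through debugfs writes and which through execbuffer, which may help when splitting unrelated failures into separate bugs.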

