Bug 102970

Summary: [CI][HSW] - igt@gem_busy@hang-render - SW HANG - Incomplete
Product: DRI
Reporter: Marta Löfstedt <marta.lofstedt>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: high
CC: intel-gfx-bugs
Version: DRI git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: HSW
i915 features: GEM/Other

Description Marta Löfstedt 2017-09-25 12:06:37 UTC
On CI_DRM_3126, the new IGT test igt@gem_busy@hang-render did not complete:
 igt@gem_busy@hang-render

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3126/shard-hsw4/igt@gem_busy@hang-render.html

<7>[  310.500327] [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
<3>[  311.481866] Failed to start request 14
<0>[  340.558111] watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [swapper/6:0]
<4>[  340.558137] Modules linked in: i915(+) snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek snd_hda_codec_generic coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core r8169 snd_pcm mii mei_me mei lpc_ich prime_numbers [last unloaded: i915]
<4>[  340.558210] irq event stamp: 15290295
<4>[  340.558217] hardirqs last  enabled at (15290294): [<ffffffff819107bd>] restore_regs_and_iret+0x0/0x1d
<4>[  340.558221] hardirqs last disabled at (15290295): [<ffffffff819117e5>] apic_timer_interrupt+0x95/0xa0
<4>[  340.558226] softirqs last  enabled at (11765066): [<ffffffff81085251>] _local_bh_enable+0x21/0x40
<4>[  340.558230] softirqs last disabled at (11765067): [<ffffffff81085645>] irq_exit+0xb5/0xd0
<4>[  340.558235] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G     U          4.14.0-rc1-CI-CI_DRM_3126+ #1
<4>[  340.558238] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
<4>[  340.558242] task: ffff88040d5aa9c0 task.stack: ffffc900000bc000
<4>[  340.558246] RIP: 0010:__do_softirq+0xa3/0x4e2
<4>[  340.558249] RSP: 0018:ffff88041fb83f58 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
<4>[  340.558256] RAX: 00000000ffffffff RBX: ffff88040d5aa9c0 RCX: 0000000000000000
<4>[  340.558259] RDX: 0000000000000000 RSI: ffffffff81d0ddbc RDI: ffffffff81cc1bee
<4>[  340.558263] RBP: ffff88041fb83fb8 R08: 0000000000000000 R09: 0000000000000000
<4>[  340.558266] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[  340.558270] R13: 0000000000000005 R14: ffffe8ffffd89040 R15: 00000048e1c0c0c1
<4>[  340.558273] FS:  0000000000000000(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
<4>[  340.558277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  340.558280] CR2: 000055da8e5edf88 CR3: 00000004053db000 CR4: 00000000001606e0
<4>[  340.558284] Call Trace:
<4>[  340.558287]  <IRQ>
<4>[  340.558294]  irq_exit+0xb5/0xd0
<4>[  340.558298]  smp_apic_timer_interrupt+0x9e/0x2e0
<4>[  340.558303]  apic_timer_interrupt+0x9a/0xa0
<4>[  340.558306]  </IRQ>
<4>[  340.558312] RIP: 0010:cpuidle_enter_state+0x136/0x370
<4>[  340.558315] RSP: 0018:ffffc900000bfe80 EFLAGS: 00000216 ORIG_RAX: ffffffffffffff10
<4>[  340.558322] RAX: ffff88040d5aa9c0 RBX: 000000000020fa0f RCX: 0000000000000001
<4>[  340.558325] RDX: 0000000000000000 RSI: ffffffff81d0ddbc RDI: ffffffff81cc1bee
<4>[  340.558329] RBP: ffffc900000bfeb8 R08: 0000000000000ece R09: 0000000000000018
<4>[  340.558332] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000005
<4>[  340.558336] R13: 0000000000000005 R14: ffffe8ffffd89040 R15: 00000048e1c0c0c1
<4>[  340.558346]  cpuidle_enter+0x17/0x20
<4>[  340.558351]  call_cpuidle+0x23/0x40
<4>[  340.558355]  do_idle+0x192/0x1e0
<4>[  340.558361]  cpu_startup_entry+0x1d/0x20
<4>[  340.558365]  start_secondary+0x11c/0x140
<4>[  340.558370]  secondary_startup_64+0xa5/0xa5
<4>[  340.558378] Code: 00 00 e8 11 ac 7c ff c7 45 c8 0a 00 00 00 48 89 5d a8 48 c7 c0 40 86 01 00 65 c7 00 00 00 00 00 e8 23 76 7c ff fb b8 ff ff ff ff <48> c7 45 c0 00 51 e0 81 0f bc 45 d4 83 c0 01 89 45 d0 75 6a e9 
<0>[  340.558582] Kernel panic - not syncing: softlockup: hung tasks
<4>[  340.558602] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G     U       L  4.14.0-rc1-CI-CI_DRM_3126+ #1
<4>[  340.558628] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
<4>[  340.558651] Call Trace:
<4>[  340.558661]  <IRQ>
<4>[  340.558672]  dump_stack+0x68/0x9f
<4>[  340.558686]  panic+0xd4/0x21d
<4>[  340.558702]  watchdog_timer_fn+0x289/0x290
<4>[  340.558719]  __hrtimer_run_queues+0xed/0x4d0
<4>[  340.558735]  ? __touch_watchdog+0x30/0x30
<4>[  340.558751]  hrtimer_interrupt+0xc1/0x220
<4>[  340.558768]  smp_apic_timer_interrupt+0x7d/0x2e0
<4>[  340.558784]  apic_timer_interrupt+0x9a/0xa0
<4>[  340.558799] RIP: 0010:__do_softirq+0xa3/0x4e2
<4>[  340.558814] RSP: 0018:ffff88041fb83f58 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
<4>[  340.558841] RAX: 00000000ffffffff RBX: ffff88040d5aa9c0 RCX: 0000000000000000
<4>[  340.558863] RDX: 0000000000000000 RSI: ffffffff81d0ddbc RDI: ffffffff81cc1bee
<4>[  340.558884] RBP: ffff88041fb83fb8 R08: 0000000000000000 R09: 0000000000000000
<4>[  340.558906] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[  340.558928] R13: 0000000000000005 R14: ffffe8ffffd89040 R15: 00000048e1c0c0c1
<4>[  340.558954]  ? __do_softirq+0x9d/0x4e2
<4>[  340.558971]  irq_exit+0xb5/0xd0
<4>[  340.558984]  smp_apic_timer_interrupt+0x9e/0x2e0
<4>[  340.559000]  apic_timer_interrupt+0x9a/0xa0
<4>[  340.559015]  </IRQ>
<4>[  340.559026] RIP: 0010:cpuidle_enter_state+0x136/0x370
<4>[  340.559042] RSP: 0018:ffffc900000bfe80 EFLAGS: 00000216 ORIG_RAX: ffffffffffffff10
<4>[  340.559069] RAX: ffff88040d5aa9c0 RBX: 000000000020fa0f RCX: 0000000000000001
<4>[  340.559091] RDX: 0000000000000000 RSI: ffffffff81d0ddbc RDI: ffffffff81cc1bee
<4>[  340.559113] RBP: ffffc900000bfeb8 R08: 0000000000000ece R09: 0000000000000018
<4>[  340.559134] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000005
<4>[  340.559156] R13: 0000000000000005 R14: ffffe8ffffd89040 R15: 00000048e1c0c0c1
<4>[  340.559184]  cpuidle_enter+0x17/0x20
<4>[  340.559198]  call_cpuidle+0x23/0x40
<4>[  340.559212]  do_idle+0x192/0x1e0
<4>[  340.559227]  cpu_startup_entry+0x1d/0x20
<4>[  340.559242]  start_secondary+0x11c/0x140
<4>[  340.559257]  secondary_startup_64+0xa5/0xa5
<0>[  340.559519] Kernel Offset: disabled
Comment 1 Marta Löfstedt 2017-09-25 13:16:32 UTC
<marta_> Adrinael, you mentioned something about being wrong testlist for shards on CI_DRM_3026, could you elaborate. I have already filed bugs for this run...
<Adrinael> CI_DRM_3126
<Adrinael> It was running everything ever on accident
<Adrinael> ivyl, ^ right?
<marta_> but it was only 3 new drv_selftests and 3 new gem tests, for sure we have more than that blacklisted
<ivyl> yep, due to the elaborate nature of the deployment method, and streamlining it to use just "make install", an inevitable error occurred on the human-Jenkins boundary.
<ivyl> marta_: it run with ALL ALL, but it got cancelled pretty quickly
<ivyl> and then rerun properly
<ivyl> what you see is the merge of both
<Adrinael> tools_test@* got "broken" by make install -deployment btw
<ivyl> as jenkins haven't cleaned staging area for results
<Adrinael> marta_, if you file a bug on igt@tools_test@tools_test, make it an IGT bug
<ivyl> so sorry about confusion, it wasn't intended and I hoped the rerun will fix it
<ivyl> but as you can see we have the few leftovers
<marta_> OK, I will archive if needed when I get results from the next run.
<ivyl> results from -27 already came in and they look clean, we also should have results for -28 in half an hour or so
<marta_> dolphin, disregard my comments on BUGS: 102970 and 102971 ^^. These tests should not have been part of CI_DRM_3126, I will archive the bugs from cibuglog, but of course we will keep them in fdo.
Comment 2 Chris Wilson 2017-09-25 13:29:31 UTC
Nothing here indicts gem_busy.

*** This bug has been marked as a duplicate of bug 102973 ***
