Bug 100377 - [EXT][BSW] IGT pm_rc6_residency@media-rc6-accuracy soft hang
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2017-03-24 13:36 UTC by Tomi Sarvela
Modified: 2017-07-25 12:45 UTC

i915 platform: BSW/CHT
i915 features: GPU hang



Description Tomi Sarvela 2017-03-24 13:36:26 UTC
Today's DRM-Tip build, kconfig/debug (or close to it):
https://intel-gfx-ci.01.org/CI/CI_DRM_2395/kernel.config.bz2

Dmesg courtesy of pstore.

IGT was running the fast-feedback and extended lists in a special order (GEMs last).

Tomi



<6>[ 5413.934513] Console: switching to colour frame buffer device 240x67
<6>[ 5414.266274] Console: switching to colour dummy device 80x25
<14>[ 5414.267793] [IGT] pm_rc6_residency: executing
<0>[ 5452.476159] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [thermald:478]
<4>[ 5452.476232] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep intel_powerclamp coretemp crct10dif_pclmul snd_hda_core crc32_pclmul ghash_clmulni_intel snd_pcm r8169 mii lpc_ich i2c_designware_pci sdhci_pci prime_numbers i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915]
<4>[ 5452.476369] irq event stamp: 561500
<4>[ 5452.476381] hardirqs last  enabled at (561499): [<ffffffff81892b73>] restore_regs_and_iret+0x0/0x1d
<4>[ 5452.476389] hardirqs last disabled at (561500): [<ffffffff81892dfb>] apic_timer_interrupt+0x8b/0xa0
<4>[ 5452.476397] softirqs last  enabled at (506300): [<ffffffff81085c79>] __do_softirq+0x1d9/0x4c0
<4>[ 5452.476404] softirqs last disabled at (506283): [<ffffffff810860d9>] irq_exit+0xa9/0xc0
<4>[ 5452.476412] CPU: 1 PID: 478 Comm: thermald Tainted: G     U  W       4.11.0-rc3-CI-CI_DRM_317+ #1
<4>[ 5452.476418] Hardware name:                  /NUC5CPYB, BIOS PYBSWCEL.86A.0058.2016.1102.1842 11/02/2016
<4>[ 5452.476424] task: ffff88016a688040 task.stack: ffffc90000618000
<4>[ 5452.476432] RIP: 0010:smp_call_function_single+0x89/0x140
<4>[ 5452.476438] RSP: 0018:ffffc9000061bca0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
<4>[ 5452.476449] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000006
<4>[ 5452.476454] RDX: 0000000000000001 RSI: ffffffff81ca74d3 RDI: ffffffff81c82cb8
<4>[ 5452.476460] RBP: ffffc9000061bce0 R08: 0000000000000000 R09: 0000000000000001
<4>[ 5452.476465] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff814a0e00
<4>[ 5452.476471] R13: ffffc9000061bcf0 R14: 0000000000000001 R15: 0000000000000001
<4>[ 5452.476477] FS:  00007fb07ae97700(0000) GS:ffff88017fd00000(0000) knlGS:0000000000000000
<4>[ 5452.476483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 5452.476488] CR2: 00007ff16d9a62b8 CR3: 0000000172cf9000 CR4: 00000000001006e0
<4>[ 5452.476494] Call Trace:
<4>[ 5452.476505]  ? wrmsr_safe_regs_on_cpu+0x30/0x30
<4>[ 5452.476516]  rdmsr_on_cpu+0x49/0x60
<4>[ 5452.476530]  show_temp+0x86/0xb0 [coretemp]
<4>[ 5452.476542]  dev_attr_show+0x1b/0x50
<4>[ 5452.476549]  ? sysfs_file_ops+0x41/0x60
<4>[ 5452.476557]  sysfs_kf_seq_show+0xbc/0x110
<4>[ 5452.476568]  kernfs_seq_show+0x22/0x30
<4>[ 5452.476576]  seq_read+0xf2/0x3d0
<4>[ 5452.476589]  kernfs_fop_read+0x124/0x1a0
<4>[ 5452.476601]  __vfs_read+0x23/0x110
<4>[ 5452.476612]  ? __fget+0x108/0x200
<4>[ 5452.476619]  ? expand_files+0x2b0/0x2b0
<4>[ 5452.476629]  vfs_read+0xa0/0x170
<4>[ 5452.476638]  SyS_read+0x44/0xb0
<4>[ 5452.476650]  entry_SYSCALL_64_fastpath+0x1c/0xb1
<4>[ 5452.476656] RIP: 0033:0x7fb07f08b51d
<4>[ 5452.476661] RSP: 002b:00007fb07ae956d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
<4>[ 5452.476672] RAX: ffffffffffffffda RBX: ffffffff8147ed53 RCX: 00007fb07f08b51d
<4>[ 5452.476678] RDX: 0000000000001fff RSI: 00007fb074000c90 RDI: 0000000000000009
<4>[ 5452.476683] RBP: ffffc9000061bf88 R08: 00007fb0740000d8 R09: 0000000000000000
<4>[ 5452.476689] R10: 00007fb074000078 R11: 0000000000000293 R12: 00007fb074000930
<4>[ 5452.476694] R13: 00007fb07ae958c8 R14: 00007fb074000c90 R15: 00007fb07ae95860
<4>[ 5452.476705]  ? __this_cpu_preempt_check+0x13/0x20
<4>[ 5452.476719] Code: c0 74 07 9c 58 f6 c4 02 74 4d 45 85 f6 74 75 48 8d 75 c0 89 df 4c 89 e9 4c 89 e2 e8 62 fe ff ff 89 c3 8b 45 d8 a8 01 74 0a f3 90 <8b> 55 d8 83 e2 01 75 f6 bf 01 00 00 00 e8 35 cd f8 ff 65 8b 05 
<0>[ 5452.477040] Kernel panic - not syncing: softlockup: hung tasks
<4>[ 5452.477093] CPU: 1 PID: 478 Comm: thermald Tainted: G     U  W    L  4.11.0-rc3-CI-CI_DRM_317+ #1
<4>[ 5452.477165] Hardware name:                  /NUC5CPYB, BIOS PYBSWCEL.86A.0058.2016.1102.1842 11/02/2016
<4>[ 5452.477241] Call Trace:
<4>[ 5452.477267]  <IRQ>
<4>[ 5452.477292]  dump_stack+0x67/0x92
<4>[ 5452.477327]  panic+0xcf/0x205
<4>[ 5452.477366]  watchdog_timer_fn+0x28c/0x2a0
<4>[ 5452.477407]  ? __touch_watchdog+0x30/0x30
<4>[ 5452.477446]  __hrtimer_run_queues+0xf3/0x530
<4>[ 5452.477491]  hrtimer_interrupt+0xb9/0x210
<4>[ 5452.477531]  ? wrmsr_safe_regs_on_cpu+0x30/0x30
<4>[ 5452.477576]  local_apic_timer_interrupt+0x31/0x50
<4>[ 5452.477619]  smp_apic_timer_interrupt+0x33/0x50
<4>[ 5452.477662]  apic_timer_interrupt+0x90/0xa0
<4>[ 5452.477701] RIP: 0010:smp_call_function_single+0x89/0x140
<4>[ 5452.477747] RSP: 0018:ffffc9000061bca0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
<4>[ 5452.477818] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000006
<4>[ 5452.477879] RDX: 0000000000000001 RSI: ffffffff81ca74d3 RDI: ffffffff81c82cb8
<4>[ 5452.477940] RBP: ffffc9000061bce0 R08: 0000000000000000 R09: 0000000000000001
<4>[ 5452.478001] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff814a0e00
<4>[ 5452.478061] R13: ffffc9000061bcf0 R14: 0000000000000001 R15: 0000000000000001
<4>[ 5452.478123]  </IRQ>
<4>[ 5452.478151]  ? wrmsr_safe_regs_on_cpu+0x30/0x30
<4>[ 5452.478205]  ? wrmsr_safe_regs_on_cpu+0x30/0x30
<4>[ 5452.478253]  rdmsr_on_cpu+0x49/0x60
<4>[ 5452.478295]  show_temp+0x86/0xb0 [coretemp]
<4>[ 5452.478340]  dev_attr_show+0x1b/0x50
<4>[ 5452.478377]  ? sysfs_file_ops+0x41/0x60
<4>[ 5452.478417]  sysfs_kf_seq_show+0xbc/0x110
<4>[ 5452.478461]  kernfs_seq_show+0x22/0x30
<4>[ 5452.478500]  seq_read+0xf2/0x3d0
<4>[ 5452.478541]  kernfs_fop_read+0x124/0x1a0
<4>[ 5452.478584]  __vfs_read+0x23/0x110
<4>[ 5452.478624]  ? __fget+0x108/0x200
<4>[ 5452.478659]  ? expand_files+0x2b0/0x2b0
<4>[ 5452.478701]  vfs_read+0xa0/0x170
<4>[ 5452.478739]  SyS_read+0x44/0xb0
<4>[ 5452.478777]  entry_SYSCALL_64_fastpath+0x1c/0xb1
<4>[ 5452.478821] RIP: 0033:0x7fb07f08b51d
<4>[ 5452.478857] RSP: 002b:00007fb07ae956d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
<4>[ 5452.478929] RAX: ffffffffffffffda RBX: ffffffff8147ed53 RCX: 00007fb07f08b51d
<4>[ 5452.478990] RDX: 0000000000001fff RSI: 00007fb074000c90 RDI: 0000000000000009
<4>[ 5452.479051] RBP: ffffc9000061bf88 R08: 00007fb0740000d8 R09: 0000000000000000
<4>[ 5452.479111] R10: 00007fb074000078 R11: 0000000000000293 R12: 00007fb074000930
<4>[ 5452.479172] R13: 00007fb07ae958c8 R14: 00007fb074000c90 R15: 00007fb07ae95860
<4>[ 5452.479237]  ? __this_cpu_preempt_check+0x13/0x20
<0>[ 5453.596897] Shutting down cpus with NMI
<0>[ 5453.596949] Kernel Offset: disabled
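
When triaging a pstore dump like the one above, the soft-lockup report can be picked out mechanically. The following is only a minimal sketch, not part of any CI tooling: the function name `find_soft_lockups` and the regular expressions are illustrative assumptions based on the line format seen in this log (`<level>[ seconds] message`).

```python
import re

# Assumed line layout, taken from the log above: "<level>[ seconds.micros] message".
LINE_RE = re.compile(r"^<(\d+)>\[\s*(\d+\.\d+)\]\s(.*)$")
# Watchdog report as it appears above, e.g.
# "... soft lockup - CPU#1 stuck for 23s! [thermald:478]".
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]")


def find_soft_lockups(lines):
    """Return one dict per soft-lockup report found in the given log lines.

    Hypothetical helper for illustration; lines that do not look like
    kernel log lines are skipped.
    """
    reports = []
    for raw in lines:
        m = LINE_RE.match(raw.rstrip("\n"))
        if not m:
            continue  # not a kernel log line (e.g. bug-tracker prose)
        level, stamp, message = m.groups()
        hit = LOCKUP_RE.search(message)
        if hit:
            cpu, seconds, comm, pid = hit.groups()
            reports.append({
                "time": float(stamp),
                "level": int(level),
                "cpu": int(cpu),
                "stuck_seconds": int(seconds),
                "comm": comm,
                "pid": int(pid),
            })
    return reports


if __name__ == "__main__":
    sample = [
        "<14>[ 5414.267793] [IGT] pm_rc6_residency: executing",
        "<0>[ 5452.476159] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [thermald:478]",
    ]
    for report in find_soft_lockups(sample):
        print(report)
```

Run against the log in this report, it should flag the single watchdog line for thermald on CPU#1; the call trace that follows would still need to be read by hand.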
Comment 1 Elizabeth 2017-06-23 21:53:51 UTC
Hello Tomi, 
Is there any update on this case? Any extra information or logs, or any HW or SW that has changed? Thank you.
Comment 2 Ricardo 2017-07-25 12:45:24 UTC
This test was executed this week and is now passing, so I am closing the bug as fixed.

