Bug 112000 - [CI][SHARDS] igt@gem_eio@kms - dmesg-warn - NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7b/0x120
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium minor
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-14 12:41 UTC by Lakshmi
Modified: 2019-11-04 19:35 UTC

See Also:
i915 platform: SNB
i915 features: display/Other


Description Lakshmi 2019-10-14 12:41:36 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7062/shard-snb5/igt@gem_eio@kms.html
<3> [1164.891754] rcu: INFO: rcu_preempt self-detected stall on CPU
<3> [1164.891767] rcu: 	4-...!: (1 ticks this GP) idle=f0a/0/0x1 softirq=164703/164703 fqs=0 
<4> [1164.891780] 	(t=91152 jiffies g=294577 q=209)
<3> [1164.891788] rcu: rcu_preempt kthread starved for 91152 jiffies! g294577 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
<3> [1164.891803] rcu: RCU grace-period kthread stack dump:
<6> [1164.891811] rcu_preempt     I14368    10      2 0x80004000
<4> [1164.891823] Call Trace:
<4> [1164.891833]  ? __schedule+0x2ef/0x7f0
<4> [1164.891843]  schedule+0x34/0xc0
<4> [1164.891850]  schedule_timeout+0x1cc/0x3f0
<4> [1164.891862]  ? __next_timer_interrupt+0xc0/0xc0
<4> [1164.891872]  rcu_gp_kthread+0x604/0xb70
<4> [1164.891882]  ? rcu_barrier_func+0xa0/0xa0
<4> [1164.891891]  kthread+0x119/0x130
<4> [1164.891898]  ? kthread_park+0x80/0x80
<4> [1164.891907]  ret_from_fork+0x3a/0x50
<6> [1164.891921] Sending NMI from CPU 4 to CPUs 0:
<4> [1164.892008] NMI backtrace for cpu 0
<4> [1164.892009] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G     U            5.4.0-rc2-CI-CI_DRM_7062+ #1
<4> [1164.892009] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
<4> [1164.892010] RIP: 0010:lock_acquire+0xa7/0x1c0
<4> [1164.892011] Code: 24 6a 00 45 89 f0 6a 00 ff 74 24 58 44 89 e9 41 57 44 89 e2 89 ee 49 c1 e9 09 48 89 df 49 83 f1 01 41 83 e1 01 e8 69 08 00 00 <65> 48 8b 04 25 00 5f 01 00 c7 80 8c 08 00 00 00 00 00 00 ff 74 24
<4> [1164.892012] RSP: 0018:ffffc90000057a00 EFLAGS: 00000082
<4> [1164.892013] RAX: 0000000000000000 RBX: ffff888226985a58 RCX: 0000000000000000
<4> [1164.892013] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
<4> [1164.892014] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
<4> [1164.892014] R10: 0000000000000000 R11: ffff88820ee58000 R12: 0000000000000000
<4> [1164.892015] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
<4> [1164.892016] FS:  0000000000000000(0000) GS:ffff888227800000(0000) knlGS:0000000000000000
<4> [1164.892016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [1164.892017] CR2: 00007faf9f696000 CR3: 0000000002210002 CR4: 00000000000606f0
<4> [1164.892017] Call Trace:
<4> [1164.892018]  ? free_debug_processing+0x37/0x380
<4> [1164.892018]  _raw_spin_lock_irqsave+0x33/0x50
<4> [1164.892019]  ? free_debug_processing+0x37/0x380
<4> [1164.892019]  free_debug_processing+0x37/0x380
<4> [1164.892020]  ? arp_process.constprop.24+0x18a/0x9c0
<4> [1164.892020]  __slab_free+0x35b/0x520
<4> [1164.892021]  ? debug_check_no_obj_freed+0x11d/0x210
<4> [1164.892021]  ? kmem_cache_free+0x31f/0x390
<4> [1164.892022]  ? arp_process.constprop.24+0x18a/0x9c0
<4> [1164.892023]  kmem_cache_free+0x31f/0x390
<4> [1164.892023]  arp_process.constprop.24+0x18a/0x9c0
<4> [1164.892024]  ? __netif_receive_skb_core+0x24a/0x9f0
<4> [1164.892024]  __netif_receive_skb_one_core+0x80/0x90
<4> [1164.892025]  netif_receive_skb_internal+0x69/0x1f0
<4> [1164.892025]  napi_gro_receive+0x1d3/0x280
<4> [1164.892026]  tg3_poll_work+0x976/0xf60 [tg3]
<4> [1164.892026]  ? scsi_end_request+0x162/0x360
<4> [1164.892027]  tg3_poll+0x6d/0x3a0 [tg3]
<4> [1164.892027]  net_rx_action+0x157/0x490
<4> [1164.892028]  __do_softirq+0xdf/0x47f
<4> [1164.892028]  ? smpboot_thread_fn+0x23/0x280
<4> [1164.892029]  ? smpboot_thread_fn+0x6b/0x280
<4> [1164.892029]  run_ksoftirqd+0x2b/0x50
<4> [1164.892030]  smpboot_thread_fn+0x1d3/0x280
<4> [1164.892030]  ? sort_range+0x20/0x20
<4> [1164.892031]  kthread+0x119/0x130
<4> [1164.892031]  ? kthread_park+0x80/0x80
<4> [1164.892032]  ret_from_fork+0x3a/0x50
<6> [1164.892932] Sending NMI from CPU 4 to CPUs 1:
<4> [1164.910889] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7b/0x120
<6> [1164.910945] Sending NMI from CPU 4 to CPUs 2:
<4> [1164.911029] NMI backtrace for cpu 2
<4> [1164.911030] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U            5.4.0-rc2-CI-CI_DRM_7062+ #1
<4> [1164.911031] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
<4> [1164.911031] RIP: 0010:tick_sched_do_timer+0x24/0x50
<4> [1164.911032] Code: ff ff 0f 0b eb e0 55 53 48 89 f5 48 89 fb e8 f3 a6 39 00 8b 15 2d 4c 1e 01 83 fa ff 74 15 39 c2 74 17 0f b6 43 4c a8 01 74 06 <83> c8 10 88 43 4c 5b 5d c3 89 05 0d 4c 1e 01 48 89 e8 48 2b 05 7b
<4> [1164.911033] RSP: 0018:ffffc90000108ee0 EFLAGS: 00000002
<4> [1164.911034] RAX: 0000000000000001 RBX: ffff88822792d9c0 RCX: 0000000000000018
<4> [1164.911035] RDX: 0000000000000002 RSI: 0000010f392ed70f RDI: ffff88822792d9c0
<4> [1164.911035] RBP: 0000010f392ed70f R08: 0000000000000000 R09: 0000000000000000
<4> [1164.911036] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000007fdd8
<4> [1164.911036] R13: ffffffff82340a70 R14: ffff88822792d258 R15: ffffffff8115bf80
<4> [1164.911037] FS:  0000000000000000(0000) GS:ffff888227900000(0000) knlGS:0000000000000000
<4> [1164.911038] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [1164.911038] CR2: 00007f5cebe14000 CR3: 0000000002210002 CR4: 00000000000606e0
Comment 1 CI Bug Log 2019-10-14 12:43:30 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SNB: igt@gem_eio@kms - dmesg-warn - NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7b/0x120
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7062/shard-snb5/igt@gem_eio@kms.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5146/shard-snb6/igt@gem_eio@kms.html
Comment 2 Vanshidhar Konda 2019-11-04 19:35:05 UTC
This issue seems to be related to: https://bugs.freedesktop.org/show_bug.cgi?id=112001

In both of these issues, the kernel has detected that the TSC is skewed on one of the CPUs and switches the clocksource from TSC to HPET. In both cases the issue occurs on the SNB platform and started occurring 3 weeks, 3 days ago. The issue noted in this bug occurs almost every day. The two issues may be manifestations of the same error.
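
For anyone triaging a similar run, the clocksource fallback described above can be checked with a short script. This is only a rough sketch, not part of the CI tooling: the function names are hypothetical, the sysfs path is the standard Linux location for the active clocksource, and the "Switched to clocksource <name>" message format is an assumption about what the kernel prints and may differ between kernel versions.

```python
import re
from pathlib import Path

# Standard Linux sysfs file holding the name of the active clocksource.
SYSFS_CURRENT = Path(
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)

# Assumed dmesg message format: "clocksource: Switched to clocksource <name>".
SWITCH_RE = re.compile(r"Switched to clocksource ([\w-]+)")


def current_clocksource(path=SYSFS_CURRENT):
    """Return the active clocksource name, or None if the file is unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def clocksource_switches(dmesg_text):
    """Return clocksource names in the order the log reports switching to them."""
    return SWITCH_RE.findall(dmesg_text)


def fell_back_from_tsc(switches):
    """True if the log shows tsc in use and a later switch to another source."""
    return "tsc" in switches and switches[-1] != "tsc"
```

Running `clocksource_switches()` over the full dmesg of an affected run and passing the result to `fell_back_from_tsc()` would show whether that run also dropped TSC; `current_clocksource()` gives the same answer on a live machine.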

