Bug 108660 - [CI][DRMTIP] igt@prime_mmap_coherency@read - incomplete - INFO: rcu_preempt self-detected stall on CPU
Summary: [CI][DRMTIP] igt@prime_mmap_coherency@read - incomplete - INFO: rcu_preempt s...
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: DRI git
Hardware: Other All
: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-11-05 12:25 UTC by Martin Peres
Modified: 2019-02-07 08:48 UTC (History)
1 user (show)

See Also:
i915 platform: ICL
i915 features:


Attachments

Description Martin Peres 2018-11-05 12:25:04 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_137/fi-icl-u/igt@prime_mmap_coherency@read.html

<7> [197.633052] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 02
<3> [312.295779] rcu: INFO: rcu_preempt self-detected stall on CPU
<3> [312.295976] rcu: \x097-...!: (1 ticks this GP) idle=31a/0/0x1 softirq=10012/10012 fqs=0 
<3> [312.296182] rcu: \x09 (t=111850 jiffies g=36013 q=73)
<3> [312.296209] rcu: rcu_preempt kthread starved for 111850 jiffies! g36013 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=4
<3> [312.296252] rcu: RCU grace-period kthread stack dump:
<6> [312.296275] rcu_preempt     I14344    10      2 0x80000000
<4> [312.296283] Call Trace:
<4> [312.296294]  ? __schedule+0x364/0xb40
<4> [312.296298]  ? schedule_timeout+0x228/0x4f0
<4> [312.296307]  schedule+0x2d/0x90
<4> [312.296311]  schedule_timeout+0x9a/0x4f0
<4> [312.296319]  ? collect_expired_timers+0xa0/0xa0
<4> [312.296330]  rcu_gp_kthread+0x6c9/0xb00
<4> [312.296342]  ? __rcu_read_unlock+0x70/0x70
<4> [312.296347]  kthread+0x119/0x130
<4> [312.296352]  ? kthread_park+0x80/0x80
<4> [312.296357]  ret_from_fork+0x3a/0x50
<6> [312.296376] Sending NMI from CPU 7 to CPUs 0:
<4> [200.448298] NMI backtrace for cpu 0 skipped: idling at acpi_processor_ffh_cstate_enter+0x64/0xb0
<6> [312.297465] Sending NMI from CPU 7 to CPUs 1:
<4> [200.449391] NMI backtrace for cpu 1 skipped: idling at acpi_processor_ffh_cstate_enter+0x64/0xb0
<6> [312.298475] Sending NMI from CPU 7 to CPUs 3:
<4> [312.299508] NMI backtrace for cpu 3
<4> [312.299510] CPU: 3 PID: 1 Comm: systemd Tainted: G     U  W         4.19.0-ga4e9f377a9b5-drmtip_137+ #1
<4> [312.299511] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2402.AD3.1810170014 10/17/2018
<4> [312.299512] RIP: 0010:queued_spin_lock_slowpath+0x161/0x1c0
<4> [312.299514] Code: 5d 41 5c c3 f3 90 8b 03 84 c0 75 f8 e9 f6 fe ff ff be 01 00 00 00 f0 0f b1 33 85 c0 0f 85 59 ff ff ff eb c8 31 f6 eb 02 f3 90 <8b> 03 66 85 c0 75 f7 89 c7 66 31 ff 39 cf 75 a3 b9 01 00 00 00 f0
<4> [312.299515] RSP: 0018:ffff9dc8efec3eb0 EFLAGS: 00000002
<4> [312.299517] RAX: 0000000000100101 RBX: ffffffff96247d40 RCX: 0000000000100000
<4> [312.299518] RDX: ffff9dc8efee2ac0 RSI: 0000000000000000 RDI: ffffffff960a3d8e
<4> [312.299519] RBP: 0000000000000000 R08: 0000000098fa0aea R09: 0000000000000000
<4> [312.299520] R10: ffff9dc8efec3e78 R11: ffffffff96247d58 R12: ffff9dc8efee2ac0
<4> [312.299521] R13: ffffffff96247b40 R14: ffffffff96247d40 R15: 0000000000000002
<4> [312.299522] FS:  00007f68ce9d8e00(0000) GS:ffff9dc8efec0000(0000) knlGS:0000000000000000
<4> [312.299523] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [312.299524] CR2: 0000564ec40f7318 CR3: 00000004a94aa006 CR4: 0000000000760ee0
<4> [312.299525] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [312.299526] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [312.299527] PKRU: 55555554
<4> [312.299527] Call Trace:
<4> [312.299528]  <IRQ>
<4> [312.299529]  do_raw_spin_lock+0xa8/0xb0
<4> [312.299530]  _raw_spin_lock_irqsave+0x3b/0x50
<4> [312.299531]  ? rcu_process_callbacks+0x4ec/0x850
<4> [312.299532]  rcu_process_callbacks+0x4ec/0x850
<4> [312.299532]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
<4> [312.299533]  __do_softirq+0xd8/0x48e
<4> [312.299534]  irq_exit+0xa9/0xc0
<4> [312.299535]  smp_apic_timer_interrupt+0x9c/0x250
<4> [312.299536]  apic_timer_interrupt+0xf/0x20
<4> [312.299536]  </IRQ>
<4> [312.299537] RIP: 0010:lock_release+0x14c/0x2e0
<4> [312.299539] Code: 0f 85 73 01 00 00 65 48 8b 3c 25 00 4f 01 00 e8 0a cb ff ff 65 48 8b 04 25 00 4f 01 00 c7 80 7c 08 00 00 00 00 00 00 41 54 9d <48> 8b 44 24 10 65 48 33 04 25 28 00 00 00 0f 85 70 01 00 00 48 83
<4> [312.299540] RSP: 0018:ffff9fe84004fb78 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
<4> [312.299542] RAX: ffff9dc8ee260040 RBX: ffff9dc8ee260040 RCX: 0000000000000000
<4> [312.299543] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9dc8ee260040
<4> [312.299544] RBP: ffffffff96247be0 R08: 0000000000000001 R09: 0000000000000001
<4> [312.299545] R10: ffff9fe84004fb50 R11: ffffffff96247be0 R12: 0000000000000246
<4> [312.299546] R13: ffffffff9529b8bf R14: 0000000000000000 R15: 0000000000000000
<4> [312.299547]  ? task_dump_owner+0x9f/0x210
<4> [312.299548]  task_dump_owner+0xbc/0x210
<4> [312.299548]  pid_revalidate+0x37/0x70
<4> [312.299549]  lookup_fast+0x2e2/0x470
<4> [312.299550]  walk_component+0x43/0x2d0
<4> [312.299551]  link_path_walk.part.10+0x2b6/0x550
<4> [312.299552]  ? path_init+0x3db/0x510
<4> [312.299553]  ? percpu_counter_add_batch+0x7e/0xb0
<4> [312.299553]  path_openat+0x8b/0xaa0
<4> [312.299554]  do_filp_open+0x96/0x110
<4> [312.299555]  ? __alloc_fd+0xe0/0x1e0
<4> [312.299556]  ? do_sys_open+0x1b8/0x240
<4> [312.299557]  do_sys_open+0x1b8/0x240
<4> [312.299557]  do_syscall_64+0x55/0x190
<4> [312.299558]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [312.299559] RIP: 0033:0x7f68ce4f8d19
<4> [312.299561] Code: 89 54 24 08 e8 e8 09 02 00 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f2 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 08 e8 13 0a 02 00 8b 44
<4> [312.299562] RSP: 002b:00007fff0b03f4f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
<4> [312.299563] RAX: ffffffffffffffda RBX: 0000564ec42bac70 RCX: 00007f68ce4f8d19
<4> [312.299564] RDX: 0000000000080000 RSI: 00007fff0b03f5f0 RDI: 00000000ffffff9c
<4> [312.299565] RBP: 00007f68ce0f6c78 R08: 0000000000000000 R09: 00007f68ce0f6c7a
<4> [312.299566] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000008
<4> [312.299568] R13: 00007f68ce0f6c78 R14: 00000000000000cc R15: 00007fff0b03fed0
<6> [312.299572] Sending NMI from CPU 7 to CPUs 4:
<4> [200.451477] NMI backtrace for cpu 4 skipped: idling at acpi_processor_ffh_cstate_enter+0x64/0xb0
<6> [312.300582] Sending NMI from CPU 7 to CPUs 5:
<4> [200.452488] NMI backtrace for cpu 5 skipped: idling at acpi_processor_ffh_cstate_enter+0x64/0xb0
<6> [312.301592] Sending NMI from CPU 7 to CPUs 6:
<4> [312.302600] NMI backtrace for cpu 6
<4> [312.302601] CPU: 6 PID: 10 Comm: rcu_preempt Tainted: G     U  W         4.19.0-ga4e9f377a9b5-drmtip_137+ #1
<4> [312.302603] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2402.AD3.1810170014 10/17/2018
<4> [312.302604] RIP: 0010:queued_spin_lock_slowpath+0x13d/0x1c0
<4> [312.302605] Code: 5d 48 85 f6 c6 03 01 74 64 c7 46 08 01 00 00 00 48 c7 c7 22 7d 0f 96 e8 d1 70 3a 00 65 ff 0d b6 56 f3 6a 5b 5d 41 5c c3 f3 90 <8b> 03 84 c0 75 f8 e9 f6 fe ff ff be 01 00 00 00 f0 0f b1 33 85 c0
<4> [312.302606] RSP: 0018:ffff9fe8400e3e08 EFLAGS: 00000002
<4> [312.302608] RAX: 0000000000100101 RBX: ffffffff96247d40 RCX: 0000000000000101
<4> [312.302609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff96247d40
<4> [312.302610] RBP: 0000000000000246 R08: 0000000098fa0aea R09: 0000000000000000
<4> [312.302611] R10: ffff9fe8400e3dd0 R11: ffffffff96247d58 R12: ffffffff96248158
<4> [312.302612] R13: 0000000000000000 R14: ffff9fe8400e3ec8 R15: 0000000000000000
<4> [312.302613] FS:  0000000000000000(0000) GS:ffff9dc8eff80000(0000) knlGS:0000000000000000
<4> [312.302614] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [312.302615] CR2: 00007f145dd36050 CR3: 00000001e7210003 CR4: 0000000000760ee0
<4> [312.302616] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [312.302617] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [312.302618] PKRU: 55555554
<4> [312.302619] Call Trace:
<4> [312.302619]  do_raw_spin_lock+0xa8/0xb0
<4> [312.302620]  _raw_spin_lock_irqsave+0x3b/0x50
<4> [312.302621]  ? force_qs_rnp+0xb4/0x180
<4> [312.302622]  force_qs_rnp+0xb4/0x180
<4> [312.302623]  ? dump_blkd_tasks.isra.43+0x1e0/0x1e0
<4> [312.302624]  rcu_gp_kthread+0x5c2/0xb00
<4> [312.302624]  ? __rcu_read_unlock+0x70/0x70
<4> [312.302625]  kthread+0x119/0x130
<4> [312.302626]  ? kthread_park+0x80/0x80
<4> [312.302627]  ret_from_fork+0x3a/0x50
<4> [312.302629] NMI backtrace for cpu 7
<4> [312.302633] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G     U  W         4.19.0-ga4e9f377a9b5-drmtip_137+ #1
<4> [312.302636] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2402.AD3.1810170014 10/17/2018
<4> [312.302638] Call Trace:
<4> [312.302641]  <IRQ>
<4> [312.302646]  dump_stack+0x67/0x9b
<4> [312.302651]  nmi_cpu_backtrace+0x8a/0x90
<4> [312.302656]  ? lapic_can_unplug_cpu+0x90/0x90
<4> [312.302661]  nmi_trigger_cpumask_backtrace+0xb7/0xf0
<4> [312.302667]  rcu_dump_cpu_stacks+0x9d/0xcb
<4> [312.302673]  rcu_check_callbacks+0x6ed/0xa10
<4> [312.302684]  ? tick_sched_handle.isra.6+0x60/0x60
<4> [312.302688]  update_process_times+0x23/0x50
<4> [312.302693]  tick_sched_timer+0x36/0x70
<4> [312.302699]  __hrtimer_run_queues+0x11e/0x4a0
<4> [312.302710]  hrtimer_interrupt+0xea/0x250
<4> [312.302719]  smp_apic_timer_interrupt+0x7b/0x250
<4> [312.302724]  apic_timer_interrupt+0xf/0x20
<4> [312.302727]  </IRQ>
<4> [312.302732] RIP: 0010:cpuidle_enter_state+0xab/0x340
<4> [312.302736] Code: 44 00 00 31 ff e8 95 d9 93 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 70 02 00 00 31 ff e8 be 7c 9a ff e8 99 4e 9e ff fb 4c 29 fb <48> ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7 ea b8 ff
<4> [312.302738] RSP: 0018:ffff9fe84013be90 EFLAGS: 00000216 ORIG_RAX: ffffffffffffff13
<4> [312.302744] RAX: ffff9dc8ee3c0040 RBX: 0000001a0ab96453 RCX: 0000000000000000
<4> [312.302747] RDX: 0000000000000046 RSI: ffffffff960f7d22 RDI: ffffffff960a3d9f
<4> [312.302750] RBP: 0000000000000002 R08: 0000000000000002 R09: 0000000000000000
<4> [312.302752] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9dc8ea869548
<4> [312.302755] R13: ffffffff9629c4d8 R14: 0000000000000000 R15: 0000002eab8de47e
<4> [312.302775]  do_idle+0x1f3/0x260
<4> [312.302783]  cpu_startup_entry+0x6a/0x70
<4> [312.302789]  start_secondary+0x19d/0x1f0
<4> [312.302794]  secondary_startup_64+0xa4/0xb0
<6> [200.456704] [IGT] prime_mmap_coherency: exiting, ret=0
Comment 1 Chris Wilson 2018-11-05 12:38:32 UTC
Look at the clock skew! One processor thinks it is 200s, the other 319s. prime_mmap_coherency shouldn't have a deadlock, and the stall information doesn't give any clues back to drm, so off hand I'd say this isn't ours.

It would be nice to have the full dmesg here...
Comment 2 Jani Saarinen 2019-01-18 22:52:20 UTC
This happend only once, should we close this?
Comment 3 Lakshmi 2019-02-03 21:17:30 UTC
Last seen on drmtip_137 (3 months / 1890 runs ago). 
Closing this issue.
Comment 4 CI Bug Log 2019-02-07 08:48:55 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.