Bug 112153 - [CI][SHARDS] igt@gem_* - dmesg-warn - WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected, acquire at: __mutex_unlock_slowpath, holding at: intel_gt_retire_requests_timeout
Status: RESOLVED DUPLICATE of bug 111626
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other
OS: All
Importance: not set
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-28 14:17 UTC by Lakshmi
Modified: 2019-10-28 14:21 UTC
CC List: 1 user

See Also:
i915 platform: BXT, BYT
i915 features: GEM/Other


Attachments

Description Lakshmi 2019-10-28 14:17:12 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7186/shard-apl8/igt@gem_busy@close-race.html
<4> [1776.577169] =====================================================
<4> [1776.577176] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
<4> [1776.577184] 5.4.0-rc4-CI-CI_DRM_7186+ #1 Tainted: G     U           
<4> [1776.577190] -----------------------------------------------------
<4> [1776.577197] kworker/3:4/1244 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
<4> [1776.577205] ffff888276ad5030 (&(&lock->wait_lock)->rlock){+.+.}, at: __mutex_unlock_slowpath+0x18e/0x2b0
<4> [1776.577222] 
and this task is already holding:
<4> [1776.577228] ffff88825508c2c8 (&(&timelines->lock)->rlock){-...}, at: intel_gt_retire_requests_timeout+0x180/0x540 [i915]
<4> [1776.577350] which would create a new lock dependency:
<4> [1776.577355]  (&(&timelines->lock)->rlock){-...} -> (&(&lock->wait_lock)->rlock){+.+.}
<4> [1776.577365] 
but this new dependency connects a HARDIRQ-irq-safe lock:
<4> [1776.577372]  (&(&timelines->lock)->rlock){-...}
<4> [1776.577374] 
... which became HARDIRQ-irq-safe at:
<4> [1776.577387]   lock_acquire+0xa7/0x1c0
<4> [1776.577394]   _raw_spin_lock_irqsave+0x33/0x50
<4> [1776.577488]   intel_timeline_enter+0x64/0x150 [i915]
<4> [1776.577576]   __engine_park+0x1ef/0x420 [i915]
<4> [1776.577660]   ____intel_wakeref_put_last+0x1c/0x70 [i915]
<4> [1776.577747]   i915_sample+0x2de/0x300 [i915]
<4> [1776.577754]   __hrtimer_run_queues+0x121/0x4a0
<4> [1776.577760]   hrtimer_interrupt+0xea/0x250
<4> [1776.577766]   smp_apic_timer_interrupt+0x96/0x280
<4> [1776.577772]   apic_timer_interrupt+0xf/0x20
<4> [1776.577778]   lock_release+0x17d/0x2a0
<4> [1776.577784]   _raw_spin_unlock+0x17/0x40
<4> [1776.577882]   release_pd_entry+0x67/0x130 [i915]
<4> [1776.577981]   __gen8_ppgtt_clear+0x20b/0x550 [i915]
<4> [1776.578080]   __gen8_ppgtt_clear+0x2d9/0x550 [i915]
<4> [1776.578181]   __i915_vma_unbind.part.39+0xb5/0x460 [i915]
<4> [1776.578283]   i915_vma_destroy+0x105/0x1e0 [i915]
<4> [1776.578378]   __i915_gem_free_objects+0x213/0x3e0 [i915]
<4> [1776.578386]   process_one_work+0x26a/0x620
<4> [1776.578391]   worker_thread+0x37/0x380
<4> [1776.578397]   kthread+0x119/0x130
<4> [1776.578403]   ret_from_fork+0x3a/0x50
<4> [1776.578408] 
to a HARDIRQ-irq-unsafe lock:
<4> [1776.578414]  (&(&lock->wait_lock)->rlock){+.+.}
<4> [1776.578415] 
... which became HARDIRQ-irq-unsafe at:
<4> [1776.578425] ...
<4> [1776.578428]   lock_acquire+0xa7/0x1c0
<4> [1776.578436]   _raw_spin_lock+0x2a/0x40
<4> [1776.578442]   __mutex_lock+0x198/0x9d0
<4> [1776.578449]   pipe_wait+0x8f/0xc0
<4> [1776.578454]   pipe_read+0x235/0x310
<4> [1776.578460]   new_sync_read+0x10f/0x1a0
<4> [1776.578465]   vfs_read+0x96/0x160
<4> [1776.578470]   ksys_read+0x9f/0xe0
<4> [1776.578477]   do_syscall_64+0x4f/0x210
<4> [1776.578483]   entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [1776.578489] 
other info that might help us debug this:

<4> [1776.578497]  Possible interrupt unsafe locking scenario:

<4> [1776.578503]        CPU0                    CPU1
<4> [1776.578508]        ----                    ----
<4> [1776.578513]   lock(&(&lock->wait_lock)->rlock);
<4> [1776.578519]                                local_irq_disable();
<4> [1776.578524]                                lock(&(&timelines->lock)->rlock);
<4> [1776.578531]                                lock(&(&lock->wait_lock)->rlock);
<4> [1776.578539]   <Interrupt>
<4> [1776.578542]     lock(&(&timelines->lock)->rlock);
<4> [1776.578548]
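The scenario lockdep prints above can be illustrated with a small userspace sketch. Kernel locking primitives aren't available outside the kernel, so this is a simplified Python model of the rule lockdep enforces, not lockdep itself; the lock-class names are taken from the report, but `LockClass` and `check_dependency` are invented for illustration:

```python
# Minimal model of the lockdep rule violated in this report:
# a lock class ever acquired in hardirq context (HARDIRQ-safe) must
# never come to depend on a lock class acquired with interrupts
# enabled (HARDIRQ-unsafe). Otherwise an interrupt arriving while the
# unsafe lock is held can spin forever on the safe lock (the CPU0/CPU1
# diagram above). Hypothetical sketch, not the kernel implementation.

class LockClass:
    def __init__(self, name):
        self.name = name
        self.hardirq_safe = False    # ever taken in hardirq context
        self.hardirq_unsafe = False  # ever taken with irqs enabled

def check_dependency(holder, acquired):
    """Return a warning string if the dependency holder -> acquired
    orders a HARDIRQ-safe class before a HARDIRQ-unsafe one."""
    if holder.hardirq_safe and acquired.hardirq_unsafe:
        return ("WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order: "
                f"{holder.name} -> {acquired.name}")
    return None

# The two lock classes from the splat:
timelines_lock = LockClass("&(&timelines->lock)->rlock")
wait_lock = LockClass("&(&lock->wait_lock)->rlock")

# Per the first backtrace: intel_timeline_enter() took timelines->lock
# from the i915_sample() hrtimer, i.e. in hardirq context.
timelines_lock.hardirq_safe = True

# Per the second backtrace: __mutex_lock() takes the mutex's wait_lock
# with a plain spin_lock() while irqs are enabled (via pipe_wait).
wait_lock.hardirq_unsafe = True

# intel_gt_retire_requests_timeout() then reaches
# __mutex_unlock_slowpath() (which grabs wait_lock) while still
# holding timelines->lock -- the new dependency trips the check.
print(check_dependency(timelines_lock, wait_lock))
```

The model flags the same ordering the splat reports; the reverse dependency (wait_lock taken first) would pass, which is why lockdep complains only once this new edge is created.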
Comment 1 CI Bug Log 2019-10-28 14:18:09 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* APL BYT: igt@gem_* - dmesg-warn - WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected, acquire at: __mutex_unlock_slowpath, holding at: intel_gt_retire_requests_timeout
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_392/fi-byt-j1900/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7186/shard-apl8/igt@gem_busy@close-race.html
Comment 2 Chris Wilson 2019-10-28 14:21:15 UTC

*** This bug has been marked as a duplicate of bug 111626 ***

