Summary: | [HSW]drv_selftest/mock_requests - ida_remove called for id=1 which is not allocated. | ||
---|---|---|---|
Product: | DRI | Reporter: | mwa <matthew.auld> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | intel-gfx-bugs |
Version: | DRI git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | HSW | i915 features: | GEM/Other |
Description
mwa
2017-06-28 14:33:51 UTC
The usual case for this was the worker running after i915_gem_context_fini(). The requirement there was that flush_workqueue() was run before i915_gem_context_fini(). i.e. we do static void mock_device_release(struct drm_device *dev) { struct drm_i915_private *i915 = to_i915(dev); struct intel_engine_cs *engine; enum intel_engine_id id; mutex_lock(&i915->drm.struct_mutex); mock_device_flush(i915); i915_gem_contexts_lost(i915); mutex_unlock(&i915->drm.struct_mutex); cancel_delayed_work_sync(&i915->gt.retire_work); cancel_delayed_work_sync(&i915->gt.idle_work); flush_workqueue(i915->wq); mutex_lock(&i915->drm.struct_mutex); for_each_engine(engine, i915, id) mock_engine_free(engine); i915_gem_contexts_fini(i915); mutex_unlock(&i915->drm.struct_mutex); ... (I'm assuming that the llist is correct and we can't free the same context twice, kasan?) Above though we have an issue in that we need to use drain_workqueue() instead. Otherwise, I can't see where we might have more requests to retire. Probably deserves a few bug ons to assert that all requests are retired and we are now idle, ala i915_gem_suspend(). Hmm, probably better to rework it to use i915_gem_suspend as well. Yeah i915_gem_suspend() seems to do the trick, though not sure what to do about the gt.awake and i915_gem_sanitize() business. commit 3b19f16a556446c144a1f921444931b0cf9447ab Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Jul 18 14:41:24 2017 +0100 drm/i915: Drain the device workqueue on unload Workers on the i915->wq may rearm themselves so for completeness we need to replace our flush_workqueue() with a call to drain_workqueue() before unloading the device. v2: Reinforce the drain_workqueue with an preceding rcu_barrier() as a few of the tasks that need to be drained may first be armed by RCU. Should fix it in the short term. Reusing i915_gem_suspend() can hopefully be done at leisure. Hmm, still hitting the issue... Do we hit diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c index 5b18a2dc19a8..df2cc9132d90 100644 --- a/drivers/gpu/drm/i915/selftests/mock_engine.c +++ b/drivers/gpu/drm/i915/selftests/mock_engine.c @@ -195,7 +195,7 @@ void mock_engine_free(struct intel_engine_cs *engine) GEM_BUG_ON(timer_pending(&mock->hw_delay)); - if (engine->last_retired_context) + if (WARN_ON(engine->last_retired_context)) engine->context_unpin(engine, engine->last_retired_context); intel_engine_fini_breadcrumbs(engine); ? That should have been done by the i915_gem_contexts_lost() following the device flush. Yes. Ah, you don't have contexts_lost there because I added it later: commit b8d0658849d52110c72b44860f86fb4c544de625 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Apr 20 11:52:34 2017 +0100 drm/i915: Keep a recent cache of freed contexts objects for reuse Keep the recently freed context objects for reuse. This allows us to use the current GGTT bindings and dma bound pages, avoiding any clflushes as required. We mark the objects as purgeable under memory pressure, and reap the list of freed objects as soon as the device is idle. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c index d451dfbe9bbb..dda413c95b89 100644 --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c @@ -54,6 +54,7 @@ static void mock_device_release(struct drm_device *dev) mutex_lock(&i915->drm.struct_mutex); mock_device_flush(i915); + i915_gem_contexts_lost(i915); mutex_unlock(&i915->drm.struct_mutex); cancel_delayed_work_sync(&i915->gt.retire_work); which explains why I haven't it... commit 56d27666f8fa21835724217b0c67d42b769b5723 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jul 19 14:59:57 2017 +0100 drm/i915/selftests: Mark contexts as lost during freeing of mock device We need to unpin the last retired context early in the shutdown sequence so that its RCU free is done before we try to free the context ida. I included this in a later patch ("drm/i915: Keep a recent cache of freed contexts objects for reuse") and so missed that the selftests were broken in the meantime. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.