https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6004/shard-skl6/igt@i915_selftest@mock_requests.html
<4> [135.075742] ? i915_gem_free_object+0x110/0x110 [i915]
<4> [135.076399] ? i915_gem_context_free+0xc1/0x240 [i915]
<4> [135.076878] ? i915_gem_context_free+0xc1/0x240 [i915]
<4> [135.077208] i915_gem_context_free+0xc1/0x240 [i915]
<3> [135.235091] ODEBUG: free active (active state 0) object type: work_struct hint: __i915_gem_free_work+0x0/0x90 [i915]
<4> [135.235654] CPU: 0 PID: 1037 Comm: i915_selftest Tainted: G U W 5.1.0-rc6-CI-CI_DRM_6004+ #1
<4> [135.237482] i915_request_mock_selftests+0x2a/0x70 [i915]
<4> [135.238561] i915_mock_selftests+0x27/0x50 [i915]
<4> [135.238989] i915_init+0x12/0x73 [i915]
<4> [135.240929] i915_selftest/1037 is trying to acquire lock:
<4> [135.241228] i915_request_mock_selftests+0x2a/0x70 [i915]
<4> [135.241236] i915_mock_selftests+0x27/0x50 [i915]
<4> [135.241240] i915_init+0x12/0x73 [i915]
<4> [135.241358] 1 lock held by i915_selftest/1037:
<4> [135.241386] CPU: 0 PID: 1037 Comm: i915_selftest Tainted: G U W 5.1.0-rc6-CI-CI_DRM_6004+ #1
<4> [135.241455] ? __i915_gem_free_objects+0x720/0x720 [i915]
<4> [135.241462] ? __i915_gem_free_objects+0x720/0x720 [i915]
<4> [135.241491] i915_request_mock_selftests+0x2a/0x70 [i915]
<4> [135.241503] i915_mock_selftests+0x27/0x50 [i915]
<4> [135.241506] i915_init+0x12/0x73 [i915]
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL: igt@i915_selftest@mock_requests - incomplete - ODEBUG: free active (active state 0) object type: work_struct hint: __i915_gem_free_work+0x0/0x90 [i915]
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12876/shard-skl6/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4180/shard-skl1/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12883/shard-skl5/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6003/shard-skl5/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6004/shard-skl6/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12886/shard-skl2/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4216/shard-skl9/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12870/shard-skl5/igt@i915_selftest@mock_requests.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12873/shard-skl1/igt@i915_selftest@mock_requests.html
First time without a prior bug. The essence of the bug is that we have an object being freed via RCU at the same time as we are trying to flush the free workqueue, which the workqueue code objects to for no clear reason. In this case, maybe mock is a little too quick with its drain? Normally we only drain on module unload.
commit dc76e5764a46ffb2e7f502a86b3288b5edcce191 (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed May 1 14:57:51 2019 +0100

    drm/i915: Complete both freed-object passes before draining the workqueue

    The workqueue code complains viciously if we try to queue more work onto
    the queue while attempting to drain it. As we asynchronously free objects
    and defer their enqueuing with RCU, it is quite tricky to quiesce the
    system before attempting to drain the workqueue. Yet drain we must to
    ensure that the worker is idle before unloading the module.

    Give the freed object drain 3 whole passes with multiple rcu_barrier() to
    give the defer freeing of several levels each protected by RCU and needing
    a grace period before its parent can be freed, ultimately resulting in a
    GEM object being freed after another RCU period.

    A consequence is that it will make module unload even slower.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110550
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190501135753.8711-1-chris@chris-wilson.co.uk
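For anyone following along, here is a rough userspace sketch of why the number of passes matters. It is a toy model and not the i915 code: `struct model`, `model_grace_period()`, `drain_with_passes()` and the other names are hypothetical, a "grace period" is just a shift of counters, and the race with the drain is made deterministic by assuming exactly one late RCU callback round lands while the queue is draining.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_DEPTH 8

/* Hypothetical userspace model; none of these names exist in i915. */
struct model {
	int rcu_pending[MAX_DEPTH]; /* [d]: frees needing d + 1 grace periods before queuing work */
	int queued_work;            /* free work items sitting on the workqueue */
	int freed;                  /* objects whose free work has run */
	int violations;             /* work queued while the queue was draining */
	bool draining;
};

static void model_init(struct model *m)
{
	memset(m, 0, sizeof(*m));
}

/* Defer a free that needs depth + 1 grace periods before its work is queued. */
static void model_defer_free(struct model *m, int depth)
{
	assert(depth >= 0 && depth < MAX_DEPTH);
	m->rcu_pending[depth]++;
}

/* One grace period elapses: level 0 queues its work, every other level moves up. */
static void model_grace_period(struct model *m)
{
	int ready = m->rcu_pending[0];

	for (int d = 0; d + 1 < MAX_DEPTH; d++)
		m->rcu_pending[d] = m->rcu_pending[d + 1];
	m->rcu_pending[MAX_DEPTH - 1] = 0;

	if (m->draining)
		m->violations += ready; /* queuing onto a draining workqueue */
	m->queued_work += ready;
}

/* Run everything currently on the queue. */
static void model_flush(struct model *m)
{
	m->freed += m->queued_work;
	m->queued_work = 0;
}

/* Drain, with one late RCU callback round racing against it (an assumption). */
static void model_drain(struct model *m)
{
	m->draining = true;
	model_grace_period(m);
	model_flush(m);
	m->draining = false;
}

/* Do `passes` rounds of barrier + flush, then drain; returns the violation count. */
static int drain_with_passes(struct model *m, int passes)
{
	while (passes-- > 0) {
		model_grace_period(m);
		model_flush(m);
	}
	model_drain(m);
	return m->violations;
}
```

In this model a free deferred with `model_defer_free(&m, 2)` still has a callback in flight after two passes, so the drain sees new work arrive; with three passes the work has already been queued and flushed before the drain starts. Deeper nesting than the pass count covers would still trip it, which is the limitation of picking a fixed number of passes.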
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6092/shard-skl6/igt@i915_selftest@mock_objects.html
<4> [2092.272416] ? i915_gem_free_object+0x110/0x110 [i915]
<3> [2092.307531] ODEBUG: free active (active state 0) object type: work_struct hint: __i915_gem_free_work+0x0/0x90 [i915]
<4> [2092.308094] CPU: 3 PID: 4341 Comm: i915_selftest Tainted: G U W 5.1.0-CI-CI_DRM_6092+ #1
<4> [2092.309901] i915_gem_object_mock_selftests+0x34/0x40 [i915]
<4> [2092.311000] i915_mock_selftests+0x27/0x50 [i915]
<4> [2092.311461] i915_init+0x12/0x73 [i915]
<4> [2092.313570] i915_selftest/4341 is trying to acquire lock:
<4> [2092.313770] drm_dbg+0x7f/0x90
<4> [2092.313903] i915_gem_object_mock_selftests+0x34/0x40 [i915]
<4> [2092.313912] i915_mock_selftests+0x27/0x50 [i915]
<4> [2092.313916] i915_init+0x12/0x73 [i915]
<4> [2092.314053] 1 lock held by i915_selftest/4341:
<4> [2092.314084] CPU: 3 PID: 4341 Comm: i915_selftest Tainted: G U W 5.1.0-CI-CI_DRM_6092+ #1
<4> [2092.314159] ? __i915_gem_free_objects+0x720/0x720 [i915]
<4> [2092.314167] ? __i915_gem_free_objects+0x720/0x720 [i915]
<4> [2092.314198] i915_gem_object_mock_selftests+0x34/0x40 [i915]
<4> [2092.314210] i915_mock_selftests+0x27/0x50 [i915]
<4> [2092.314214] i915_init+0x12/0x73 [i915]
A CI Bug Log filter associated to this bug has been updated:

{- SKL: igt@i915_selftest@mock_requests - incomplete - ODEBUG: free active (active state 0) object type: work_struct hint: __i915_gem_free_work+0x0/0x90 [i915] -}
{+ SKL: igt@i915_selftest@mock_requests|objects - incomplete - ODEBUG: free active (active state 0) object type: work_struct hint: __i915_gem_free_work+0x0/0x90 [i915] +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6092/shard-skl6/igt@i915_selftest@mock_objects.html
Another stab,

commit 4fda44bf16b79a0b78fe36c6b9859e9ce2d09f43 (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jul 3 18:19:13 2019 +0100

    drm/i915: Flush the workqueue before draining

    Trying to drain a workqueue while we may still be adding to it from
    background tasks is, according to kernel/workqueue.c, verboten. So, add
    a flush_workqueue() at the start of our cleanup procedure.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=110550
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190703171913.16585-4-chris@chris-wilson.co.uk
No longer seen on drm-tip, so closing and archiving this.
The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.