https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6333/shard-glk9/igt@gem_ctx_engines@independent.html

Starting subtest: independent
(gem_ctx_engines:3353) CRITICAL: Test assertion failure function independent, file ../tests/i915/gem_ctx_engines.c:485:
(gem_ctx_engines:3353) CRITICAL: Failed assertion: (map[i] - last) > 0
(gem_ctx_engines:3353) CRITICAL: Engine instance [2] executed too late
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* GLK: igt@gem_ctx_engines@independent - fail - Engine instance [2] executed too late
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6333/shard-glk9/igt@gem_ctx_engines@independent.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13219/shard-glk4/igt@gem_ctx_engines@independent.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13255/shard-glk7/igt@gem_ctx_engines@independent.html
There's a danger here: the test assumes in-order execution without explicit fencing (otherwise it is hard to say that each channel is independent), but we may trigger a timeslice evaluation in the middle and reorder the requests.

The goal of the test is to show that the engine[] entries are distinct and have no inherent common timeline (i.e. they each have their own ring and timeline). So we set them up with a fence that encourages them to execute in the opposite order to submission. Once the fences have all been released, though, nothing constrains the order in which the now-ready requests run. The easiest way forward would be to trickle feed the fences.
Seen once in a month, although apparently out of only 8 runs. Based on that and the description of the test, I'm setting the priority to medium.
https://patchwork.freedesktop.org/patch/320668/?series=64451&rev=1
I claim

commit bfd7241fa594d772e1414574e09d1e4d9fa6643a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 26 11:26:10 2019 +0100

    i915/gem_ctx_engine: Drip feed requests into 'independent'

    The intent of the test is to exercise that each channel in the engine[]
    is an independent context/ring/timeline. It setups 64 channels pointing
    to rcs0 and then submits one request to each in turn waiting on a
    timeline that will force them to run out of submission order. They can
    only run in fence order and not submission order if the timelines of
    each channel are truly independent.

    However, we released the fences en masse, and once the requests are
    ready they are independent and may be executed in any order by the HW,
    especially true with timeslicing that may reorder the requests on a
    whim. So instead of releasing all requests at once, increment the
    timeline step by step and check we get our results advancing. If the
    requests can not be run in fence order and fall back to submission
    order, we will time out waiting for our incremental results and trigger
    a few GPU hangs.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110987
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Andi Shyti <andi.shyti@intel.com>

is the fix here.
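For anyone reading this later, here is a toy model of why releasing the fences en masse is racy and why drip feeding is not. It is only a sketch and not the IGT code: `struct sim`, `sim_run_ready()` and the two `*_keeps_fence_order()` helpers are hypothetical names made up for illustration. The "hardware" here is assumed to pick ready requests in submission order, which is just one of the orderings it is allowed to choose once several requests are ready at the same time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CHANNELS 64

/* One request per channel; all names here are hypothetical. */
struct request {
	unsigned int wait_point; /* timeline value that makes the request ready */
	unsigned int stamp;      /* position in the execution order, 0 = not run */
	bool done;
};

struct sim {
	struct request rq[NUM_CHANNELS]; /* index == submission order */
	unsigned int timeline;
	unsigned int clock;
};

/* Submit so that fence order is the reverse of submission order. */
static void sim_init(struct sim *s)
{
	assert(s != NULL);
	s->timeline = 0;
	s->clock = 0;
	for (size_t i = 0; i < NUM_CHANNELS; i++) {
		s->rq[i].wait_point = (unsigned int)(NUM_CHANNELS - i);
		s->rq[i].stamp = 0;
		s->rq[i].done = false;
	}
}

/*
 * Assumed scheduler: run every ready request, in submission order.
 * Returns how many requests were executed on this pass.
 */
static unsigned int sim_run_ready(struct sim *s)
{
	unsigned int count = 0;

	for (size_t i = 0; i < NUM_CHANNELS; i++) {
		struct request *rq = &s->rq[i];

		if (!rq->done && rq->wait_point <= s->timeline) {
			rq->done = true;
			rq->stamp = ++s->clock;
			count++;
		}
	}
	return count;
}

/* Same idea as the failing check: each stamp must exceed the previous one. */
static bool sim_in_fence_order(const struct sim *s)
{
	unsigned int last = 0;

	for (size_t i = NUM_CHANNELS; i-- > 0;) {
		if (!s->rq[i].done || s->rq[i].stamp <= last)
			return false;
		last = s->rq[i].stamp;
	}
	return true;
}

/* Old behaviour: signal every fence at once, then look at the results. */
static bool en_masse_keeps_fence_order(void)
{
	struct sim s;

	sim_init(&s);
	s.timeline = NUM_CHANNELS;
	sim_run_ready(&s);
	return sim_in_fence_order(&s);
}

/* Fixed behaviour: advance the timeline one step and check each result. */
static bool drip_feed_keeps_fence_order(void)
{
	struct sim s;

	sim_init(&s);
	for (unsigned int step = 1; step <= NUM_CHANNELS; step++) {
		s.timeline = step;
		if (sim_run_ready(&s) != 1)
			return false; /* the real test would time out here */
	}
	return sim_in_fence_order(&s);
}
```

With this particular scheduler, `en_masse_keeps_fence_order()` returns false and `drip_feed_keeps_fence_order()` returns true. In the drip-fed case only one request is ever ready at a time, so the scheduler has no freedom to reorder, whatever policy it uses.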