Summary: | [CI][SHARDS] igt@gem_exec_schedule@semaphore-codependency - fail - Failed assertion: !"GPU hung" | |
---|---|---|---
Product: | DRI | Reporter: | Martin Peres <martin.peres>
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: | RESOLVED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal | |
Priority: | high | CC: | intel-gfx-bugs
Version: | XOrg git | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | ReadyForDev | |
i915 platform: | BXT, GLK, ICL, KBL | i915 features: | GEM/Other
Description
Martin Peres
2019-04-10 13:07:50 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* APL KBL GLK ICL: igt@gem_exec_schedule@semaphore-codependency - fail - Failed assertion: !"GPU hung"
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2832/shard-apl6/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2832/shard-iclb6/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2832/shard-kbl1/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2833/shard-apl4/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2833/shard-glk8/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2833/shard-kbl2/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4937/shard-apl2/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4937/shard-glk6/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4937/shard-iclb6/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4937/shard-kbl4/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2834/shard-apl8/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2834/shard-glk5/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2834/shard-kbl2/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4936/shard-glk8/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4936/shard-iclb5/igt@gem_exec_schedule@semaphore-codependency.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4936/shard-kbl6/igt@gem_exec_schedule@semaphore-codependency.html

Hindsight is perfect. If only I had thought of this test before writing the code.

Fix all ready to go, https://patchwork.freedesktop.org/series/59232/ just waiting for a report from the media guys if it fixes their perf regression. Maybe they'll even get around to reporting a bug.

A CI Bug Log filter associated to this bug has been updated:

{- APL KBL GLK ICL: igt@gem_exec_schedule@semaphore-codependency - fail - Failed assertion: !"GPU hung" -}
{+ SKL APL KBL GLK ICL: igt@gem_exec_schedule@semaphore-codependency - fail - Failed assertion: !"GPU hung" +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5898/shard-skl8/igt@gem_exec_schedule@semaphore-codependency.html

commit b7404c7ecb38b66f103cec694e23a8e99252829e (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Apr 9 16:29:22 2019 +0100

drm/i915: Bump ready tasks ahead of busywaits

Consider two tasks that are running in parallel on a pair of engines (vcs0, vcs1), but then must complete on a shared engine (rcs0). To maximise throughput, we want to run the first ready task on rcs0 (i.e. the first task that completes on either of vcs0 or vcs1). When using semaphores, however, we will instead queue onto rcs in submission order.

To resolve this incorrect ordering, we want to re-evaluate the priority queue when each of the request is ready.
Normally this happens because we only insert into the priority queue requests that are ready, but with semaphores we are inserting ahead of their readiness and to compensate we penalize those tasks with reduced priority (so that tasks that do not need to busywait should naturally be run first). However, given a series of tasks that each use semaphores, the queue degrades into submission fifo rather than readiness fifo, and so to counter this we give a small boost to semaphore users as their dependent tasks are completed (and so we no longer require any busywait prior to running the user task as they are then ready themselves).

v2: Fixup irqsave for schedule_lock (Tvrtko)

Testcase: igt/gem_exec_schedule/semaphore-codependency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190409152922.23894-1-chris@chris-wilson.co.uk

(In reply to Chris Wilson from comment #4)
> commit b7404c7ecb38b66f103cec694e23a8e99252829e (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Tue Apr 9 16:29:22 2019 +0100
>
> drm/i915: Bump ready tasks ahead of busywaits
>
> Consider two tasks that are running in parallel on a pair of engines
> (vcs0, vcs1), but then must complete on a shared engine (rcs0). To
> maximise throughput, we want to run the first ready task on rcs0 (i.e.
> the first task that completes on either of vcs0 or vcs1). When using
> semaphores, however, we will instead queue onto rcs in submission order.
>
> To resolve this incorrect ordering, we want to re-evaluate the priority
> queue when each of the request is ready. Normally this happens because
> we only insert into the priority queue requests that are ready, but with
> semaphores we are inserting ahead of their readiness and to compensate
> we penalize those tasks with reduced priority (so that tasks that do not
> need to busywait should naturally be run first). However, given a series
> of tasks that each use semaphores, the queue degrades into submission
> fifo rather than readiness fifo, and so to counter this we give a small
> boost to semaphore users as their dependent tasks are completed (and so
> we no longer require any busywait prior to running the user task as they
> are then ready themselves).
>
> v2: Fixup irqsave for schedule_lock (Tvrtko)
>
> Testcase: igt/gem_exec_schedule/semaphore-codependency
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20190409152922.23894-1-chris@chris-wilson.co.uk

Thanks, this definitely fixed the issue! It used to fail multiple times per run (~3) and has not been seen in 36 runs since.

The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.
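For readers skimming the thread, the commit quoted above describes the scheduler change only in prose. The following is a minimal, self-contained sketch of the core idea: a request that will busywait on a semaphore is queued before it is ready at a reduced priority, and receives a small boost once its dependencies complete, so the run queue reflects readiness order rather than submission order. This is an illustration only; the struct, helper function, and priority values are invented for the example and are not the i915 driver's actual code.

```c
/*
 * Toy model (NOT the actual i915 code) of the behaviour described in the
 * commit message above: a semaphore-using request is queued before it is
 * ready, at reduced priority; when its dependencies complete it gets a
 * small boost so it runs ahead of requests that would still busywait.
 * All names and numeric values are invented for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

#define PRIO_NORMAL             0
#define PRIO_SEMAPHORE_PENALTY  (-100) /* hypothetical: queued before ready */
#define PRIO_READY_BOOST        50     /* hypothetical: applied on readiness */

struct request {
	const char *name;    /* which dependency chain this models */
	int prio;            /* effective priority in the run queue */
	bool uses_semaphore; /* would busywait if run before it is ready */
	bool ready;          /* all dependencies have completed */
};

/* Called when the last dependency of @rq signals completion. */
static void request_became_ready(struct request *rq)
{
	rq->ready = true;
	if (rq->uses_semaphore && rq->prio <= PRIO_SEMAPHORE_PENALTY) {
		/* No busywait is needed any more: bump it ahead of waiters. */
		rq->prio = PRIO_NORMAL + PRIO_READY_BOOST;
	}
}

int main(void)
{
	/* Two codependent chains competing for the shared engine (rcs0). */
	struct request a = { "vcs0->rcs0", PRIO_SEMAPHORE_PENALTY, true, false };
	struct request b = { "vcs1->rcs0", PRIO_SEMAPHORE_PENALTY, true, false };

	/* Suppose the vcs1 half finishes first: b becomes ready and is boosted. */
	request_became_ready(&b);

	/* A real scheduler would now dequeue the highest-priority request. */
	struct request *next = (b.prio > a.prio) ? &b : &a;
	printf("run %s first (prio %d vs %d)\n", next->name, b.prio, a.prio);
	return 0;
}
```

In the driver itself the re-evaluation is tied to the request's signalers completing and happens under the scheduler lock (the v2 note about irqsave for schedule_lock hints at this); the sketch above only captures the priority-bump decision.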