Summary: | [CI][SHARDS] igt@i915_selftest@live_gem_contexts- dmesg-warn - igt_ctx_exec failed with error -5 | ||
---|---|---|---|
Product: | DRI | Reporter: | Lakshmi <lakshminarayana.vudum> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | RESOLVED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | not set | ||
Priority: | not set | CC: | intel-gfx-bugs |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | HSW | i915 features: |
Description
Lakshmi
2019-09-16 10:22:28 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* HSW: igt@i915_selftest@live_gem_contexts- dmesg-warn - igt_ctx_exec failed with error -5
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4990/fi-hsw-4770r/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6896/shard-hsw6/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4958/fi-hsw-4770r/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4959/fi-hsw-4770r/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4964/fi-hsw-4770r/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4965/fi-hsw-4770r/igt@i915_selftest@live_gem_contexts.html

Failed context restore, similar to

    *cs++ = MI_LOAD_REGISTER_IMM(num_engines);
    for_each_engine(signaller, i915, id) {
            if (signaller == engine)
                    continue;

            *cs++ = i915_mmio_reg_offset(
                       RING_PSMI_CTL(signaller->mmio_base));
            *cs++ = _MASKED_BIT_ENABLE(
                            GEN6_PSMI_SLEEP_MSG_DISABLE);
    }

which is only supposed to affect hsw-gt1.

Petri says the shards use

    cpu: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz

which is gt2 and is reflected in the lack of the w/a in the ringbuffer. We could just enable the w/a for all Haswell to be on the safe side.

I'm betting on it being a resurgence of the PSMI issue,

commit 56c05de6bd773b96deca379370965c49042b5fbf (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Sep 17 20:47:46 2019 +0100

    drm/i915: Extend Haswell GT1 PSMI workaround to all

    A few times in CI, we have detected a GPU hang on our Haswell GT2
    systems with the characteristic IPEHR of 0x780c0000. When the PSMI w/a
    was first introducted, it was applied to all Haswell, but later on we
    found an erratum that supposedly restricted the issue to GT1 and so
    constrained it only be applied on GT1. That may have been a mistake...

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111692
    Fixes: 167bc759e823 ("drm/i915: Restrict PSMI context load w/a to Haswell GT1")
    References: 2c550183476d ("drm/i915: Disable PSMI sleep messages on all rings around context switches")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190917194746.26710-1-chris@chris-wilson.co.uk

One interesting thought though was perhaps it is related to timing changes from iommu.

A CI Bug Log filter associated to this bug has been updated:

{- HSW: igt@i915_selftest@live_gem_contexts- dmesg-warn - igt_ctx_exec failed with error -5 -}
{+ BYT HSW: igt@i915_selftest@live_gem_contexts- dmesg-warn - igt_ctx_exec failed with error -5 +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5324/fi-byt-n2820/igt@i915_selftest@live_gem_contexts.html
* https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7481/fi-byt-j1900/igt@i915_selftest@live_gem_contexts.html
* https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7482/fi-byt-squawks/igt@i915_selftest@live_gem_contexts.html
* https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5330/fi-byt-clapper/igt@i915_selftest@live_gem_contexts.html
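For readers unfamiliar with the workaround quoted in the first comment, the following standalone sketch shows roughly what that loop writes into the ring: one MI_LOAD_REGISTER_IMM header followed by a (register offset, masked value) pair for every engine other than the one switching context. It is not the kernel code. The macro values are recalled from the i915 headers and should be treated as assumptions, `emit_psmi_disable` is a hypothetical helper name, and the caller supplies the engine MMIO bases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings, recalled from the i915 headers; verify before relying on them. */
#define MI_INSTR(opcode, flags)      (((uint32_t)(opcode) << 23) | (uint32_t)(flags))
#define MI_LOAD_REGISTER_IMM(n)      MI_INSTR(0x22, 2 * (n) - 1)
#define RING_PSMI_CTL(base)          ((uint32_t)(base) + 0x50)
#define GEN6_PSMI_SLEEP_MSG_DISABLE  (1u << 0)
/* Masked register: the high 16 bits select which low bits the write may change. */
#define MASKED_BIT_ENABLE(bit)       (((uint32_t)(bit) << 16) | (uint32_t)(bit))

/*
 * Hypothetical helper: write the "disable PSMI sleep messages" commands for
 * every engine except mmio_base[self] into cs, and return the number of
 * dwords written (1 header + 2 per other engine).
 */
size_t emit_psmi_disable(uint32_t *cs, const uint32_t *mmio_base,
                         size_t count, size_t self)
{
    uint32_t *start = cs;
    size_t i;

    assert(count > 1 && self < count);

    /* One LRI header covering all of the other engines. */
    *cs++ = MI_LOAD_REGISTER_IMM(count - 1);
    for (i = 0; i < count; i++) {
        if (i == self)
            continue;
        *cs++ = RING_PSMI_CTL(mmio_base[i]);
        *cs++ = MASKED_BIT_ENABLE(GEN6_PSMI_SLEEP_MSG_DISABLE);
    }

    return (size_t)(cs - start);
}
```

With four engines and the first one switching context, the helper emits seven dwords. Because PSMI_CTL is a masked register, each value dword carries the bit twice: once in the upper half as the write mask and once in the lower half as the new value.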