Summary: | [BAT] [BDW only] igt@* -incomplete timeout/system hang? | ||
---|---|---|---|
Product: | DRI | Reporter: | Marta Löfstedt <marta.lofstedt> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED WORKSFORME | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | high | CC: | intel-gfx-bugs, tomi.p.sarvela |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | BDW | i915 features: | display/Other |
Bug Depends on: | |||
Bug Blocks: | 105984 |
Description
Marta Löfstedt
2017-12-07 13:20:49 UTC
Possibly:

commit 16c8619a7c53fe05526c31d4758be0eeabd16364
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 19 22:09:16 2017 +0000

drm/i915: Avoid context dereference inside execlists_submission_tasklet

A lesson that has to be relearnt over and over again is that the request does not keep a reference to the context and so we cannot freely dereference the context from inside the execlists_submission_tasklet. In particular, we try to do so in the new GEM_TRACE() so convert those over to the port->context_id we keep for GEM debugging. This means the tracing now depends on DRM_I915_GEM_DEBUG.

Fixes: bccd3b831185 ("drm/i915: Use trace_printk to provide a death rattle for GEM")
References: https://bugs.freedesktop.org/show_bug.cgi?id=104066
References: https://bugs.freedesktop.org/show_bug.cgi?id=104162
References: https://bugs.freedesktop.org/show_bug.cgi?id=104242
References: https://bugs.freedesktop.org/show_bug.cgi?id=104310
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171219220916.30882-1-chris@chris-wilson.co.uk

Unfortunately reproduced on:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3602/fi-bdw-5557u/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3605/fi-bdw-5557u/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4141/fi-bdw-5557u/igt@gem_exec_suspend@basic-s4-devices.html

pstore is just a bunch of:
<0>[ 308.200127] gem_exec-2999 1..s1 223027120us : execlists_submission_tasklet: rcs0 in[0]: ctx=7.1, seqno=821

This is a Meta bug for incompletes on BDW.
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4352/fi-bdw-5557u/igt@gem_exec_suspend@basic-s3.html

run.log:
pass: igt/gem_exec_suspend/basic-s3 [108/288] skip: 3, pass: 105 \
FATAL: command execution failed ...
Completed CI_IGT_test CI_DRM_3916/fi-bdw-5557u/0 : FAILURE
CI_IGT_test runtime 328 seconds
Rebooting fi-bdw-5557u

Last dmesg:
<7>[ 194.262113] [IGT] gem_exec_suspend: executing
<4>[ 194.271639] Setting dangerous option reset - tainting kernel
<7>[ 194.275272] [IGT] gem_exec_suspend: starting subtest basic-S3

Followed by "stray" pstore just shows a sysrq-trigger backtrace unfortunately the reset is overwritten by ftrace.

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-bdw-5557u/igt@kms_cursor_crc@cursor-256x256-suspend.html

run.log:
running: igt/kms_cursor_crc/cursor-256x256-suspend [74/97] skip: 37, pass: 37 -
FATAL: command execution failed ...
Completed CI_IGT_test drmtip_3/fi-bdw-5557u/27 : FAILURE
CI_IGT_test runtime 333 seconds
Rebooting fi-bdw-5557u

Last dmesg:
<7>[ 136.677451] [drm:verify_connector_state.isra.78 [i915]] [CONNECTOR:76:HDMI-A-2]
<7>[ 136.677522] [drm:intel_atomic_commit_tail [i915]] [CRTC:51:pipe B]
<7>[ 136.677600] [drm:verify_single_dpll_state.isra.79 [i915]] WRPLL 1

pstore overwritten by ftrace

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-bdw-5557u/igt@gem_ctx_isolation@bcs0-s3.html

run.log:
pass: igt/gem_ctx_isolation/bcs0-s3 [15/97] skip: 8, pass: 7 -
FATAL: command execution failed ...
Completed CI_IGT_test drmtip_3/fi-bdw-5557u/34 : FAILURE
CI_IGT_test runtime 240 seconds
Rebooting fi-bdw-5557u

last dmesg:
<4>[ 40.493107] Setting dangerous option reset - tainting kernel
<7>[ 40.497149] [IGT] gem_ctx_isolation: starting subtest bcs0-S3
<6>[ 40.567914] PM: suspend entry (deep)

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_4/fi-bdw-gvtdvm/igt@gem_ctx_switch@basic-all-heavy.html

run.log:
running: igt/gem_ctx_switch/basic-all-heavy [39/97] skip: 20, pass: 19 \
FATAL: command execution failed ...
Completed CI_IGT_test drmtip_4/fi-bdw-gvtdvm/25 : FAILURE
CI_IGT_test runtime 600 seconds
Rebooting fi-bdw-gvtdvm

Last dmesg:
<7>[ 87.178419] [IGT] gem_ctx_switch: starting subtest basic-all-heavy
<6>[ 186.052216] perf: interrupt took too long (8296 > 8205), lowering kernel.perf_event_max_sample_rate to 24000
<6>[ 346.791638] perf: interrupt took too long (10381 > 10370), lowering kernel.perf_event_max_sample_rate to 19000

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-bdw-gvtdvm/igt@gem_exec_reuse@contexts.html

run.log:
running: igt/gem_exec_reuse/contexts [06/97] skip: 2, pass: 3, fail: 1 -
FATAL: command execution failed ...
Completed CI_IGT_test drmtip_3/fi-bdw-gvtdvm/11 : FAILURE
CI_IGT_test runtime 462 seconds
Rebooting fi-bdw-gvtdvm

Last dmesg:
<7>[ 47.105282] [IGT] gem_exec_reuse: executing
<4>[ 47.156940] Setting dangerous option reset - tainting kernel
<6>[ 47.222774] perf: interrupt took too long (5310 > 5207), lowering kernel.perf_event_max_sample_rate to 37000
<7>[ 48.749247] [IGT] gem_exec_reuse: starting subtest contexts
<6>[ 56.215270] perf: interrupt took too long (6707 > 6637), lowering kernel.perf_event_max_sample_rate to 29000
<6>[ 67.716713] perf: interrupt took too long (8462 > 8383), lowering kernel.perf_event_max_sample_rate to 23000

I guess can be closed? Not seen on CI lately.
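For triaging incompletes like the ones quoted above, the last "[IGT] ...: starting subtest ..." line in the captured dmesg usually tells you which subtest was running when the machine went away. Below is a minimal sketch of pulling that out of a dmesg dump. The function name `last_started_subtest` and the regular expression are illustrative assumptions, not part of IGT or the CI tooling; the sample text is copied from the gem_exec_suspend report in this bug.

```python
import re

# Matches kernel log lines of the form seen in this bug, e.g.
#   <7>[ 194.275272] [IGT] gem_exec_suspend: starting subtest basic-S3
# (pattern is an assumption based on the quoted logs only)
IGT_SUBTEST = re.compile(
    r"<\d+>\[\s*(?P<ts>\d+\.\d+)\]\s+\[IGT\]\s+"
    r"(?P<test>\S+): starting subtest (?P<subtest>\S+)"
)


def last_started_subtest(dmesg_text):
    """Return (timestamp, test, subtest) for the last subtest start, or None."""
    last = None
    for m in IGT_SUBTEST.finditer(dmesg_text):
        last = (float(m.group("ts")), m.group("test"), m.group("subtest"))
    return last


sample = """<7>[ 194.262113] [IGT] gem_exec_suspend: executing
<4>[ 194.271639] Setting dangerous option reset - tainting kernel
<7>[ 194.275272] [IGT] gem_exec_suspend: starting subtest basic-S3"""

print(last_started_subtest(sample))
```

Because the pattern is applied with `finditer` over the whole text, it also works on dumps where the log lines have been flattened onto a single line, as long as each entry keeps its `<level>[ timestamp]` prefix.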