Bug 104162 - [BAT] [BDW only] igt@* -incomplete timeout/system hang?
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 105984
Reported: 2017-12-07 13:20 UTC by Marta Löfstedt
Modified: 2018-05-09 14:40 UTC

See Also:
i915 platform: BDW
i915 features: display/Other


Description Marta Löfstedt 2017-12-07 13:20:49 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3473/fi-bdw-5557u/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html

last dmesg:
<7>[  377.185155] [drm:verify_single_dpll_state.isra.78 [i915]] WRPLL 1
<7>[  377.302271] [drm:drm_mode_addfb2] [FB:76]
<7>[  377.313229] [drm:drm_mode_setcrtc] [CRTC:57:pipe C]
<7>[  377.313283] [drm:drm_mode_setcrtc] [CONNECTOR:68:HDMI-A-2]


run.log:
running: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-a

[243/288] skip: 17, pass: 226 \                        
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3473/fi-bdw-5557u/0 : FAILURE
CI_IGT_test runtime 560 seconds
Rebooting fi-bdw-5557u

NOTE: the last test case in run.log (suspend-read-crc-pipe-a) does not match the test reported as incomplete (suspend-read-crc-pipe-c).
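
For triage, a minimal Python sketch that pulls the progress counters and the last started test out of a run.log excerpt like the one above. The function names (parse_progress, last_started_test) are hypothetical helpers, not part of any CI tooling, and the line layout is assumed from the excerpts quoted in this bug only.

```python
import re

# Matches the "[done/total]" progress prefix and captures the rest of the line.
PROGRESS_RE = re.compile(r"\[(\d+)/(\d+)\]\s*(.*)")
# Matches "name: count" pairs such as "skip: 17" or "pass: 226".
COUNT_RE = re.compile(r"([a-z-]+):\s*(\d+)")


def parse_progress(line):
    """Return (done, total, counts) for a progress line, or None if it is not one."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done, total = int(match.group(1)), int(match.group(2))
    counts = {name: int(value) for name, value in COUNT_RE.findall(match.group(3))}
    return done, total, counts


def last_started_test(lines):
    """Return the test name from the last "running: ..." line, or None."""
    name = None
    for line in lines:
        if line.startswith("running: "):
            name = line[len("running: "):].strip()
    return name


# Excerpt copied from the run.log quoted in the description.
log = [
    "running: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-a",
    "",
    "[243/288] skip: 17, pass: 226 \\",
    "FATAL: command execution failed",
]

done, total, counts = parse_progress(log[2])
# The per-status counts should add up to the number of finished tests.
print(done, total, counts, sum(counts.values()) == done)
print(last_started_test(log))
```

Comparing the last started test against the test named in the CI result URL is a quick way to spot the kind of mismatch noted above.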
Comment 1 Chris Wilson 2017-12-19 23:12:23 UTC
Possibly:

commit 16c8619a7c53fe05526c31d4758be0eeabd16364
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 19 22:09:16 2017 +0000

    drm/i915: Avoid context dereference inside execlists_submission_tasklet
    
    A lesson that has to be relearnt over and over again is that the request
    does not keep a reference to the context and so we cannot freely
    dereference the context from inside the execlists_submission_tasklet. In
    particular, we try to do so in the new GEM_TRACE() so convert those over
    to the port->context_id we keep for GEM debugging. This means the
    tracing now depends on DRM_I915_GEM_DEBUG.
    
    Fixes: bccd3b831185 ("drm/i915: Use trace_printk to provide a death rattle for GEM")
    References: https://bugs.freedesktop.org/show_bug.cgi?id=104066
    References: https://bugs.freedesktop.org/show_bug.cgi?id=104162
    References: https://bugs.freedesktop.org/show_bug.cgi?id=104242
    References: https://bugs.freedesktop.org/show_bug.cgi?id=104310
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Michel Thierry <michel.thierry@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20171219220916.30882-1-chris@chris-wilson.co.uk
Comment 3 Marta Löfstedt 2018-01-16 13:05:15 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4141/fi-bdw-5557u/igt@gem_exec_suspend@basic-s4-devices.html

pstore is just a bunch of:
<0>[  308.200127] gem_exec-2999    1..s1 223027120us : execlists_submission_tasklet: rcs0 in[0]:  ctx=7.1, seqno=821
Comment 4 Marta Löfstedt 2018-01-25 12:44:26 UTC
This is a meta bug tracking incompletes on BDW.
Comment 5 Marta Löfstedt 2018-03-13 11:46:26 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4352/fi-bdw-5557u/igt@gem_exec_suspend@basic-s3.html

run.log:
pass: igt/gem_exec_suspend/basic-s3

[108/288] skip: 3, pass: 105 \
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3916/fi-bdw-5557u/0 : FAILURE
CI_IGT_test runtime 328 seconds
Rebooting fi-bdw-5557u

Last dmesg:
<7>[  194.262113] [IGT] gem_exec_suspend: executing
<4>[  194.271639] Setting dangerous option reset - tainting kernel
<7>[  194.275272] [IGT] gem_exec_suspend: starting subtest basic-S3
Followed by "stray"

pstore just shows a sysrq-trigger backtrace; unfortunately, the rest is overwritten by ftrace.
Comment 6 Marta Löfstedt 2018-03-20 06:39:13 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-bdw-5557u/igt@kms_cursor_crc@cursor-256x256-suspend.html

run.log:
running: igt/kms_cursor_crc/cursor-256x256-suspend

[74/97] skip: 37, pass: 37 -                      
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_3/fi-bdw-5557u/27 : FAILURE
CI_IGT_test runtime 333 seconds
Rebooting fi-bdw-5557u

Last dmesg:
<7>[  136.677451] [drm:verify_connector_state.isra.78 [i915]] [CONNECTOR:76:HDMI-A-2]
<7>[  136.677522] [drm:intel_atomic_commit_tail [i915]] [CRTC:51:pipe B]
<7>[  136.677600] [drm:verify_single_dpll_state.isra.79 [i915]] WRPLL 1

pstore is overwritten by ftrace.
Comment 7 Marta Löfstedt 2018-03-20 06:40:58 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-bdw-5557u/igt@gem_ctx_isolation@bcs0-s3.html

run.log:
pass: igt/gem_ctx_isolation/bcs0-s3

[15/97] skip: 8, pass: 7 -
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_3/fi-bdw-5557u/34 : FAILURE
CI_IGT_test runtime 240 seconds
Rebooting fi-bdw-5557u

last dmesg:
<4>[   40.493107] Setting dangerous option reset - tainting kernel
<7>[   40.497149] [IGT] gem_ctx_isolation: starting subtest bcs0-S3
<6>[   40.567914] PM: suspend entry (deep)
Comment 8 Marta Löfstedt 2018-03-20 06:43:07 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_4/fi-bdw-gvtdvm/igt@gem_ctx_switch@basic-all-heavy.html

run.log:
running: igt/gem_ctx_switch/basic-all-heavy

[39/97] skip: 20, pass: 19 \               
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_4/fi-bdw-gvtdvm/25 : FAILURE
CI_IGT_test runtime 600 seconds
Rebooting fi-bdw-gvtdvm

Last dmesg:
<7>[   87.178419] [IGT] gem_ctx_switch: starting subtest basic-all-heavy
<6>[  186.052216] perf: interrupt took too long (8296 > 8205), lowering kernel.perf_event_max_sample_rate to 24000
<6>[  346.791638] perf: interrupt took too long (10381 > 10370), lowering kernel.perf_event_max_sample_rate to 19000
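
To estimate how long a subtest had been running before the log went quiet, a small Python sketch that reads the level and timestamp from dmesg lines in the format quoted above. The helper names (parse_dmesg, seconds_since_subtest_start) are hypothetical, and the regular expression assumes only the "<level>[ seconds] message" layout seen in these excerpts.

```python
import re

# "<level>[  seconds.micros] message", as in the dmesg excerpts quoted in this bug.
DMESG_RE = re.compile(r"^<(\d)>\[\s*(\d+\.\d+)\]\s(.*)$")


def parse_dmesg(line):
    """Return (level, timestamp_seconds, message), or None if the line does not match."""
    match = DMESG_RE.match(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2)), match.group(3)


def seconds_since_subtest_start(lines):
    """Seconds between the last "starting subtest" line and the last logged line."""
    start = last = None
    for line in lines:
        parsed = parse_dmesg(line)
        if parsed is None:
            continue
        _, stamp, message = parsed
        if "starting subtest" in message:
            start = stamp
        last = stamp
    if start is None:
        return None
    return last - start


# Excerpt copied from the dmesg quoted in this comment.
excerpt = [
    "<7>[   87.178419] [IGT] gem_ctx_switch: starting subtest basic-all-heavy",
    "<6>[  186.052216] perf: interrupt took too long (8296 > 8205), lowering kernel.perf_event_max_sample_rate to 24000",
    "<6>[  346.791638] perf: interrupt took too long (10381 > 10370), lowering kernel.perf_event_max_sample_rate to 19000",
]

print(round(seconds_since_subtest_start(excerpt), 1))
```

For this excerpt the gap is roughly 260 seconds between the subtest start and the last logged message.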
Comment 9 Marta Löfstedt 2018-03-20 06:44:28 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-bdw-gvtdvm/igt@gem_exec_reuse@contexts.html

run.log:
running: igt/gem_exec_reuse/contexts

[06/97] skip: 2, pass: 3, fail: 1 - 
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_3/fi-bdw-gvtdvm/11 : FAILURE
CI_IGT_test runtime 462 seconds
Rebooting fi-bdw-gvtdvm

Last dmesg:
<7>[   47.105282] [IGT] gem_exec_reuse: executing
<4>[   47.156940] Setting dangerous option reset - tainting kernel
<6>[   47.222774] perf: interrupt took too long (5310 > 5207), lowering kernel.perf_event_max_sample_rate to 37000
<7>[   48.749247] [IGT] gem_exec_reuse: starting subtest contexts
<6>[   56.215270] perf: interrupt took too long (6707 > 6637), lowering kernel.perf_event_max_sample_rate to 29000
<6>[   67.716713] perf: interrupt took too long (8462 > 8383), lowering kernel.perf_event_max_sample_rate to 23000
Comment 10 Jani Saarinen 2018-05-09 14:37:56 UTC
I guess this can be closed?
Comment 11 Jani Saarinen 2018-05-09 14:39:52 UTC
Not seen on CI lately.

