Bug 104108 - [CI][SKL only] igt@* - incomplete - timeout/system hang?
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Francesco Balestrieri
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks: 105984
Reported: 2017-12-05 13:24 UTC by Marta Löfstedt
Modified: 2018-12-09 18:42 UTC (History)

See Also:
i915 platform: SKL
i915 features: GEM/Other


Description Marta Löfstedt 2017-12-05 13:24:45 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3458/fi-skl-6260u/igt@gem_exec_suspend@basic-s3.html

last dmesg:
<4>[  215.690278] Setting dangerous option reset - tainting kernel
<7>[  215.691877] [IGT] gem_exec_suspend: starting subtest basic-S3
<6>[  216.863865] PM: suspend entry (deep)

run.log:
running: igt/gem_exec_suspend/basic-s3

[107/288] skip: 3, pass: 104 \        
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3458/fi-skl-6260u/0 : FAILURE
CI_IGT_test runtime 551 seconds
Rebooting fi-skl-6260u
Comment 1 Marta Löfstedt 2017-12-05 13:26:50 UTC
Note that there has been a big increase in the number of non-obvious incompletes since 4.15.0-rc1, so this is potentially a collector bug for all these new occurrences on SKL.
Comment 2 Elizabeth 2017-12-05 15:40:30 UTC
Raising priority since this affects BAT.
Comment 4 Marta Löfstedt 2017-12-11 07:46:16 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3489/fi-skl-gvtdvm/igt@gem_busy@basic-hang-default.html

dmesg:
<7>[   29.058808] [IGT] gem_busy: executing
<4>[   29.095607] Setting dangerous option reset - tainting kernel
<4>[   29.100201] Setting dangerous option reset - tainting kernel
<7>[   29.100260] [IGT] gem_busy: starting subtest basic-hang-default

run.log:
run.log doesn't have any results. This indicates a network issue and a forced reboot from Jenkins.
Comment 5 Marta Löfstedt 2018-01-04 13:14:15 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4113/fi-skl-6770hq/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html

last dmesg:
<7>[  411.893036] [drm:drm_mode_addfb2] [FB:79]
<7>[  411.921265] [drm:drm_mode_setcrtc] [CRTC:57:pipe C]
<7>[  411.921297] [drm:drm_mode_setcrtc] [CONNECTOR:73:DP-3]

run.log:
running: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-a
[243/288] skip: 16, pass: 227 \                        
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3596/fi-skl-6770hq/0 : FAILURE
CI_IGT_test runtime 596 seconds
Rebooting fi-skl-6770hq

pstore is just a bunch of:
 <0>[  492.322450] gem_sync-3944    3..s1 268534345us : execlists_submission_tasklet: vcs0 in[0]:  ctx=3.1, seqno=2fe4
Comment 6 Marta Löfstedt 2018-01-09 14:23:55 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3613/fi-skl-6700k2/igt@kms_chamelium@common-hpd-after-suspend.html

run.log:
running: igt/kms_chamelium/common-hpd-after-suspend

[206/288] skip: 16, pass: 189, dmesg-warn: 1 -     
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3613/fi-skl-6700k2/0 : FAILURE
CI_IGT_test runtime 551 seconds
Rebooting fi-skl-6700k2

last dmesg:
<7>[  298.032273] [drm:drm_mode_debug_printmodeline] Modeline 112:"640x480" 60 25175 640 656 752 800 480 490 492 525 0x40 0xa
<7>[  298.032274] [drm:drm_mode_debug_printmodeline] Modeline 113:"720x400" 70 28320 720 738 846 900 400 412 414 449 0x40 0x6
<6>[  302.621730] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Comment 7 Marta Löfstedt 2018-01-15 07:13:40 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3630/fi-skl-6600u/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html

run.log:
running: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-c

[245/288] skip: 24, pass: 221 /                        
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3630/fi-skl-6600u/0 : FAILURE
CI_IGT_test runtime 848 seconds
Rebooting fi-skl-6600u

from dmesg:
<7>[  406.874469] [IGT] kms_pipe_crc_basic: starting subtest suspend-read-crc-pipe-A
...
<7>[  417.320741] [IGT] kms_pipe_crc_basic: starting subtest suspend-read-crc-pipe-C
...
Comment 8 Marta Löfstedt 2018-01-25 12:46:19 UTC
This is a metabug to capture all incompletes on SKL that have no obvious reason.
Comment 9 Marta Löfstedt 2018-02-13 07:57:19 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3755/fi-skl-6260u/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

run.log:
[243/288] skip: 16, pass: 227 -
running: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-a
...
Completed CI_IGT_test CI_DRM_3755/fi-skl-6260u/0 : FAILURE
CI_IGT_test runtime 842 seconds
Rebooting fi-skl-6260u

last dmesg:
<7>[  364.222124] [drm:drm_mode_addfb2] [FB:84]
<7>[  364.232369] [drm:drm_mode_setcrtc] [CRTC:47:pipe B]
<7>[  364.232444] [drm:drm_mode_setcrtc] [CONNECTOR:59:HDMI-A-1]

There is a big time gap between the 842 seconds of runtime in run.log and the last timestamp in dmesg. Also, according to dmesg the next subtest had already started:
<7>[  364.055900] [IGT] kms_pipe_crc_basic: starting subtest suspend-read-crc-pipe-B
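The gap described in this comment can be estimated mechanically. The following is a minimal sketch, assuming the dmesg line format quoted in this report (`<level>[ seconds.microseconds] message`) and the `CI_IGT_test runtime N seconds` line from run.log. The helper names `last_dmesg_timestamp` and `runlog_runtime` are hypothetical and not part of any CI tooling. The comparison is only rough: dmesg timestamps count from kernel boot, while the run.log runtime presumably counts from the start of the CI job.

```python
import re

# Kernel log prefix as quoted in this report, e.g. "<7>[  364.222124] ..."
DMESG_TS = re.compile(r"^<\d+>\[\s*(\d+\.\d+)\]")
# Runtime line as quoted from run.log
RUNTIME = re.compile(r"CI_IGT_test runtime (\d+) seconds")


def last_dmesg_timestamp(dmesg_lines):
    """Return the largest kernel timestamp (seconds since boot), or None."""
    stamps = []
    for line in dmesg_lines:
        match = DMESG_TS.match(line)
        if match:
            stamps.append(float(match.group(1)))
    return max(stamps) if stamps else None


def runlog_runtime(runlog_text):
    """Return the runtime in seconds reported by run.log, or None."""
    match = RUNTIME.search(runlog_text)
    return int(match.group(1)) if match else None


# Excerpts copied from this comment
dmesg = [
    "<7>[  364.222124] [drm:drm_mode_addfb2] [FB:84]",
    "<7>[  364.232369] [drm:drm_mode_setcrtc] [CRTC:47:pipe B]",
    "<7>[  364.232444] [drm:drm_mode_setcrtc] [CONNECTOR:59:HDMI-A-1]",
]
runlog = "CI_IGT_test runtime 842 seconds"

last_ts = last_dmesg_timestamp(dmesg)
runtime = runlog_runtime(runlog)
gap = runtime - last_ts  # seconds with no kernel log output before the reboot
print(f"last dmesg: {last_ts:.1f}s, runtime: {runtime}s, gap: {gap:.1f}s")
```

A large gap like this suggests the machine stopped logging long before Jenkins gave up and rebooted it, which fits the timeout/system hang hypothesis in the bug title.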
Comment 10 Marta Löfstedt 2018-03-16 08:13:44 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-skl-guc/igt@drv_suspend@forcewake.html

Last dmesg:
<7>[   98.247784] [IGT] drv_suspend: executing
<7>[   98.254459] [IGT] drv_suspend: starting subtest forcewake
Comment 11 Marta Löfstedt 2018-03-19 07:47:51 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_2/fi-skl-guc/igt@pm_rpm@system-suspend.html

run.log:
running: igt/pm_rpm/system-suspend

[56/97] skip: 23, pass: 33 |      
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_2/fi-skl-guc/26 : FAILURE
CI_IGT_test runtime 155 seconds
Rebooting fi-skl-guc

Last dmesg:
<7>[   95.079461] [drm:drm_mode_setcrtc] [CRTC:69:pipe C]
<7>[   95.079899] [drm:intel_runtime_suspend [i915]] Suspending device
<7>[   95.087116] [drm:intel_runtime_suspend [i915]] Device suspended
Comment 14 Marta Löfstedt 2018-03-20 08:59:05 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-skl-6700k2/igt@pm_rpm@gem-execbuf-stress.html

run.log:
running: igt/pm_rpm/gem-execbuf-stress

[83/97] skip: 42, pass: 41 \          
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_3/fi-skl-6700k2/19 : FAILURE
CI_IGT_test runtime 543 seconds
Rebooting fi-skl-6700k2

from dmesg:
<7>[  308.801126] [IGT] pm_rpm: starting subtest gem-execbuf-stress
...
<7>[  321.101376] [drm:drm_dp_read_desc] DP branch: OUI 00-00-00 dev-ID  HW-rev 0.0 SW-rev 0.0 quirks 0x0000
<4>[  321.105864] i915 0000:00:02.0: DP-2: EDID is invalid:
<4>[  321.105873] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[  321.105878] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[  321.105883] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[  321.105888] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[  321.105892] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[  321.105897] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[  321.105902] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[  321.105906] 	[00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<7>[  321.106329] [drm:drm_helper_hpd_irq_event] [CONNECTOR:88:DP-2] status updated from connected to connected
Comment 15 Marta Löfstedt 2018-03-23 11:39:43 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4380/fi-skl-guc/igt@gem_exec_suspend@basic-s4-devices.html

run.log:
pass: igt/gem_exec_suspend/basic-s4-devices

[109/285] skip: 11, pass: 98 |
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3970/fi-skl-guc/0 : FAILURE
CI_IGT_test runtime 353 seconds
Rebooting fi-skl-guc

Last dmesg:
<7>[  189.284331] [IGT] gem_exec_suspend: starting subtest basic-S4-devices
<6>[  189.778117] PM: hibernation entry
<6>[  189.778505] PM: Syncing filesystems ... 

pstore:
<0>[  286.226676] gem_exec-3154    7.... 189601617us : __i915_request_add: vcs0 fence 2781a:1009
...
<0>[  286.381195] gem_exec-3154    2.... 195965022us : i915_request_retire: vecs0(2050) fence 2781e:2048, global_seqno 2050

followed by just the sysrq-trigger backtrace.
Comment 16 Marta Löfstedt 2018-04-04 06:05:55 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_9/fi-skl-6700k2/igt@kms_cursor_crc@cursor-128x128-suspend.html

run.log:
running: igt/kms_cursor_crc/cursor-128x128-suspend

[90/98] skip: 43, pass: 47 -                      
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_9/fi-skl-6700k2/34 : FAILURE
CI_IGT_test runtime 441 seconds
Rebooting fi-skl-6700k2

Last dmesg:
<7>[  293.235418] [IGT] kms_cursor_crc: starting subtest cursor-128x128-suspend
...
<7>[  293.389918] [drm:intel_hpd_irq_handler [i915]] digital hpd port E - short
<7>[  293.390160] [drm:intel_dp_hpd_pulse [i915]] got hpd irq on port E - short
<7>[  293.390893] [drm:intel_dp_read_dpcd [i915]] DPCD: 11 0a 82 01 00 03 01 01 02 00 00 00 00 00 00

pstore would require examination by a GEM person:
<0>[  369.661015] gem_exec-3672    2.... 193363829us : i915_request_retire: vecs0 fence 2525:16385, global_seqno 16385, current 16385
<0>[  369.661039] gem_exec-3672    2.... 193363842us : __i915_request_add: vecs0 fence 2525:16386
...
<0>[  369.668912] kworker/-1425    2.... 295817544us : i915_request_retire: vecs0 fence 1a:10, global_seqno 2, current 2
<0>[  369.668915] ---------------------------------
<0>[  369.668918] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Comment 17 Marta Löfstedt 2018-04-04 06:09:46 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_11/fi-skl-6700k2/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html

run.log:
running: igt/kms_plane/plane-panning-bottom-right-suspend-pipe-b-planes

[51/98] skip: 19, pass: 32 \                                           
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_11/fi-skl-6700k2/31 : FAILURE
CI_IGT_test runtime 600 seconds
Rebooting fi-skl-6700k2

dmesg:
<7>[  231.028331] [IGT] kms_plane: starting subtest plane-panning-bottom-right-suspend-pipe-B-planes
...
<4>[  232.323828] i915 0000:00:02.0: DP-2: EDID is invalid:
...
<6>[  232.703440] PM: suspend exit
<7>[  232.741154] [drm:drm_mode_addfb2] [FB:105]
<7>[  232.755515] [drm:drm_mode_setcrtc] [CRTC:55:pipe B]
<7>[  232.755539] [drm:drm_mode_setcrtc] [CONNECTOR:71:HDMI-A-1]

pstore needs examination by a GEM person:
<0>[  308.291085] kworker/-1661    4.... 227927595us : i915_request_retire: rcs0 fence 5ae:167, global_seqno 10637, current 10661
...
<0>[  308.297870] kworker/-1608    3.... 234894287us : i915_request_retire: vecs0 fence 1a:24, global_seqno 2, current 2
<0>[  308.297874] ---------------------------------
<0>[  308.297877] Kernel Offset: 0x16000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Comment 18 Marta Löfstedt 2018-04-04 06:18:14 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_11/fi-skl-6260u/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes.html

run.log:
running: igt/kms_plane/plane-panning-bottom-right-suspend-pipe-c-planes

[95/98] skip: 52, pass: 43 \                                           
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_11/fi-skl-6260u/23 : FAILURE
CI_IGT_test runtime 844 seconds
Rebooting fi-skl-6260u

dmesg:
<7>[  492.786941] [IGT] kms_plane: starting subtest plane-panning-bottom-right-suspend-pipe-C-planes
...
<6>[  494.318310] PM: suspend exit
<7>[  494.358475] [drm:drm_mode_addfb2] [FB:114]
<7>[  494.384451] [drm:drm_mode_setcrtc] [CRTC:69:pipe C]
<7>[  494.384482] [drm:drm_mode_setcrtc] [CONNECTOR:71:HDMI-A-1]
Comment 25 Martin Peres 2018-05-03 14:16:07 UTC
Since these are mostly GEM tests, let's move it to GEM/Other.
Comment 28 Martin Peres 2018-05-28 09:48:09 UTC
Seems like igt@perf@ is problematic for fi-skl-6600u: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_37/fi-skl-6600u/igt@perf@buffer-fill.html
Comment 34 Martin Peres 2018-08-07 12:48:05 UTC
= fi-skl-6700hq =

== Killed by owatch ==

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_82/fi-skl-6700hq/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-blt.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-skl-6700hq/igt@kms_flip@flip-vs-panning.html


== Some pstore ==

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-skl-6700hq/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-indfb-draw-mmap-gtt.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-skl-6700hq/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-mmap-gtt.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-skl-6700hq/igt@gem_userptr_blits@process-exit-busy.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-skl-6700hq/igt@gem_exec_store@pages-blt.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-skl-6700hq/igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-blt.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_84/fi-skl-6700hq/igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-indfb-draw-mmap-wc.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_85/fi-skl-6700hq/igt@kms_busy@extended-pageflip-hang-newfb-render-b.html


== No pstore ==

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_83/fi-skl-6770hq/igt@gem_render_copy@y-tiled-ccs-to-y-tiled.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_89/fi-skl-6770hq/igt@gem_render_copy@y-tiled-ccs-to-linear.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_88/fi-skl-6770hq/igt@gem_render_copy@y-tiled-ccs-to-x-tiled.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_87/fi-skl-6770hq/igt@gem_render_copy@y-tiled-ccs-to-linear.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_86/fi-skl-6770hq/igt@perf@gen8-unprivileged-single-ctx-counters.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_89/fi-skl-6700k2/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-indfb-plflip-blt.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-skl-6700hq/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-render.html
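Result links like the ones in this comment can be grouped by machine automatically. The following is a minimal sketch, assuming the URL layout seen throughout this bug (`.../tree/drm-tip/<run>/<machine>/igt@<test>@<subtest>.html`). The function names `parse_result_url` and `group_by_machine` are hypothetical helpers, not part of the CI system.

```python
import re
from collections import defaultdict

# URL layout assumed from the links quoted in this bug
RESULT_URL = re.compile(
    r"/tree/drm-tip/(?P<run>[^/]+)/(?P<machine>[^/]+)"
    r"/igt@(?P<test>[^@]+)@(?P<subtest>[^/]+)\.html$"
)


def parse_result_url(url):
    """Split a CI result URL into run, machine, test and subtest."""
    match = RESULT_URL.search(url)
    return match.groupdict() if match else None


def group_by_machine(urls):
    """Map each machine name to the list of test@subtest names seen on it."""
    groups = defaultdict(list)
    for url in urls:
        info = parse_result_url(url)
        if info is not None:
            groups[info["machine"]].append(f'{info["test"]}@{info["subtest"]}')
    return dict(groups)


# Three links copied from this comment
urls = [
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-skl-6700hq/igt@kms_flip@flip-vs-panning.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_83/fi-skl-6770hq/igt@gem_render_copy@y-tiled-ccs-to-y-tiled.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_89/fi-skl-6700k2/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-indfb-plflip-blt.html",
]

by_machine = group_by_machine(urls)
for machine, tests in sorted(by_machine.items()):
    print(machine, tests)
```

Grouping this way makes it easier to see which machine names actually appear under each heading.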
Comment 56 Francesco Balestrieri 2018-11-23 10:59:13 UTC
Setting to medium. Nothing to do directly here, other than hoping other bug fixes will help.

