CI_DRM_3265 GLK-shards igt@gem_softpin@noreloc-S3 https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3265/shard-glkb3/igt@gem_softpin@noreloc-S3.html Note bootlog start at: <7>[ 548.719874] [drm:edp_panel_vdd_off_sync [i915]] Turning eDP port A VDD off i.e there has been at least one shard of tests run before this on on this boot. last dmesg: <6>[ 1005.896525] PM: suspend entry (deep) run.log: [13/73] skip: 5, pass: 7, dmesg-warn: 1 / FATAL: command execution failed java.io.EOFException there is: [ 2603.510504] vblank not available on crtc 0, ret=-22 [ 2603.510562] ------------[ cut here ]------------ [ 2603.510585] WARNING: CPU: 2 PID: 5328 at drivers/gpu/drm/drm_vblank.c:1042 drm_wait_one_vblank+0x160/0x1b0 in this dmesg.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3267/shard-glkb4/igt@kms_frontbuffer_tracking@psr-shrfb-scaledprimary.html The backtrace has been cut-off: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3267/shard-glkb4/dmesg-1508445725_Oops_1.log last dmesg: <7>[ 2884.501987] [IGT] kms_plane_multiple: exiting, ret=77 <7>[ 2884.658294] [IGT] kms_frontbuffer_tracking: executing then stray. run.log: [22/72] skip: 5, pass: 15, fail: 2 - FATAL: command execution failed java.io.EOFException dmesg start at: <7>[ 2753.040777] [IGT] kms_frontbuffer_tracking: executing so other shards have been run on this machine before.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3268/shard-glkb1/igt@gem_eio@in-flight-contexts.html last demesg followed by stray: <7>[ 462.664336] [drm:gen8_init_common_ring [i915]] Execlists enabled for vcs0 <7>[ 462.664534] [drm:gen8_init_common_ring [i915]] Execlists enabled for vecs0 <4>[ 462.689948] Setting dangerous option reset - tainting kernel otherwise no *ERROR* in dmesg run-log: [40/72] skip: 18, pass: 16, fail: 6 | FATAL: command execution failed java.io.EOFException
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3275/shard-glkb3/igt@gem_ppgtt@blt-vs-render-ctx0.html last dmesg: <7>[ 1175.713017] [IGT] gem_ppgtt: executing <7>[ 1175.737657] [IGT] gem_ppgtt: starting subtest blt-vs-render-ctx0 <6>[ 1256.231064] perf: interrupt took too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 run.log: [33/73] skip: 11, pass: 19, fail: 3 / FATAL: command execution failed java.io.EOFException
CI_DRM_3286 shard-glkb1 igt@gem_cpu_reloc@full First dmesg: <5>[ 1648.677701] owatch: Using watchdog device /dev/watchdog0 Last dmesg: <7>[ 2012.063887] [IGT] gem_cpu_reloc: starting subtest full run.log: [33/73] skip: 11, pass: 20, fail: 2 / FATAL: command execution failed java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638) https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3286/shard-glkb1/igt@gem_cpu_reloc@full.html
CI_DRM_3286 shard-glkb2 igt@gem_tiled_partial_pwrite_pread@reads First dmesg: <5>[ 806.330960] owatch: Using watchdog device /dev/watchdog0 Last dmesg: <7>[ 997.929771] [IGT] gem_tiled_partial_pwrite_pread: executing [00/73] | FATAL: command execution failed java.io.EOFException https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3286/shard-glkb2/igt@gem_tiled_partial_pwrite_pread@reads.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3289/shard-glkb1/igt@gem_ringfill@basic-default-hang.html First dmesg: <5>[ 3379.148765] owatch: Using watchdog device /dev/watchdog0 Last dmesg: <7>[ 3694.039373] [IGT] gem_ringfill: executing Followed by "stay". runtimes log: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3289/shard-glkb1/runtimes31.txt => ~5 minutes Note dmesg timestamps are not wall time, the runtime should be. Also, Arek just fixed this feature with runtime just for the tests in each shard, thanks Arek! run.log: [11/72] skip: 3, pass: 8 \ FATAL: command execution failed java.io.EOFException From this I draw the conclusion that this was a system hang.
I will also file incomplete timeout/system hang on fi-glk-dsi on this bug. However, note bug 103514 which is a full scenario where we start with wedged GPU and then end up in incomplete. This is not the case for this: CI_DRM_3305 fi-glk-dsi igt@gem_ctx_switch@basic-default-heavy last dmesg: <7>[ 75.283189] [IGT] gem_ctx_switch: executing Followed by stry. run.log: [031/289] skip: 9, pass: 22 \ FATAL: command execution failed java.io.EOFException CI_IGT_test runtime 175 seconds So, this looks like system hang. https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3305/fi-glk-dsi/igt@gem_ctx_switch@basic-default-heavy.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3308/shard-glkb4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt.html First dmesg: <5>[ 1593.174703] owatch: Using watchdog device /dev/watchdog0 <5>[ 1593.174783] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 1593.177277] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) Last dmesg: <7>[ 1979.031808] [drm:drm_helper_hpd_irq_event] [CONNECTOR:89:DP-1] status updated from disconnected to disconnected <7>[ 1979.183337] [IGT] kms_frontbuffer_tracking: executing Followed by stray. run.log: [55/73] skip: 12, pass: 38, dmesg-warn: 1, fail: 4 \ FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3308@shard-glkb4 : FAILURE CI_IGT_test runtime 463 seconds
CI_DRM_3312 fi-glk-dsi igt@gem_exec_flush@basic-wb-set-default dmesg: <5>[ 43.114273] owatch: Using watchdog device /dev/watchdog0 <5>[ 43.114505] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 43.116098] owatch: timeout for /dev/watchdog0 set to 100 (requested 100) ... <7>[ 229.819324] [IGT] gem_exec_flush: executing run.log [073/289] skip: 16, pass: 57 / FATAL: command execution failed java.io.EOFException .... Completed CI_IGT_test CI_DRM_3312@fi-glk-dsi : FAILURE CI_IGT_test runtime 286 seconds looks like system hang. https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3312/fi-glk-dsi/igt@gem_exec_flush@basic-wb-set-default.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3328/shard-glkb3/igt@gem_exec_blt@cold-max.html dmesg: <5>[ 1760.361917] owatch: Using watchdog device /dev/watchdog0 <5>[ 1760.361996] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 1760.364687] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... no obvious *ERROR* ... <7>[ 1985.345189] [IGT] gem_exec_params: executing <7>[ 1985.372299] [IGT] gem_exec_params: starting subtest cliprects_ptr-dirt <7>[ 1985.372502] [IGT] gem_exec_params: exiting, ret=0 Followed by stray. run.log: running: igt/kms_flip/vblank-vs-modeset-suspend-interruptible [34/73] skip: 16, pass: 16, fail: 2 - FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3328@shard-glkb3 : FAILURE CI_IGT_test runtime 515 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3326/shard-glkb3/igt@gem_softpin@softpin.html dmesg: <5>[ 3641.153567] owatch: Using watchdog device /dev/watchdog0 <5>[ 3641.153647] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 3641.156151] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... no *ERROR*s ... <7>[ 3974.363963] [IGT] gem_readwrite: exiting, ret=0 <7>[ 3974.540413] [IGT] gem_softpin: executing <7>[ 3974.555885] [IGT] gem_softpin: starting subtest softpin run.log: running: igt/gem_softpin/softpin [60/72] skip: 15, pass: 43, fail: 2 | FATAL: command execution failed ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3326@shard-glkb3 : FAILURE CI_IGT_test runtime 514 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3334/shard-glkb1/igt@perf@polling.html dmesg: <5>[ 2456.779388] owatch: Using watchdog device /dev/watchdog0 <5>[ 2456.779456] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 2456.780498] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 2705.896434] [drm:intel_power_well_disable [i915]] disabling power well 2 <7>[ 2705.896455] [drm:drm_helper_hpd_irq_event] [CONNECTOR:89:DP-1] status updated from disconnected to disconnected <7>[ 2706.025517] [IGT] perf: executing run.log [04/72] skip: 1, pass: 2, fail: 1 | FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3334@shard-glkb1 : FAILURE CI_IGT_test runtime 339 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb2/igt@kms_frontbuffer_tracking@fbc-suspend.html dmesg: <5>[ 470.310793] owatch: Using watchdog device /dev/watchdog0 <5>[ 470.310870] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 470.312856] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 753.111566] [drm:intel_power_well_disable [i915]] disabling DC off <7>[ 753.111665] [drm:gen9_enable_dc5 [i915]] Enabling DC5 <7>[ 753.111759] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01 run.log: [30/73] skip: 12, pass: 15, fail: 3 - FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3339@shard-glkb2 : FAILURE CI_IGT_test runtime 387 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb5/igt@kms_flip@vblank-vs-modeset-suspend-interruptible.html above is just weird, impossible to decide which pstore that belongs to this run, timestamstes indicates runtimes over 24 hours.
this is from same run as previous: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb5/igt@kms_busy@basic-flip-b.html https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb5/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3340/shard-glkb5/igt@pm_rpm@system-suspend.html dmesg: <5>[ 1627.977850] owatch: Using watchdog device /dev/watchdog0 <5>[ 1627.977929] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 1627.979299] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 1713.425491] [drm:hsw_power_well_disable [i915]] power well 1 forced on (bios:1 driver:0 kvmr:0 debug:0) <7>[ 1713.425583] [drm:bxt_enable_dc9 [i915]] Enabling DC9 <7>[ 1713.425623] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 08 <7>[ 1713.425686] [drm:intel_runtime_suspend [i915]] Device suspended run.log [25/72] skip: 6, pass: 19 / FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3340@shard-glkb5 : FAILURE CI_IGT_test runtime 441 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3341/shard-glkb5/igt@gem_workarounds@suspend-resume-fd.html dmesg: <5>[ 25.911268] owatch: Using watchdog device /dev/watchdog0 <5>[ 25.911583] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 25.912970] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 65.759686] [IGT] gem_workarounds: executing <7>[ 65.782123] [IGT] gem_workarounds: starting subtest suspend-resume-fd <6>[ 65.828745] PM: suspend entry (deep) run.log: [11/72] skip: 3, pass: 7, fail: 1 \ FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3341@shard-glkb5 : FAILURE CI_IGT_test runtime 548 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3342/shard-glkb5/igt@drv_suspend@fence-restore-untiled.html dmesg: <5>[ 27.357629] owatch: Using watchdog device /dev/watchdog0 <5>[ 27.357971] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 27.359511] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 147.392151] [IGT] gem_exec_blt: exiting, ret=0 <7>[ 147.504680] [IGT] drv_suspend: executing <7>[ 147.529732] [IGT] drv_suspend: starting subtest fence-restore-untiled run.log [19/72] skip: 8, pass: 11 \ Build timed out (after 17 minutes). Marking the build as aborted. FATAL: command execution failed java.io.IOException ... Completed CI_IGT_test CI_DRM_3342@shard-glkb5 : FAILURE CI_IGT_test runtime 1028 seconds Note: Jenkins timeout after 17 minutes. Last dmesg is at 147 seconds!
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3343/shard-glkb5/igt@kms_cursor_crc@cursor-256x256-suspend.html dmesg: <5>[ 469.226040] owatch: Using watchdog device /dev/watchdog0 <5>[ 469.226117] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 469.227135] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 722.630630] [IGT] kms_cursor_crc: starting subtest cursor-256x256-suspend <7>[ 722.630905] [drm:drm_mode_addfb2] [FB:98] <7>[ 722.645396] [drm:drm_mode_setcrtc] [CRTC:42:pipe A] <7>[ 722.645453] [drm:drm_mode_setcrtc] [CONNECTOR:76:eDP-1] <7>[ 722.661151] [drm:drm_mode_setcrtc] [CRTC:58:pipe B] <7>[ 722.661687] [drm:drm_mode_setcrtc] [CRTC:74:pipe C] Folloed by "stray". run.log: [66/72] skip: 24, pass: 37, fail: 5 - FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3343@shard-glkb5 : FAILURE CI_IGT_test runtime 709 seconds
Rising priority since it is BAT.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3344/shard-glkb6/igt@kms_atomic_transition@plane-all-transition-nonblocking-fencing.html dmesg: <5>[ 1949.061995] owatch: Using watchdog device /dev/watchdog0 <5>[ 1949.062289] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 1949.062784] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 2230.806594] [drm:intel_power_well_disable [i915]] disabling power well 2 <7>[ 2230.807064] [drm:drm_helper_hpd_irq_event] [CONNECTOR:89:DP-1] status updated from disconnected to disconnected <7>[ 2231.121060] [IGT] kms_atomic_transition: executing Followed by "stray" run.log is weird doesn't show that any tests has been executed: running: igt/drv_suspend/forcewake-hibernate [00/72] | FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3344@shard-glkb6 : FAILURE CI_IGT_test runtime 380 seconds So, it looks like Jenkins lost connection and rebooted the machine. Or, 380 seconds is suspiciously similar to 370s watchdog timeout set by owatch. So, if
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3344/shard-glkb5/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html dmesg: <5>[ 25.149724] owatch: Using watchdog device /dev/watchdog0 <5>[ 25.149958] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 25.151533] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 558.126780] [drm:verify_connector_state.isra.74 [i915]] [CONNECTOR:76:eDP-1] <7>[ 558.126860] [drm:intel_atomic_commit_tail [i915]] [CRTC:58:pipe B] <7>[ 558.126977] [drm:verify_single_dpll_state.isra.75 [i915]] PORT PLL A run.log: [72/73] skip: 18, pass: 53, fail: 1 | FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3344@shard-glkb5 : FAILURE CI_IGT_test runtime 848 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3346/shard-glkb5/igt@gem_exec_suspend@basic-s3.html dmesg: <5>[ 27.257860] owatch: Using watchdog device /dev/watchdog0 <5>[ 27.258450] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 27.262545] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 75.294604] [IGT] gem_exec_suspend: executing <4>[ 75.312907] Setting dangerous option reset - tainting kernel <7>[ 75.315159] [IGT] gem_exec_suspend: starting subtest basic-S3 Followed by "stray" run.log: [08/72] skip: 3, pass: 5 | FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3346@shard-glkb5 : FAILURE CI_IGT_test runtime 550 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3347/shard-glkb5/igt@gem_workarounds@suspend-resume.html dmesg: <5>[ 390.922353] owatch: Using watchdog device /dev/watchdog0 <5>[ 390.922432] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 390.923494] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 586.552858] [IGT] gem_workarounds: executing <7>[ 586.576935] [IGT] gem_workarounds: starting subtest suspend-resume <6>[ 586.622368] PM: suspend entry (deep) run.log: [26/73] skip: 6, pass: 18, fail: 2 - FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3347@shard-glkb5 : FAILURE CI_IGT_test runtime 485 seconds
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3348/shard-glkb5/igt@kms_cursor_crc@cursor-64x64-suspend.html dmesg: <5>[ 23.777582] owatch: Using watchdog device /dev/watchdog0 <5>[ 23.777805] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 23.779501] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 422.722911] [drm:drm_mode_setcrtc] [CONNECTOR:76:eDP-1] <7>[ 422.734843] [drm:drm_mode_setcrtc] [CRTC:58:pipe B] <7>[ 422.735662] [drm:drm_mode_setcrtc] [CRTC:74:pipe C] run.log: [64/73] skip: 20, pass: 39, fail: 5 | FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3348@shard-glkb5 : FAILURE CI_IGT_test runtime 847 seconds Rebooting shard-glkb5
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3349/shard-glkb6/igt@gem_exec_create@madvise.html dmesg: <5>[ 647.132764] owatch: Using watchdog device /dev/watchdog0 <5>[ 647.132840] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 647.133861] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 875.417682] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:89:DP-1] disconnected <7>[ 875.419898] [IGT] kms_color: exiting, ret=77 <7>[ 875.612360] [IGT] gem_exec_create: executing run.log [16/72] skip: 7, pass: 8, fail: 1 | FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3349@shard-glkb6 : FAILURE CI_IGT_test runtime 342 seconds Rebooting shard-glkb6
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3350/shard-glkb5/igt@kms_flip@vblank-vs-dpms-suspend-interruptible.html <5>[ 685.382002] owatch: Using watchdog device /dev/watchdog0 <5>[ 685.382082] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 685.383453] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 956.156049] [drm:gen9_enable_dc5 [i915]] Enabling DC5 <7>[ 956.156118] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01 <7>[ 956.157209] [drm:intel_power_well_disable [i915]] disabling always-on run.log: [29/73] skip: 9, pass: 18, fail: 2 / Build timed out (after 17 minutes). Marking the build as aborted. FATAL: command execution failed java.io.IOException ... Completed CI_IGT_test CI_DRM_3350@shard-glkb5 : FAILURE CI_IGT_test runtime 1021 seconds Rebooting shard-glkb5
<5>[ 36.684057] owatch: Using watchdog device /dev/watchdog0 <5>[ 36.684332] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 36.686046] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 124.274454] [drm:intel_enable_ddi [i915]] Panel doesn't support DRRS <7>[ 124.291308] [drm:verify_connector_state.isra.74 [i915]] [CONNECTOR:76:eDP-1] <7>[ 124.291366] [drm:intel_atomic_commit_tail [i915]] [CRTC:74:pipe C] <7>[ 124.291479] [drm:verify_single_dpll_state.isra.75 [i915]] PORT PLL A run.log: [08/73] skip: 1, pass: 6, fail: 1 | FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3352@shard-glkb5 : FAILURE CI_IGT_test runtime 557 seconds Rebooting shard-glkb5
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3354/shard-glkb5/igt@drv_suspend@sysfs-reader.html dmesg: <5>[ 369.207913] owatch: Using watchdog device /dev/watchdog0 <5>[ 369.207980] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 369.209059] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 620.578814] [drm:intel_backlight_device_get_brightness [i915]] get backlight PWM = 96000 <7>[ 620.586526] [drm:intel_backlight_device_get_brightness [i915]] get backlight PWM = 96000 <6>[ 620.608421] PM: suspend entry (deep) run.log: running: igt/drv_suspend/sysfs-reader [39/73] skip: 16, pass: 21, fail: 2 \ FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3354@shard-glkb5 : FAILURE CI_IGT_test runtime 508 seconds Rebooting shard-glkb5
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3355/shard-glkb5/igt@kms_flip@vblank-vs-dpms-suspend.html dmesg: <5>[ 26.618706] owatch: Using watchdog device /dev/watchdog0 <5>[ 26.618929] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 26.621515] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 400.814160] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01 <7>[ 400.814405] [drm:intel_dp_hpd_pulse [i915]] ignoring long hpd on eDP port A <7>[ 400.815308] [drm:intel_power_well_disable [i915]] disabling always-on run.log: running: igt/kms_flip/vblank-vs-dpms-suspend [32/73] skip: 12, pass: 19, fail: 1 | FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3355@shard-glkb5 : FAILURE CI_IGT_test runtime 847 seconds Rebooting shard-glkb5
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3357/fi-glk-dsi/igt@gem_sync@basic-each.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3360/shard-glkb5/igt@pm_rpm@system-suspend-execbuf.html dmesg: <5>[ 30.844398] owatch: Using watchdog device /dev/watchdog0 <5>[ 30.844728] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 30.846748] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 108.387979] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 08 <7>[ 108.388079] [drm:intel_runtime_suspend [i915]] Device suspended <6>[ 108.552651] PM: suspend entry (deep) run.log running: igt/pm_rpm/system-suspend-execbuf [20/73] skip: 5, pass: 14, fail: 1 | FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3360/shard-glkb5/21 : FAILURE CI_IGT_test runtime 550 seconds Rebooting shard-glkb5
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_3988/shard-glkb5/igt@drv_suspend@forcewake.html dmesg: <5>[ 25.000516] owatch: Using watchdog device /dev/watchdog0 <5>[ 25.000738] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 25.002445] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 164.159097] [IGT] drv_suspend: executing <7>[ 164.180418] [IGT] drv_suspend: starting subtest forcewake <6>[ 164.229837] PM: suspend entry (deep) run.log: running: igt/drv_suspend/forcewake [28/73] skip: 9, pass: 18, fail: 1 | FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3353@shard-glkb5 : FAILURE CI_IGT_test runtime 546 seconds Rebooting shard-glkb5
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_3990/shard-glkb5/igt@pm_rpm@system-suspend-modeset.html dmesg: <5>[ 26.165090] owatch: Using watchdog device /dev/watchdog0 <5>[ 26.165660] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 26.167442] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 210.371312] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 08 <7>[ 210.371447] [drm:intel_runtime_suspend [i915]] Device suspended <6>[ 210.553751] PM: suspend entry (deep) run.log: running: igt/pm_rpm/system-suspend-modeset [31/73] skip: 12, pass: 19 \ Build timed out (after 17 minutes). Marking the build as aborted. FATAL: command execution failed java.io.IOException
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_3993/fi-glk-dsi/igt@gem_ctx_basic.html dmesg: <5>[ 38.501306] owatch: Using watchdog device /dev/watchdog0 <5>[ 38.501789] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 38.503219] owatch: timeout for /dev/watchdog0 set to 100 (requested 100) ... <7>[ 69.732772] [IGT] gem_cs_tlb: executing <7>[ 69.760911] [IGT] gem_cs_tlb: starting subtest basic-default <7>[ 74.449651] [IGT] gem_cs_tlb: exiting, ret=0 Followed by "stray" run.log: running: igt/gem_ctx_basic [024/289] skip: 9, pass: 15 | FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3364/fi-glk-dsi/0 : FAILURE CI_IGT_test runtime 549 seconds Rebooting fi-glk-dsi
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3364/shard-glkb2/igt@kms_flip@blt-wf_vblank-vs-dpms.html dmesg: <5>[ 1956.055800] owatch: Using watchdog device /dev/watchdog0 <5>[ 1956.055899] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 1956.058593] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 2465.910711] [drm:intel_atomic_check [i915]] New voltage level calculated to be logical 13, actual 13 <7>[ 2465.910784] [drm:bxt_get_dpll [i915]] [CRTC:58:pipe B] using pre-allocated PORT PLL A <7>[ 2465.910848] [drm:intel_reference_shared_dpll [i915]] using PORT PLL A for pipe B Followed by "stray" run.log: running: igt/kms_flip/vblank-vs-modeset-suspend [15/72] skip: 4, pass: 10, fail: 1 \ FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3364/shard-glkb2/26 : FAILURE CI_IGT_test runtime 687 seconds Rebooting shard-glkb2
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_3997/shard-glkb6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-shrfb-draw-pwrite.html dmesg: <5>[ 1874.323543] owatch: Using watchdog device /dev/watchdog0 <5>[ 1874.323619] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 1874.324665] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 2378.064206] [drm:intel_atomic_commit_tail [i915]] [CRTC:42:pipe A] <7>[ 2378.064339] [drm:verify_single_dpll_state.isra.75 [i915]] PORT PLL A <7>[ 2378.250481] [IGT] kms_frontbuffer_tracking: executing run.log: running: igt/drv_suspend/forcewake-hibernate [12/73] skip: 1, pass: 9, fail: 2 | FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3368/shard-glkb6/6 : FAILURE CI_IGT_test runtime 500 seconds Rebooting shard-glkb6 run.log is weird it thinks it is still running igt/drv_suspend/forcewake-hibernate, allthough dmesg and results notice we have started kms_frontbuffer_tracking.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3373/shard-glkb6/igt@kms_busy@basic-modeset-c.html https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3373/shard-glkb3/igt@gem_ppgtt@blt-vs-render-ctxn.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3374/shard-glkb6/igt@kms_flip@flip-vs-panning-vs-hang.html dmesg: <5>[ 2808.875155] owatch: Using watchdog device /dev/watchdog0 <5>[ 2808.875235] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 2808.876269] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 3076.548712] [drm:gen8_irq_handler [i915]] hotplug event received, stat 0x00000008, dig 0x1a001818, pins 0x00000010 <7>[ 3076.548831] [drm:intel_hpd_irq_handler [i915]] digital hpd port A - long <7>[ 3076.548935] [drm:intel_hpd_irq_handler [i915]] Re "<7>[ 3076.548935] [drm:intel_hpd_irq_handler [i915]] Re" has clearly been cut-off bug stopped execution. run.log: running: igt/kms_flip/vblank-vs-modeset-suspend [06/75] skip: 1, pass: 4, fail: 1 - FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3374/shard-glkb6/32 : FAILURE CI_IGT_test runtime 378 seconds Rebooting shard-glkb6
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4010/shard-glkb5/igt@kms_fbcon_fbt@fbc-suspend.html dmesg: <5>[20391.277009] owatch: Using watchdog device /dev/watchdog0 <5>[20391.277089] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[20391.278213] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[20413.415648] [drm:verify_connector_state.isra.75 [i915]] [CONNECTOR:77:eDP-1] <7>[20413.415736] [drm:intel_atomic_commit_tail [i915]] [CRTC:43:pipe A] <7>[20413.415985] [drm:verify_single_dpll_state.isra.76 [i915]] PORT PLL A run.log: running: igt/kms_fbcon_fbt/fbc-suspend [09/75] skip: 1, pass: 8 / FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3378/shard-glkb5/20 : FAILURE CI_IGT_test runtime 278 seconds Rebooting shard-glkb5
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3383/fi-glk-dsi/igt@gem_exec_gttfill@basic.html This is a straing-up glk-dsi incomplete no GPU hung before: dmesg: <5>[ 24.027629] owatch: Using watchdog device /dev/watchdog0 <5>[ 24.028503] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 24.031168] owatch: timeout for /dev/watchdog0 set to 100 (requested 100) ... <7>[ 205.013427] [IGT] gem_exec_flush: starting subtest basic-wb-rw-default <7>[ 210.403714] [IGT] gem_exec_flush: exiting, ret=0 <7>[ 210.693921] [IGT] gem_exec_flush: executing Followed by "stray" run.log: running: igt/gem_exec_gttfill/basic [065/289] skip: 7, pass: 58 / FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3383/fi-glk-dsi/0 : FAILURE CI_IGT_test runtime 252 seconds Rebooting fi-glk-dsi
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3389/fi-glk-dsi/igt@gem_exec_reloc@basic-write-read-active.html dmesg: <5>[ 31.994582] owatch: Using watchdog device /dev/watchdog0 <5>[ 31.994798] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 31.997674] owatch: timeout for /dev/watchdog0 set to 100 (requested 100) ... <7>[ 252.326077] [IGT] gem_exec_reloc: starting subtest basic-write-cpu-active <4>[ 252.439204] Setting dangerous option reset - tainting kernel <7>[ 252.439892] [IGT] gem_exec_reloc: exiting, ret=0 Followed by "stray" run.log: running: igt/gem_exec_reloc/basic-write-read-active [097/289] skip: 10, pass: 87 / FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3389/fi-glk-dsi/0 : FAILURE CI_IGT_test runtime 314 seconds Rebooting fi-glk-dsi
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3379/shard-glkb3/igt@kms_flip@render-flip-vs-panning-interruptible.html dmesg: <5>[ 37.400972] owatch: Using watchdog device /dev/watchdog0 <5>[ 37.401191] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 37.402407] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 437.056593] [drm:drm_mode_setcrtc] [CONNECTOR:77:eDP-1] <7>[ 437.058583] [drm:drm_mode_setcrtc] [CRTC:43:pipe A] <7>[ 437.058654] [drm:drm_mode_setcrtc] [CONNECTOR:77:eDP-1] Followed by "stray". run.log: running: igt/gem_softpin/noreloc-s3 [42/75] skip: 16, pass: 23, fail: 3 - FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3379/shard-glkb3/1 : FAILURE CI_IGT_test runtime 548 seconds Rebooting shard-glkb3
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4011/shard-glkb6/igt@kms_render@direct-render.html dmesg: <5>[ 41.747115] owatch: Using watchdog device /dev/watchdog0 <5>[ 41.747344] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 41.748600] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 480.693681] [drm:intel_panel_enable_backlight [i915]] pipe C <7>[ 480.693736] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 67764 <7>[ 480.693799] [drm:intel_enable_ddi [i915]] Panel doesn't support DRRS run.log: running: igt/kms_flip/vblank-vs-dpms-suspend-interruptible [17/75] skip: 6, pass: 9, fail: 2 / FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3380/shard-glkb6/10 : FAILURE CI_IGT_test runtime 558 seconds Rebooting shard-glkb6
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3385/shard-glkb6/igt@gem_exec_create@forked.html dmesg: <5>[ 992.009972] owatch: Using watchdog device /dev/watchdog0 <5>[ 992.010074] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 992.011400] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 1589.117158] [drm:intel_power_well_disable [i915]] disabling DC off <7>[ 1589.117198] [drm:gen9_enable_dc5 [i915]] Enabling DC5 <7>[ 1589.117264] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01 run.log: running: igt/drv_suspend/fence-restore-tiled2untiled-hibernate [36/74] skip: 11, pass: 21, fail: 4 | FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3385/shard-glkb6/34 : FAILURE CI_IGT_test runtime 704 seconds Rebooting shard-glkb6
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3419/shard-glkb6/igt@gem_partial_pwrite_pread@write-uncached.html another system hang looking incomplete
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3442/shard-glkb6/igt@kms_atomic_transition@plane-all-transition.html dmesg: <5>[ 419.687780] owatch: Using watchdog device /dev/watchdog0 <5>[ 419.687938] owatch: Watchdog /dev/watchdog0 is a software watchdog <5>[ 419.689051] owatch: timeout for /dev/watchdog0 set to 370 (requested 370) ... <7>[ 815.647724] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01 <7>[ 816.771154] [IGT] gem_exec_blt: exiting, ret=0 <7>[ 817.163235] [IGT] kms_atomic_transition: executing Followed by "stray" run.log: running: igt/gem_workarounds/hibernate-resume-fd [02/75] pass: 2 - FATAL: command execution failed java.io.EOFException ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3442/shard-glkb6/8 : FAILURE CI_IGT_test runtime 432 seconds Rebooting shard-glkb6 something weird is going on since the test reported doesn't match running test in run.log.
nothing obviously suspicious in dmesg, no timeout indication in run.log https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3451/shard-glkb6/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-fullscreen.html https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3451/shard-glkb5/igt@gem_eio@in-flight-suspend.html https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4032/shard-glkb6/igt@gem_pwrite_pread@display-pwrite-blt-gtt_mmap-performance.html https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4034/shard-glkb6/igt@pm_rpm@drm-resources-equal.html https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4036/shard-glkb6/igt@kms_cursor_legacy@pipe-a-forked-move.html
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4033/shard-glkb1/igt@kms_flip@flip-vs-modeset-vs-hang-interruptible.html weird looking run.log: [66/76] skip: 21, pass: 37, fail: 8 - Build timed out (after 17 minutes). Marking the build as aborted. pass: igt/kms_cursor_crc/cursor-256x256-suspend [67/76] skip: 21, pass: 38, fail: 8 - running: igt/gem_partial_pwrite_pread/writes-after-reads [67/76] skip: 21, pass: 38, fail: 8 \ Set build name. New build name is 'CI_DRM_3448/shard-glkb1/4' Variable with name 'BUILD_DISPLAY_NAME' already exists, current value: 'CI_DRM_3448/shard-glkb1/4', new value: 'CI_DRM_3448/shard-glkb1/4' SSH: Connecting from host [glkb1] SSH: Connecting with configuration [archive] ... pass: igt/gem_partial_pwrite_pread/writes-after-reads [68/76] skip: 21, pass: 39, fail: 8 \ running: igt/kms_flip/flip-vs-modeset-vs-hang-interruptible [68/76] skip: 21, pass: 39, fail: 8 | SSH: Disconnecting configuration [archive] ... SSH: Transferred 75 file(s) Build was aborted Notifying upstream projects of job completion Finished: ABORTED Completed CI_IGT_test CI_DRM_3448/shard-glkb1/4 : ABORTED CI_IGT_test runtime 1084 seconds Rebooting shard-glkb1 from dmesg: <6>[18726.272872] r8169 0000:01:00.0 enp1s0: link down
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3460/shard-glkb6/igt@sw_sync@sync_expired_merge.html last dmesg: <7>[ 446.079842] [drm:intel_power_well_disable [i915]] disabling AUX A <7>[ 446.079875] [drm:intel_power_well_disable [i915]] disabling DC off <7>[ 446.079908] [drm:gen9_enable_dc5 [i915]] Enabling DC5 <7>[ 446.079938] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01 run.log: running: igt/gem_workarounds/hibernate-resume-context [19/75] skip: 7, pass: 11, fail: 1 \ FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3460/shard-glkb6/29 : FAILURE CI_IGT_test runtime 552 seconds Rebooting shard-glkb6
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3469/shard-glkb6/igt@gem_eio@execbuf.html last demsg: <7>[ 528.567673] [drm:drm_mode_setcrtc] [CONNECTOR:77:eDP-1] <7>[ 528.631514] [IGT] kms_frontbuffer_tracking: starting subtest fbc-2p-primscrn-indfb-pgflip-blt <7>[ 528.648163] [IGT] kms_frontbuffer_tracking: exiting, ret=77 run.log: running: igt/gem_workarounds/hibernate-resume-fd [49/75] skip: 18, pass: 30, fail: 1 / FATAL: command execution failed ... Completed CI_IGT_test CI_DRM_3469/shard-glkb6/12 : FAILURE CI_IGT_test runtime 546 seconds Rebooting shard-glkb6
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3470/shard-glkb6/igt@kms_flip@absolute-wf_vblank.html Note that last entry was on gem_eio@execbug last dmesg: <7>[ 527.181131] [IGT] gem_eio: starting subtest execbuf ... <7>[ 543.973627] [IGT] gem_eio: exiting, ret=0 <4>[ 543.973798] Setting dangerous option reset - tainting kernel <7>[ 543.974365] [drm:i915_reset_device [i915]] resetting chip <5>[ 543.974467] i915 0000:00:02.0: Resetting chip after gpu hang <7>[ 543.975164] [drm:gen8_init_common_ring [i915]] Execlists enabled for rcs0 <7>[ 543.975237] [drm:init_workarounds_ring [i915]] rcs0: Number of context specific w/a: 11 <7>[ 543.975366] [drm:gen8_init_common_ring [i915]] Execlists enabled for bcs0 <7>[ 543.975497] [drm:gen8_init_common_ring [i915]] Execlists enabled for vcs0 <7>[ 543.975625] [drm:gen8_init_common_ring [i915]] Execlists enabled for vecs0 <7>[ 544.193668] [IGT] kms_flip: executing run.log: running: igt/gem_workarounds/hibernate-resume-fd [49/75] skip: 18, pass: 30, fail: 1 / FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3470/shard-glkb6/12 : FAILURE CI_IGT_test runtime 549 seconds Rebooting shard-glkb6
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3475/shard-glkb6/igt@gem_eio@throttle.html last dmesg: <7>[ 574.047845] [drm:verify_single_dpll_state.isra.78 [i915]] PORT PLL A <7>[ 574.047945] [drm:intel_power_well_disable [i915]] disabling power well 2 <7>[ 574.048018] [drm:intel_atomic_commit_tail [i915]] [CRTC:75:pipe C] run.log: running: igt/drv_suspend/fence-restore-tiled2untiled-hibernate [03/76] skip: 1, pass: 1, fail: 1 \ FATAL: command execution failed ... Finished: FAILURE Completed CI_IGT_test CI_DRM_3475/shard-glkb6/3 : FAILURE CI_IGT_test runtime 561 seconds Rebooting shard-glkb6 mismatch between affected test between run.log and results, indicating Jenkins lost contact with the system and rebooted.
From dmesg in: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3475/shard-glkb6/igt@gem_eio@throttle.html According to run.log the last test started here: <7>[ 61.411009] [IGT] drv_suspend: starting subtest fence-restore-tiled2untiled-hibernate ... <6>[ 63.869622] ACPI: Waking up from system sleep state S4 ... <3>[ 64.120472] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). then there is a lot of scattered spamming like: r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). ... <3>[ 80.713622] r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 (loop: 100, delay: 1). ... <6>[ 80.887584] r8169 0000:01:00.0 enp1s0: link up ... Then there is no more spamming from r8169. So the speculation is that networking went down after the hibernation, but why would Jenkins wait until last dmesg at: <7>[ 574.048018] [drm:intel_atomic_commit_tail [i915]] [CRTC:75:pipe C] to force reboot?
from: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3470/shard-glkb6/igt@kms_flip@absolute-wf_vblank.html run.log also has a hibernation test as the last one: running: igt/gem_workarounds/hibernate-resume-fd <6>[ 218.861995] r8169 0000:01:00.0 enp1s0: link down ... <6>[ 221.667729] r8169 0000:01:00.0 enp1s0: link up ... <7>[ 244.068400] [IGT] gem_workarounds: starting subtest hibernate-resume-fd ... <6>[ 246.353466] ACPI: Waking up from system sleep state S4 ... <3>[ 246.687298] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). ... then there is spamming of the same r8169 until: <3>[ 432.740403] r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 (loop: 100, delay: 1). ... <6>[ 432.901671] r8169 0000:01:00.0 enp1s0: link up then again quite long time until last dmesg: <7>[ 544.193668] [IGT] kms_flip: executing
this also has same pattern: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3469/shard-glkb6/igt@gem_eio@execbuf.html last test in run.log is a hibernate test: <7>[ 238.733881] [IGT] gem_workarounds: starting subtest hibernate-resume-fd ... <3>[ 241.264817] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). ... <6>[ 241.759128] r8169 0000:01:00.0 enp1s0: link up but here too it takes very long time for Jenkins to react <7>[ 528.648163] [IGT] kms_frontbuffer_tracking: exiting, ret=77
From IRC: <imre> marta_: looks like that r8169 message is an actual timeout error during IO and is silently converted to be a nop in case of IO write, or return ~0 in case of IO read. So definitely worth filing a bug for it. I filed: https://bugzilla.kernel.org/show_bug.cgi?id=198133
Here 2 more, with a hibernate test followed by spamming from r8169 and then incomplete where run.log doesn't match results: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3487/shard-glkb6/igt@kms_atomic_interruptible@legacy-cursor.html https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4050/shard-glkb6/igt@kms_cursor_crc@cursor-256x256-sliding.html
This doesn't match the hibernate/r8169 issue: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4050/shard-glkb4/igt@kms_cursor_legacy@pipe-c-forked-move.html last dmesg: <7>[ 113.613663] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:90:DP-1] disconnected <7>[ 113.615371] [IGT] kms_color: exiting, ret=77 <7>[ 113.716791] [IGT] kms_cursor_legacy: executing run.log: running: igt/kms_cursor_legacy/pipe-c-forked-move [15/76] skip: 5, pass: 10 \ FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3483/shard-glkb4/11 : FAILURE CI_IGT_test runtime 168 seconds Rebooting shard-glkb4
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4060/shard-glkb2/igt@kms_flip@flip-vs-panning-vs-hang-interruptible.html NOTE the pstores here doesn't match the run see bug 103706. Also, note last test in run.log doesn't match results: running: igt/kms_flip/vblank-vs-modeset-suspend from dmesg: <7>[ 404.526821] [IGT] kms_flip: starting subtest vblank-vs-modeset-suspend ... <7>[ 740.607775] [drm:intel_power_well_disable [i915]] disabling power well 2 <7>[ 740.607819] [drm:drm_helper_hpd_irq_event] [CONNECTOR:90:DP-1] status updated from disconnected to disconnected <7>[ 740.766938] [IGT] kms_flip: executing the r8169 logs aren't there however there are multiple <6>[ 408.191487] r8169 0000:01:00.0 enp1s0: link down <6>[ 411.300633] r8169 0000:01:00.0 enp1s0: link down <6>[ 414.563973] r8169 0000:01:00.0 enp1s0: link down <6>[ 420.674015] r8169 0000:01:00.0 enp1s0: link down <6>[ 423.890792] r8169 0000:01:00.0 enp1s0: link down before the next: <6>[ 426.686089] r8169 0000:01:00.0 enp1s0: link up
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3504/shard-glkb2/igt@gem_exec_flush@basic-uc-set-default.html Note this one also has vblank-vs-dpms-suspend as last in run.log running: igt/kms_flip/vblank-vs-dpms-suspend From dmesg: <7>[ 123.290625] [IGT] kms_flip: starting subtest vblank-vs-dpms-suspend ... <7>[ 469.167265] [IGT] gem_exec_parallel: starting subtest bsd1 <7>[ 469.170659] [IGT] gem_exec_parallel: exiting, ret=77 this also has repeated r8169 0000:01:00.0 enp1s0: link down before link up ~20 seconds later
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3511/shard-glkb5/igt@drv_suspend@fence-restore-tiled2untiled.html run.log: running: igt/drv_suspend/fence-restore-tiled2untiled [10/76] skip: 5, pass: 5 - FATAL: command execution failed java.io.EOFException ... Completed CI_IGT_test CI_DRM_3511/shard-glkb5/10 : FAILURE CI_IGT_test runtime 1022 seconds Rebooting shard-glkb5 look at very long runtime it is > 17 minutes, why is there no 17 minutes message in run.log from Jenkins? last dmesg: <7>[ 67.908753] [IGT] kms_atomic_transition: starting subtest 4x-modeset-transitions-nonblocking-fencing <7>[ 67.908906] [IGT] kms_atomic_transition: exiting, ret=77 <7>[ 68.020692] [IGT] drv_suspend: executing <7>[ 68.053358] [IGT] drv_suspend: starting subtest fence-restore-tiled2untiled only 68 seconds of dmesg! so my guess is that machine didn't come up after suspend, Jenkins lost contact and then maybe the 17 minutes is not triggering external reboot.
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4065/shard-glkb6/igt@gem_exec_nop@signal-all.html run.log doesn't match results: running: igt/drv_suspend/fence-restore-tiled2untiled-hibernate [17/76] skip: 5, pass: 11, fail: 1 / FATAL: command execution failed from dmesg: <3>[ 6580.916707] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). ... <7>[ 6821.707774] [IGT] gem_exec_nop: executing <4>[ 6821.736323] Setting dangerous option reset - tainting kernel <7>[ 6821.738213] [IGT] gem_exec_nop: starting subtest signal-all <7>[ 6824.273266] [drm:edp_panel_vdd_off_sync [i915]] Turning eDP port A VDD off <7>[ 6824.273340] [drm:edp_panel_vdd_off_sync [i915]] PP_STATUS: 0x80000008 PP_CONTROL: 0x00000067 <7>[ 6824.273380] [drm:intel_power_well_disable [i915]] disabling AUX A
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3536/shard-glkb6/igt@gem_sync@basic-store-all.html <3>[ 124.451973] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25).
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3543/shard-glkb6/igt@kms_busy@extended-modeset-hang-newfb-render-c.html dmesg: <3>[ 147.150675] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25).
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4100/shard-glkb6/igt@kms_cursor_legacy@all-pipes-torture-bo.html dmesg: <3>[ 282.485282] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25).
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3596/shard-glkb6/igt@gem_partial_pwrite_pread@write-snoop.html run.log doesn't match tests dmesg: running: igt/gem_workarounds/hibernate-resume-fd [11/77] skip: 3, pass: 7, fail: 1 \ FATAL: command execution failed dmesg: <3>[ 56.196118] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25).
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3602/shard-glkb6/igt@kms_cursor_legacy@pipe-a-torture-bo.html last test in run.log: running: igt/drv_suspend/forcewake-hibernate from dmesg: <3>[ 236.894770] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25).
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3621/shard-glkb4/igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels.html Weird the run.log is cut-off: [44/76] skip: 13, pass: 26, fail: 5 \ running: igt/kms_atomic_transition/plane-all-modeset-transition-fencing-internal-panels [44/76] skip: 13, pass: 26, fail: 5 | Maybe Jenkins issue
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4132/shard-glkb6/igt@gem_exec_suspend@basic-s4-devices.html last test in run.log: running: igt/gem_workarounds/hibernate-resume-context from dmesg: <7>[ 43.139336] [IGT] gem_workarounds: starting subtest hibernate-resume-context ... <6>[ 45.431230] r8169 0000:01:00.0: Refused to change power state, currently in D3 ... <3>[ 45.523188] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). ... <7>[ 374.124906] [IGT] kms_flip: exiting, ret=77 <7>[ 374.359417] [IGT] gem_exec_suspend: executing
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3627/shard-glkb2/igt@gem_mmap_gtt@forked-big-copy-odd.html run.log: running: igt/kms_flip/vblank-vs-modeset-suspend [11/76] skip: 2, pass: 9 \ FATAL: command execution failed ... Completed CI_IGT_test CI_DRM_3627/shard-glkb2/26 : FAILURE CI_IGT_test runtime 553 seconds Rebooting shard-glkb2 from dmesg: <7>[ 61.868080] [IGT] kms_flip: starting subtest vblank-vs-modeset-suspend ... <7>[ 81.558306] [IGT] kms_flip: exiting, ret=0 ... <7>[ 491.406449] [IGT] gem_mmap_gtt: executing
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3630/shard-glkb6/igt@prime_vgem@wait-bsd1.html run.log doesn't match dmesg dmesg: <7>[ 95.111865] [IGT] drv_suspend: starting subtest fence-restore-tiled2untiled-hibernate ... <3>[ 97.778523] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). ... <7>[ 555.970977] [IGT] gem_exec_nop: exiting, ret=0
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3652/shard-glkb2/igt@kms_flip@flip-vs-rmfb.html run.log: running: igt/kms_flip/flip-vs-rmfb [75/78] skip: 22, pass: 47, fail: 6 \ FATAL: command execution failed ... Completed CI_IGT_test CI_DRM_3652/shard-glkb2/6 : FAILURE CI_IGT_test runtime 548 seconds Rebooting shard-glkb2 no pstore. From dmesg: <7>[ 352.946787] [IGT] kms_flip: starting subtest flip-vs-rmfb ... this goes on for 10 seconds: <7>[ 354.387280] [drm:drm_mode_setcrtc] [CRTC:43:pipe A] <7>[ 354.387396] [drm:drm_mode_setcrtc] [CONNECTOR:77:eDP-1] <7>[ 354.404492] [drm:drm_mode_addfb2] [FB:97] ... <7>[ 364.087166] [drm:drm_mode_setcrtc] [CRTC:43:pipe A] <7>[ 364.087217] [drm:drm_mode_setcrtc] [CONNECTOR:77:eDP-1] <7>[ 364.104044] [drm:drm_mode_addfb2] [FB:100] I have seen that type of "spamming" before without going incomplete.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3657/shard-glkb6/igt@kms_flip@vblank-vs-modeset-rpm.html run.log doesn't match results: running: igt/gem_workarounds/hibernate-resume-context [40/78] skip: 16, pass: 22, fail: 2 | FATAL: command execution failed from dmesg: <7>[ 238.429987] [IGT] gem_workarounds: starting subtest hibernate-resume-context ... <3>[ 240.920300] r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). ... <7>[ 381.753176] [drm:intel_power_well_disable [i915]] disabling dpio-common-c <7>[ 381.753213] [drm:intel_power_well_disable [i915]] disabling power well 2 <7>[ 381.753236] [drm:drm_helper_hpd_irq_event] [CONNECTOR:91:DP-1] status updated from disconnected to disconnected
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3671/shard-glkb6/igt@kms_cursor_crc@cursor-256x256-onscreen.html run.log: running: igt/gem_workarounds/hibernate-resume [26/78] skip: 9, pass: 13, fail: 4 - FATAL: command execution failed ... Completed CI_IGT_test CI_DRM_3671/shard-glkb6/3 : FAILURE CI_IGT_test runtime 546 seconds Rebooting shard-glkb6 from dmesg: <7>[ 200.035409] [IGT] gem_workarounds: starting subtest hibernate-resume ... <3>[ 206.645977] r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 (loop: 100, delay: 1). ... <7>[ 555.768130] [drm:drm_helper_hpd_irq_event] [CONNECTOR:91:DP-1] status updated from disconnected to disconnected <7>[ 556.140234] [IGT] kms_cursor_crc: executing
Marta, is this bug only about the failures with the r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 error? In that case I would suggest closing this as NOTOURBUG. Until the r8169 fix the network adapter bug, we could switch to using another adapter.
(In reply to Imre Deak from comment #76) > Marta, is this bug only about the failures with the > > r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 > > error? In that case I would suggest closing this as NOTOURBUG. Until the > r8169 fix the network adapter bug, we could switch to using another adapter. I went through now the incompletes in the last 20 shard runs on GLK and found the following two issues: 1. On GLKB4 there is most likely a hard disk/filesystem corruption problem causing the hang. From [1]: <6>[ 216.832883] PM: suspend exit <7>[ 216.873312] [drm:drm_mode_addfb2] [FB:101] <7>[ 216.972244] [drm:drm_mode_setcrtc] [CRTC:75:pipe C] <7>[ 216.972310] [drm:drm_mode_setcrtc] [CONNECTOR:77:eDP-1] <3>[ 217.083301] atkbd serio0: Failed to enable keyboard on isa0060/serio0 <6>[ 219.270898] r8169 0000:01:00.0 enp1s0: link up <2>[ 219.275742] EXT4-fs error (device sda2): __ext4_get_inode_loc:4611: inode #13108593: block 52428919: comm dhclient: unable to read itable block <3>[ 219.276555] Buffer I/O error on dev sda2, logical block 0, lost sync page write <3>[ 219.276589] Aborting journal on device sda2-8. <3>[ 219.276688] Buffer I/O error on dev sda2, logical block 58228736, lost sync page write <3>[ 219.276718] JBD2: Error -5 detected when updating journal superblock for sda2-8. <0>[ 219.276735] Kernel panic - not syncing: EXT4-fs (device sda2): panic forced after error <0>[ 219.276735] <4>[ 219.276753] CPU: 2 PID: 1659 Comm: dhclient Tainted: G U 4.15.0-rc8-CI-CI_DRM_3664+ #1 <4>[ 219.276765] Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017 <4>[ 219.276776] Call Trace: <4>[ 219.276798] dump_stack+0x5f/0x86 <4>[ 219.276818] panic+0xcf/0x20d <4>[ 219.276839] ? ext4_handle_error.part.99+0x30/0xa0 <4>[ 219.276870] ext4_handle_error.part.99+0x91/0xa0 <4>[ 219.276885] __ext4_error_inode+0xbc/0x170 <4>[ 219.276930] __ext4_get_inode_loc+0x213/0x500 <4>[ 219.276957] ? ext4_dirty_inode+0x3e/0x60 <4>[ 219.276972] ext4_reserve_inode_write+0x47/0xb0 <4>[ 219.276992] ext4_mark_inode_dirty+0x63/0x270 <4>[ 219.277006] ? rcu_read_lock_sched_held+0x6f/0x80 <4>[ 219.277033] ext4_dirty_inode+0x3e/0x60 <4>[ 219.277050] __mark_inode_dirty+0x67/0x690 <4>[ 219.277074] generic_update_time+0x73/0xc0 <4>[ 219.277089] ? current_time+0x2d/0x60 <4>[ 219.277110] touch_atime+0x88/0xa0 <4>[ 219.277135] ext4_file_mmap+0x4f/0x60 <4>[ 219.277151] mmap_region+0x385/0x5c0 <4>[ 219.277184] do_mmap+0x3ad/0x520 <4>[ 219.277212] vm_mmap_pgoff+0x94/0xd0 <4>[ 219.277247] SyS_mmap_pgoff+0x166/0x200 <4>[ 219.277276] entry_SYSCALL_64_fastpath+0x22/0x8f <4>[ 219.277289] RIP: 0033:0x7fc8d14bcbda <4>[ 219.277300] RSP: 002b:00007ffe6cfa4ac8 EFLAGS: 00000246 All the other suspend failures on GLKB4 end with <6>[ 702.905785] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) <6>[ 702.907826] ata1.00: configured for UDMA/133 <6>[ 702.918738] ata1.00: Enabling discard_zeroes_data which could confirm the same guess. Solution would be to replace the hard disk/fix the filesystem corruption. [1] https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3664/shard-glkb4/pstore6-1516440822_Panic_1.log 2. The following R8169 errors on (at least) GLKB6: r8169 0000:01:00.0 enp1s0: rtl_ocp_gphy_cond == 1 (loop: 10, delay: 25). Solution here would be to use another network adapter. It's possible I'm wrong and the hang is due to some other reason, but I'd like to see first the above 2 issues fixed. Adding Tomi.
I've replaced the SATA/power cable on shard-glkb4. It felt a bit flimsy, so it might have contributed to the seemingly disk issue. I'll change the disk too if disk-related errors pop up again.
(In reply to Imre Deak from comment #76) > Marta, is this bug only about the failures with the > > r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 > > error? In that case I would suggest closing this as NOTOURBUG. Until the > r8169 fix the network adapter bug, we could switch to using another adapter. No, this bug is for generic GLK incompletes that don't have any proven reason for. We have a hypothesis that some of the instances are related to the r8169 suspect logs, but we have not proven that. I.e. this bug will remain open.
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4176/shard-glkb2/igt@kms_plane_scaling@pipe-c-plane-scaling.html run.log doesn't match results. I see no obvious r8169 error prints.
(In reply to Imre Deak from comment #76) > Marta, is this bug only about the failures with the > > r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 > > error? In that case I would suggest closing this as NOTOURBUG. Until the > r8169 fix the network adapter bug, we could switch to using another adapter. Also, Imre when we get real GLK machines, this and other GLKB-shards related bugs, will be archived.
(In reply to Marta Löfstedt from comment #80) > https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4176/shard-glkb2/ > igt@kms_plane_scaling@pipe-c-plane-scaling.html > > run.log doesn't match results. I see no obvious r8169 error prints. In fact this is quite clearly a network error: After igt/gem_softpin/noreloc-s3 there is no response received by the testrunner: [49/80] skip: 17, pass: 29, fail: 3 | running: igt/gem_softpin/noreloc-s3 [49/80] skip: 17, pass: 29, fail: 3 / FATAL: command execution failed java.io.EOFException After which the testrunner will reboot the machine: Finished: FAILURE Completed CI_IGT_test CI_DRM_3683/shard-glkb2/2 : FAILURE CI_IGT_test runtime 551 seconds Rebooting shard-glkb2 While tests are still running on the machine up to kms_plane_scaling, where probably the above rebooting happens. Note that the last dmesg timestamp may not match what you'd expect based on the timeout, since the dmesg time doesn't account for the durations for system suspend/resumes.
(In reply to Marta Löfstedt from comment #79) > (In reply to Imre Deak from comment #76) > > Marta, is this bug only about the failures with the > > > > r8169 0000:01:00.0 enp1s0: rtl_phy_reset_cond == 1 > > > > error? In that case I would suggest closing this as NOTOURBUG. Until the > > r8169 fix the network adapter bug, we could switch to using another adapter. > > No, this bug is for generic GLK incompletes that don't have any proven > reason for. We have a hypothesis that some of the instances are related to > the r8169 suspect logs, but we have not proven that. I.e. this bug will > remain open. (In reply to Marta Löfstedt from comment #81) > > Also, Imre when we get real GLK machines, this and other GLKB-shards related > bugs, will be archived. There is pretty good evidence what is the problem and we could act on them now, instead of just waiting for those new machines to arrive.
This is a Meta bug for incompletes on GLK
(In reply to Imre Deak from comment #82) > (In reply to Marta Löfstedt from comment #80) > > https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4176/shard-glkb2/ > > igt@kms_plane_scaling@pipe-c-plane-scaling.html > > > > run.log doesn't match results. I see no obvious r8169 error prints. > > In fact this is quite clearly a network error: > After igt/gem_softpin/noreloc-s3 there is no response received by the > testrunner: > > [49/80] skip: 17, pass: 29, fail: 3 | > running: igt/gem_softpin/noreloc-s3 > > [49/80] skip: 17, pass: 29, fail: 3 / > FATAL: command execution failed > java.io.EOFException > > After which the testrunner will reboot the machine: > > Finished: FAILURE > Completed CI_IGT_test CI_DRM_3683/shard-glkb2/2 : FAILURE > CI_IGT_test runtime 551 seconds > Rebooting shard-glkb2 > > > While tests are still running on the machine up to kms_plane_scaling, where > probably the above rebooting happens. Note that the last dmesg timestamp may > not match what you'd expect based on the timeout, since the dmesg time > doesn't account for the durations for system suspend/resumes. Could also be some other test causing problems or last test hanging the system, for an example see bug 104218. Since the time Jenkins take to reboot machine after loosing contact is unknown, we can only say that it is a network issue if there are clear indications in dmesg of that in the log.
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4180/shard-glkb2/igt@gem_userptr_blits@coherency-unsync.html last dmesg: <7>[46616.172859] [IGT] gem_userptr_blits: starting subtest coherency-unsync <7>[46616.531767] [IGT] gem_userptr_blits: exiting, ret=77
NOTE the r8169 issue is now handled in bug 104787. This bug is still the Meta bug for unexplained incompletes on GLKb
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4210/shard-glkb6/igt@perf_pmu@busy-double-start-vecs0.html this has the pattern for the r8169 issue see bug 104787. However, there is a pstore which is not expected for the r8169 issue. Unfortunately the pstore on has a short ftrace.
Note I removed impact from some test that has on the most recent hit been due to the r8169 issue, see bug 104787
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3956/shard-glkb1/igt@kms_atomic@plane_primary_legacy.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3967/fi-glk-j4005/igt@gem_exec_suspend@basic-s3.html run.log: running: igt/gem_exec_suspend/basic-s3 [107/285] skip: 11, pass: 95, dmesg-warn: 1 \ FATAL: command execution failed ... Completed CI_IGT_test CI_DRM_3967/fi-glk-j4005/0 : FAILURE CI_IGT_test runtime 417 seconds Rebooting fi-glk-j4005 Last dmesg: <4>[ 237.334279] Setting dangerous option reset - tainting kernel <7>[ 237.339608] [IGT] gem_exec_suspend: starting subtest basic-S3 <6>[ 238.700670] PM: suspend entry (deep)
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4412/shard-glkb4/igt@gem_exec_create@forked.html run.log: running: igt/kms_cursor_crc/cursor-256x256-suspend [34/76] skip: 8, pass: 23, fail: 3 - FATAL: command execution failed ... Completed CI_IGT_test CI_DRM_4026/shard-glkb4/4 : FAILURE CI_IGT_test runtime 543 seconds Rebooting shard-glkb4 Last dmesg: <7>[ 388.837388] [IGT] gem_exec_create: executing <4>[ 388.859978] Setting dangerous option reset - tainting kernel <7>[ 388.865072] [IGT] gem_exec_create: starting subtest forked <7>[ 388.865895] [drm:intel_power_well_enable [i915]] enabling DC off <7>[ 388.865961] [drm:gen9_set_dc_state [i915]] Setting DC state from 01 to 00 <7>[ 388.867055] [drm:intel_power_well_disable [i915]] disabling DC off <7>[ 388.867118] [drm:gen9_enable_dc5 [i915]] Enabling DC5 <7>[ 388.867164] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01 <7>[ 388.895441] [drm:intel_power_well_enable [i915]] enabling DC off <7>[ 388.895537] [drm:gen9_set_dc_state [i915]] Setting DC state from 01 to 00
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_15/fi-glk-1/igt@perf_pmu@enable-race-vcs0.html last dmesg: <7>[ 107.254591] [drm:intel_dp_sink_crc_stop [i915]] TIMEOUT: Sink CRC counter is not zeroed after calculation is stopped <7>[ 107.604288] [drm:intel_dp_sink_crc_stop [i915]] TIMEOUT: Sink CRC counter is not zeroed after calculation is stopped <7>[ 107.621620] [IGT] kms_frontbuffer_tracking: exiting, ret=0 <4>[ 107.622349] Setting dangerous option enable_psr - tainting kernel <4>[ 107.622633] Setting dangerous option enable_fbc - tainting kernel <6>[ 107.653856] Console: switching to colour frame buffer device 240x67 <6>[ 107.795223] Console: switching to colour dummy device 80x25 <7>[ 107.795323] [IGT] perf_pmu: executing
Also seen on BAT: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1268/fi-glk-1/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html Nothing suspicious in dmesg.
Still happening, not sure why it was closed.
Setting to medium since it's a meta-bug of sorts.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/58.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.