Bug 105086 - [BAT] [CNL only] igt@* - incomplete - system hang?
Summary: [BAT] [CNL only] igt@* - incomplete - system hang?
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: low normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 104962 105058
Depends on:
Blocks: 105984
Reported: 2018-02-14 07:24 UTC by Marta Löfstedt
Modified: 2018-11-01 16:58 UTC (History)

See Also:
i915 platform: CNL
i915 features: display/Other, GEM/Other


Description Marta Löfstedt 2018-02-14 07:24:22 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3765/fi-cnl-y3/igt@gem_cs_tlb@basic-default.html

run.log:
[014/288] pass: 14 /
running: igt/gem_cs_tlb/basic-default

[014/288] pass: 14 -                 
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3765/fi-cnl-y3/0 : FAILURE
CI_IGT_test runtime 542 seconds
Rebooting fi-cnl-y3

last dmesg:
<7>[   57.230249] [IGT] gem_cpu_reloc: executing
<7>[   57.241801] [IGT] gem_cpu_reloc: starting subtest basic
<7>[   57.250397] [IGT] gem_cpu_reloc: exiting, ret=0
<7>[   57.353654] [IGT] gem_cs_tlb: executing
Followed by stray.
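
The run.log pattern above (a "running:" line that is never followed by a result for the same test) can be picked out mechanically. What follows is only a minimal sketch in Python: the two line shapes are copied from the run.log excerpts in this report, and the helper name `find_incomplete` is hypothetical, not part of any CI tooling.

```python
import re

# Line shapes taken from the run.log excerpts in this bug; anything beyond
# these two shapes is an assumption about the runner's output format.
RUNNING = re.compile(r"^running: (igt/\S+)")
RESULT = re.compile(r"^(pass|fail|skip|warn): (igt/\S+)")


def find_incomplete(run_log):
    """Return tests that were started but never reported a result."""
    pending = []
    for raw in run_log.splitlines():
        line = raw.strip()
        started = RUNNING.match(line)
        if started:
            pending.append(started.group(1))
            continue
        finished = RESULT.match(line)
        if finished and finished.group(2) in pending:
            pending.remove(finished.group(2))
    return pending


sample = """\
[014/288] pass: 14 /
running: igt/gem_cs_tlb/basic-default

[014/288] pass: 14 -
FATAL: command execution failed
"""
print(find_incomplete(sample))
```

The progress lines such as "[014/288] pass: 14 /" are deliberately ignored, since they start with a bracket and carry no test name.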
Comment 1 Marta Löfstedt 2018-02-19 06:57:49 UTC
I am making this bug the generic incomplete bug for CNL.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3793/fi-cnl-y3/igt@gem_ctx_switch@basic-default-heavy.html
Comment 2 Marta Löfstedt 2018-02-19 06:59:01 UTC
When running igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a on IGT_4245, the machine fi-cnl-y3 hard-hung:

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4245/fi-cnl-y3/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
Comment 3 Marta Löfstedt 2018-02-19 06:59:21 UTC
*** Bug 105058 has been marked as a duplicate of this bug. ***
Comment 4 Marta Löfstedt 2018-02-19 07:00:15 UTC
*** Bug 104962 has been marked as a duplicate of this bug. ***
Comment 5 Marta Löfstedt 2018-02-19 07:00:37 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4223/fi-cnl-y3/igt@kms_flip@basic-plain-flip.html

Last dmesg:
<7>[  388.469628] [drm:verify_single_dpll_state.isra.77 [i915]] DPLL 1
<7>[  388.469699] [drm:intel_atomic_commit_tail [i915]] [CRTC:75:pipe C]
<7>[  388.566095] [IGT] kms_flip: executing

Note bug 104593 is incomplete for fi-cnl-y2
Comment 6 Rodrigo Vivi 2018-02-28 23:17:33 UTC
Could we close this now based on cnl-y3 results?
Comment 7 Marta Löfstedt 2018-03-01 06:47:01 UTC
(In reply to Rodrigo Vivi from comment #6)
> Could we close this now based on cnl-y3 results?

No, the cut-off time for cibuglog bugs is 1 month without reproduction.
Comment 8 Marta Löfstedt 2018-03-12 12:21:42 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3903/fi-cnl-y3/igt@gem_ctx_isolation@vecs0-reset.html

run.log:
running: igt/gem_ctx_isolation/vecs0-reset

[65/98] skip: 29, pass: 34, fail: 2 /     
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3903/fi-cnl-y3/23 : FAILURE
CI_IGT_test runtime 843 seconds
Rebooting fi-cnl-y3

Last dmesg:
<7>[  529.085780] [drm:verify_single_dpll_state.isra.79 [i915]] DPLL 1
<6>[  529.119263] Console: switching to colour frame buffer device 480x135
<6>[  529.269811] Console: switching to colour dummy device 80x25
<7>[  529.269862] [IGT] gem_ctx_isolation: executing
Followed by stray
Comment 9 Marta Löfstedt 2018-03-12 12:22:49 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3903/fi-cnl-drrs/igt@tools_test@tools_test.html

run.log:
running: igt/tools_test/tools_test

[90/98] skip: 38, pass: 52 -      
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3903/fi-cnl-drrs/16 : FAILURE
CI_IGT_test runtime 843 seconds
Rebooting fi-cnl-drrs

Last dmesg:
<6>[  507.249692] Console: switching to colour frame buffer device 240x67
<6>[  507.375176] Console: switching to colour dummy device 80x25
<7>[  507.375224] [IGT] tools_test: executing
Comment 10 Marta Löfstedt 2018-03-14 11:47:24 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4355/fi-cnl-y3/igt@gem_exec_suspend@basic-s3.html

run.log:
running: igt/gem_exec_suspend/basic-s3

[107/288] skip: 11, pass: 96 \        
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_3924/fi-cnl-y3/0 : FAILURE
CI_IGT_test runtime 543 seconds
Rebooting fi-cnl-y3

Last dmesg:
<7>[  220.189897] [IGT] gem_exec_suspend: executing
<4>[  220.193617] Setting dangerous option reset - tainting kernel
<7>[  220.196625] [IGT] gem_exec_suspend: starting subtest basic-S3
Comment 11 Marta Löfstedt 2018-03-15 15:19:03 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-cnl-y3/igt@kms_flip@flip-vs-blocking-wf-vblank.html

run.log:
running: igt/kms_flip/flip-vs-blocking-wf-vblank

[58/98] skip: 21, pass: 34, fail: 3 -           
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_1/fi-cnl-y3/5 : FAILURE
CI_IGT_test runtime 843 seconds
Rebooting fi-cnl-y3

Last dmesg:
<6>[  446.223922] Console: switching to colour dummy device 80x25
<7>[  446.223977] [IGT] kms_flip: executing
Comment 12 Marta Löfstedt 2018-03-15 15:26:50 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-cnl-y3/igt@kms_frontbuffer_tracking@fbcdrrs-1p-primscrn-spr-indfb-move.html

run.log:

running: igt/kms_frontbuffer_tracking/fbcdrrs-1p-primscrn-spr-indfb-move

[59/97] skip: 22, pass: 31, fail: 6 \                                   
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_1/fi-cnl-y3/33 : FAILURE
CI_IGT_test runtime 843 seconds
Rebooting fi-cnl-y3

Last dmesg:
<6>[  347.235581] Console: switching to colour dummy device 80x25
<7>[  347.235634] [IGT] kms_frontbuffer_tracking: executing
Comment 13 Marta Löfstedt 2018-03-19 07:55:32 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_2/fi-cnl-y3/igt@pm_rpm@gem-execbuf-stress-extra-wait.html

this run.log looks strange:

[81/97] skip: 21, pass: 54, fail: 6 |
running: igt/pm_rpm/modeset-lpsp     

[81/97] skip: 21, pass: 54, fail: 6 /
Build timed out (after 18 minutes). Marking the build as aborted.
Set build name.
New build name is 'drmtip_2/fi-cnl-y3/9'
pass: igt/pm_rpm/modeset-lpsp        

[82/97] skip: 21, pass: 55, fail: 6 /
running: igt/kms_ccs/pipe-a-bad-rotation-90

[82/97] skip: 21, pass: 55, fail: 6 -      
SSH: Connecting from host [cnl-y3]
SSH: Connecting with configuration [archive] ...
pass: igt/kms_ccs/pipe-a-bad-rotation-90

[83/97] skip: 21, pass: 56, fail: 6 -
running: igt/pm_rpm/gem-execbuf-stress-extra-wait

SSH: Disconnecting configuration [archive] ...
SSH: Transferred 90 file(s)
Build was aborted
Notifying upstream projects of job completion
Finished: ABORTED
Completed CI_IGT_test drmtip_2/fi-cnl-y3/9 : ABORTED
CI_IGT_test runtime 1084 seconds
Rebooting fi-cnl-y3

So I would blame this on a CI system and/or network issue.
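
To separate this kind of CI abort from the "FATAL: command execution failed" incompletes elsewhere in this bug, the trailer of run.log can be checked. This is a rough sketch: the marker strings are copied from the logs quoted in this report, while the function name and the cause labels ("ci-abort", "command-failed", "unknown") are made up for illustration.

```python
import re

# "Completed CI_IGT_test <job> : <STATUS>" as it appears in the quoted logs.
COMPLETED = re.compile(r"^Completed CI_IGT_test (\S+) : (\w+)", re.MULTILINE)


def classify_run(run_log):
    """Return (job, status, cause) or None if no completion line is found.

    The cause labels are hypothetical and only summarize which marker
    string was seen in the log.
    """
    match = COMPLETED.search(run_log)
    if match is None:
        return None
    job, status = match.groups()
    if status == "ABORTED" or "Build timed out" in run_log:
        cause = "ci-abort"
    elif "FATAL: command execution failed" in run_log:
        cause = "command-failed"
    else:
        cause = "unknown"
    return job, status, cause


aborted = """\
Build timed out (after 18 minutes). Marking the build as aborted.
Finished: ABORTED
Completed CI_IGT_test drmtip_2/fi-cnl-y3/9 : ABORTED
"""
print(classify_run(aborted))
```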
Comment 14 Marta Löfstedt 2018-03-20 07:24:46 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_3/fi-cnl-y3/igt@kms_atomic_transition@plane-all-modeset-transition-fencing.html

run.log:
[50/97] skip: 17, pass: 29, fail: 4 /
running: igt/kms_atomic_transition/plane-all-modeset-transition-fencing
[50/97] skip: 17, pass: 29, fail: 4 -                                  
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_3/fi-cnl-y3/19 : FAILURE
CI_IGT_test runtime 540 seconds
Rebooting fi-cnl-y3

Last dmesg:
<7>[  355.495931] [IGT] kms_atomic_transition: starting subtest plane-all-modeset-transition-fencing
...
<7>[  365.411238] [drm:audio_config_hdmi_pixel_clock [i915]] HDMI audio pixel clock setting for 297000 not found, falling back to defaults
<7>[  365.411284] [drm:audio_config_hdmi_pixel_clock [i915]] Configuring HDMI audio for pixel clock 25200 (0x00010000)
<7>[  365.411329] [drm:hsw_audio_config_update [i915]] using automatic N
Comment 15 Marta Löfstedt 2018-03-20 07:30:27 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_4/fi-cnl-y3/igt@kms_atomic_transition@plane-all-modeset-transition.html

run.log:
running: igt/kms_atomic_transition/plane-all-modeset-transition

[29/97] skip: 11, pass: 16, fail: 2 /                          
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_4/fi-cnl-y3/7 : FAILURE
CI_IGT_test runtime 556 seconds
Rebooting fi-cnl-y3

dmesg:
<7>[  136.865074] [IGT] kms_atomic_transition: starting subtest plane-all-modeset-transition
...
<7>[  307.183896] [drm:intel_enable_shared_dpll [i915]] enable DPLL 0 (active 4, on? 0) for crtc 99
<7>[  307.184048] [drm:intel_enable_shared_dpll [i915]] enabling DPLL 0
<7>[  307.184538] [drm:intel_dp_dual_mode_set_tmds_output [i915]] Enabling DP dual mode adaptor TMDS output

pstore is missing the header of the backtrace; however, what is there indicates some CPU scheduling issue:

<4>[  326.290574] Call Trace:
<4>[  326.290584]  ? do_idle+0x188/0x1d0
<4>[  326.290591]  ? cpu_startup_entry+0x14/0x20
<4>[  326.290597]  ? start_secondary+0x129/0x160
<4>[  326.290603]  ? secondary_startup_64+0xa5/0xb0
...
<1>[  326.290711] RIP: acpi_idle_enter+0x103/0x290 RSP: ffff9965800ffeb0
<4>[  326.290717] ---[ end trace 5cba7989ef49428d ]---
<0>[  326.588085] Kernel panic - not syncing: Attempted to kill the idle task!
<4>[  326.588097] ------------[ cut here ]------------
<4>[  326.588100] sched: Unexpected reschedule of offline CPU#3!
Comment 16 Marta Löfstedt 2018-03-26 13:52:43 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_7/fi-cnl-y3/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

run.log:
running: igt/kms_plane/plane-panning-bottom-right-suspend-pipe-a-planes

[14/98] skip: 4, pass: 10 -                                            
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_7/fi-cnl-y3/28 : FAILURE
CI_IGT_test runtime 182 seconds
Rebooting fi-cnl-y3


Last dmesg:
<6>[   77.940997] PM: suspend exit
<7>[   78.001174] [drm:drm_mode_addfb2] [FB:158]
<7>[   78.093891] [drm:drm_mode_setcrtc] [CRTC:51:pipe A]
<7>[   78.093934] [drm:drm_mode_setcrtc] [CONNECTOR:101:eDP-1]
Comment 17 Marta Löfstedt 2018-03-26 13:59:23 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_7/fi-cnl-y3/igt@kms_frontbuffer_tracking@fbcdrrs-1p-primscrn-cur-indfb-draw-mmap-cpu.html

run.log:
running: igt/kms_frontbuffer_tracking/fbcdrrs-1p-primscrn-cur-indfb-draw-mmap-cpu

[84/99] skip: 21, pass: 56, fail: 7 |                                            
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_7/fi-cnl-y3/9 : FAILURE
CI_IGT_test runtime 844 seconds
Rebooting fi-cnl-y3

Last dmesg:
<6>[  371.069493] Console: switching to colour frame buffer device 480x135
<6>[  371.232537] Console: switching to colour dummy device 80x25
<7>[  371.232592] [IGT] kms_frontbuffer_tracking: executing
Comment 18 Marta Löfstedt 2018-04-04 06:42:04 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_11/fi-cnl-psr/igt@kms_frontbuffer_tracking@fbcpsr-farfromfence.html

run.log:
running: igt/kms_frontbuffer_tracking/fbcpsr-farfromfence

[18/99] skip: 6, pass: 12 -                              
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_11/fi-cnl-psr/4 : FAILURE
CI_IGT_test runtime 633 seconds
Rebooting fi-cnl-psr

dmesg:
<7>[  105.976631] [IGT] kms_frontbuffer_tracking: starting subtest fbcpsr-farfromfence
...
<7>[  113.585117] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 02
<6>[  114.203744] e1000e: enp0s31f6 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
<6>[  114.203755] e1000e 0000:00:1f.6 enp0s31f6: 10/100 speed: disabling TSO

pstore:
<0>[  505.938269]   <idle>-0       3..s1 65836456us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1
...
<0>[  505.946903] kworker/-1165    3.... 110793375us : i915_request_retire: bcs0 fence 28d:2, global_seqno 2, current 2

then just a backtrace of a sysrq
Comment 19 Marta Löfstedt 2018-04-04 06:43:30 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_8/fi-cnl-y3/igt@kms_flip@single-buffer-flip-vs-dpms-off-vs-modeset.html

run.log:
running: igt/kms_flip/single-buffer-flip-vs-dpms-off-vs-modeset

[53/98] skip: 20, pass: 32, fail: 1 /    
...
Completed CI_IGT_test drmtip_8/fi-cnl-y3/28 : FAILURE
CI_IGT_test runtime 843 seconds
Rebooting fi-cnl-y3

dmesg:
<6>[  459.685160] Console: switching to colour frame buffer device 480x135
<6>[  459.841283] Console: switching to colour dummy device 80x25
<7>[  459.841345] [IGT] kms_flip: executing
Comment 20 Marta Löfstedt 2018-04-04 06:44:37 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_8/fi-cnl-y3/igt@kms_cursor_crc@cursor-128x128-offscreen.html

run.log:
running: igt/kms_cursor_crc/cursor-128x128-offscreen

[25/98] skip: 4, pass: 19, fail: 2 /                
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_8/fi-cnl-y3/30 : FAILURE
CI_IGT_test runtime 543 seconds
Rebooting fi-cnl-y3

dmesg:
<6>[  277.259602] Console: switching to colour frame buffer device 480x135
<6>[  277.419525] Console: switching to colour dummy device 80x25
<7>[  277.419588] [IGT] kms_cursor_crc: executing
Comment 21 Marta Löfstedt 2018-04-04 06:47:05 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_11/fi-cnl-y3/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-cur-indfb-draw-mmap-cpu.html

run.log:
running: igt/kms_frontbuffer_tracking/fbcpsr-2p-primscrn-cur-indfb-draw-mmap-cpu

[33/98] skip: 8, pass: 23, warn: 1, fail: 1 /                                   
FATAL: command execution failed
...
Completed CI_IGT_test drmtip_11/fi-cnl-y3/28 : FAILURE
CI_IGT_test runtime 543 seconds
Rebooting fi-cnl-y3

dmesg:
<7>[  185.095805] [drm:drm_setup_crtcs] desired mode 3840x2160 set on crtc 75 (0,0)
<6>[  185.229077] Console: switching to colour dummy device 80x25
<7>[  185.229138] [IGT] kms_frontbuffer_tracking: executing

Note, this is the third unexplained incomplete on fi-cnl-y3 where it appears to just hang even before the subtest started.
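
Whether a machine stopped before or after the subtest started can be read from the last "[IGT]" line in dmesg. The sketch below assumes only the three message shapes that appear in the dmesg excerpts in this bug ("executing", "starting subtest ...", "exiting, ret=..."); the function name and the state labels are hypothetical.

```python
import re

# Matches the three [IGT] message shapes seen in the dmesg excerpts here.
IGT_LINE = re.compile(
    r"\[IGT\] (\S+): (executing|starting subtest (\S+)|exiting, ret=(-?\d+))"
)


def last_igt_state(dmesg):
    """Return (binary, state, detail) for the last [IGT] line, or None."""
    state = None
    for line in dmesg.splitlines():
        match = IGT_LINE.search(line)
        if match is None:
            continue
        binary, what = match.group(1), match.group(2)
        if what == "executing":
            state = (binary, "executing", None)
        elif what.startswith("starting subtest"):
            state = (binary, "in-subtest", match.group(3))
        else:
            state = (binary, "exited", match.group(4))
    return state


tail = """\
<7>[  185.095805] [drm:drm_setup_crtcs] desired mode 3840x2160 set on crtc 75 (0,0)
<6>[  185.229077] Console: switching to colour dummy device 80x25
<7>[  185.229138] [IGT] kms_frontbuffer_tracking: executing
"""
print(last_igt_state(tail))
```

A result with state "executing" corresponds to the case described above, where the log ends before any "starting subtest" line.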
Comment 23 Marta Löfstedt 2018-04-09 07:36:16 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4034/fi-cnl-y3/igt@gem_ctx_param@basic-default.html

run.log:
running: igt/gem_ctx_param/basic-default

[019/285] pass: 19 \                    
FATAL: command execution failed
...
Completed CI_IGT_test CI_DRM_4034/fi-cnl-y3/0 : FAILURE
CI_IGT_test runtime 543 seconds
Rebooting fi-cnl-y3


last dmesg:
<6>[   44.554558] Console: switching to colour frame buffer device 480x135
<6>[   44.672769] Console: switching to colour dummy device 80x25
<7>[   44.672854] [IGT] gem_ctx_param: executing
Comment 26 Marta Löfstedt 2018-04-09 13:25:13 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_16/fi-cnl-y3/igt@gem_linear_blits@normal.html

<7>[   31.392489] [drm:intel_edp_drrs_downclock_work [i915]] eDP Refresh Rate set to : 48Hz
<2>[   62.308422] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
<2>[   62.308423] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
<2>[   62.308579] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
<2>[   62.308581] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
<2>[   62.308582] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
<2>[   62.308585] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
<6>[   62.309454] CPU0: Core temperature/speed normal
<6>[   62.309456] CPU2: Core temperature/speed normal
<6>[   62.309458] CPU1: Package temperature/speed normal
<6>[   62.309459] CPU3: Package temperature/speed normal
<6>[   62.309460] CPU2: Package temperature/speed normal
<6>[   62.309461] CPU0: Package temperature/speed normal
Comment 27 Marta Löfstedt 2018-04-09 13:27:40 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_16/fi-cnl-y3/igt@gem_mocs_settings@mocs-settings-ctx-render.html

<2>[  541.966099] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
<2>[  541.966101] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
<2>[  541.966266] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
<2>[  541.966267] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
<2>[  541.966268] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
<2>[  541.966271] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
<6>[  541.967032] CPU1: Core temperature/speed normal
<6>[  541.967035] CPU0: Package temperature/speed normal
<6>[  541.967035] CPU3: Core temperature/speed normal
<6>[  541.967037] CPU2: Package temperature/speed normal
<6>[  541.967039] CPU1: Package temperature/speed normal
<6>[  541.967040] CPU3: Package temperature/speed normal
<7>[  548.471019] [IGT] gem_exec_nop: exiting, ret=0
<6>[  548.525057] Console: switching to colour frame buffer device 480x135
<6>[  548.773280] Console: switching to colour dummy device 80x25
<7>[  548.773370] [IGT] gem_mocs_settings: executing
Comment 28 Marta Löfstedt 2018-04-10 05:57:01 UTC
Here are some new ones with:
<2>[  243.181515] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)

close in time to the incomplete:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_16/fi-cnl-y3/igt@syncobj_wait@wait-all-for-submit-snapshot.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_16/fi-cnl-y3/igt@gem_exec_await@wide-all.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_16/fi-cnl-y3/igt@kms_cursor_legacy@pipe-a-single-bo.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_15/fi-cnl-y3/igt@kms_cursor_legacy@cursor-vs-flip-atomic-transitions-varying-size.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_15/fi-cnl-y3/igt@prime_busy@wait-after-render.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_15/fi-cnl-y3/igt@gem_ctx_isolation@bcs0-dirty-switch.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_15/fi-cnl-y3/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-shrfb-draw-mmap-wc.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_14/fi-cnl-y3/igt@kms_addfb_basic@unused-handle.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_14/fi-cnl-y3/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_14/fi-cnl-y3/igt@perf_pmu@rc6-runtime-pm-long.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_13/fi-cnl-y3/igt@kms_chv_cursor_fail@pipe-b-256x256-right-edge.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_13/fi-cnl-y3/igt@kms_universal_plane@universal-plane-gen9-features-pipe-c.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_13/fi-cnl-y3/igt@kms_psr_sink_crc@primary_mmap_gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_13/fi-cnl-y3/igt@kms_chv_cursor_fail@pipe-c-64x64-left-edge.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_13/fi-cnl-y3/igt@pm_lpsp@edp-native.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_13/fi-cnl-y3/igt@kms_ccs@pipe-b-bad-pixel-format.html
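
"Close in time to the incomplete" can be checked against a saved dmesg by comparing the throttle timestamps with the last timestamp in the log. This is a sketch only: the 60-second default window is an arbitrary assumption, and the helper name is hypothetical. The message text and the timestamp layout are taken from the dmesg lines quoted in this bug.

```python
import re

# "<level>[  seconds.micros]" prefix as it appears in the quoted dmesg lines.
STAMP = re.compile(r"^<\d+>\[\s*(\d+\.\d+)\]")
THROTTLE_TEXT = "temperature above threshold"


def throttle_near_end(dmesg, window_s=60.0):
    """Return timestamps of throttle events within window_s of the log end.

    window_s is an assumed threshold, not a value taken from the CI setup.
    """
    last_stamp = None
    throttles = []
    for line in dmesg.splitlines():
        match = STAMP.match(line)
        if match is None:
            continue
        last_stamp = float(match.group(1))
        if THROTTLE_TEXT in line:
            throttles.append(last_stamp)
    if last_stamp is None:
        return []
    return [t for t in throttles if last_stamp - t <= window_s]


excerpt = """\
<2>[  541.966099] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
<6>[  541.967032] CPU1: Core temperature/speed normal
<7>[  548.773370] [IGT] gem_mocs_settings: executing
"""
print(throttle_near_end(excerpt))
```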
Comment 29 Marta Löfstedt 2018-04-10 06:12:21 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_15/fi-cnl-y3/igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels.html

this has a partial pstore:
<4>[  179.273994] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  179.273996] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  179.273998] PKRU: 55555554
<4>[  179.273999] Call Trace:
<4>[  179.274002]  <IRQ>
<4>[  179.274007]  enqueue_task_fair+0x5f/0x810
<4>[  179.274011]  ttwu_do_activate+0x49/0xa0
<4>[  179.274014]  try_to_wake_up+0x228/0x660
<4>[  179.274019]  ? clock_was_set_work+0x20/0x20
<4>[  179.274021]  hrtimer_wakeup+0x19/0x20
<4>[  179.274024]  __hrtimer_run_queues+0x11e/0x580
<4>[  179.274028]  hrtimer_interrupt+0xea/0x250
<4>[  179.274033]  smp_apic_timer_interrupt+0x7b/0x2d0
<4>[  179.274036]  apic_timer_interrupt+0xf/0x20
<4>[  179.274038]  </IRQ>
<4>[  179.274041] RIP: 0010:cpuidle_enter_state+0xad/0x370
<4>[  179.274043] RSP: 0018:ffffffffb2203e80 EFLAGS: 00000216 ORIG_RAX: ffffffffffffff12
<4>[  179.274046] RAX: ffffffffb2216500 RBX: 000000000028850f RCX: 0000000000000000
<4>[  179.274048] RDX: 0000000000000046 RSI: ffffffffb20eb229 RDI: ffffffffb2098fcf
<4>[  179.274049] RBP: ffff94aca39153e8 R08: 000000000000097f R09: 0000000000000000
<4>[  179.274051] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
<4>[  179.274053] R13: ffffffffb2298718 R14: 0000000000000000 R15: 00000029bd029e15
<4>[  179.274060]  do_idle+0x183/0x1d0
<4>[  179.274063]  cpu_startup_entry+0x6a/0x70
<4>[  179.274068]  start_kernel+0x447/0x467
<4>[  179.274073]  secondary_startup_64+0xa5/0xb0
<4>[  179.274078] Code: be 3b 08 00 00 48 c7 c7 d8 e9 05 b2 c6 05 3a 0a 24 01 01 e8 ae c6 01 00 48 8b 0c 24 48 85 c9 0f 84 3b f9 ff ff 8b 05 9b 8a 1c 01 <4c> 8b 01 85 c0 0f 85 e6 02 00 00 41 83 be 60 0a 00 00 01 0f 86 
<1>[  179.274125] RIP: enqueue_entity+0x795/0xfc0 RSP: ffff94acaf803dd8
<4>[  179.274128] ---[ end trace 7b0505848e011930 ]---
Comment 30 Marta Löfstedt 2018-04-10 06:17:44 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_14/fi-cnl-y3/igt@kms_vblank@pipe-c-query-forked-hang.html

This could have gone on for a while, so it is hard to say what actually started it:

pstore:
<4>[  191.570549]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[  191.570557]  ? rcu_sync_lockdep_assert+0x25/0x50
<4>[  191.570564]  ? __sb_start_write+0xd9/0x1f0
<4>[  191.570571]  ? __sb_start_write+0xf3/0x1f0
<4>[  191.570579]  vfs_write+0xbd/0x1b0
<4>[  191.570586]  SyS_write+0x50/0xc0
<4>[  191.570592]  ? do_syscall_64+0x19/0x1a0
<4>[  191.570600]  do_syscall_64+0x65/0x1a0
<4>[  191.570607]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
<4>[  191.570614] RIP: 0033:0x7f470179f154
<4>[  191.570620] RSP: 002b:00007ffe5b5bf928 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[  191.570631] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f470179f154
<4>[  191.570641] RDX: 0000000000000004 RSI: 0000565529475160 RDI: 0000000000000006
<4>[  191.570650] RBP: 0000565529475160 R08: 00007f4702517980 R09: 0000000000000000
<4>[  191.570660] R10: 0000000000000000 R11: 0000000000000246 R12: 000056552945e3f0
<4>[  191.570669] R13: 0000000000000004 R14: 00007f4701a772a0 R15: 00007f4701a76760
<4>[  191.570682] Code: 31 e4 f7 c5 00 02 00 00 c6 43 48 00 0f 85 79 ff ff ff 55 9d e8 47 05 f9 ff 4c 89 e0 5b 5d 41 5c 41 5e c3 48 8b 43 08 f0 ff 40 08 <0f> 0b 4d 85 e4 0f 84 42 ff ff ff 41 8b 44 24 18 85 c0 0f 85 38 
<4>[  191.570758] WARNING: CPU: 3 PID: 1370 at kernel/trace/ring_buffer.c:3516 rb_get_reader_page+0x1eb/0x230
<4>[  191.570770] ---[ end trace 4592c359c8977ce4 ]---
<4>[  191.570780] WARNING: CPU: 3 PID: 1370 at kernel/trace/ring_buffer.c:3516 rb_get_reader_page+0x1eb/0x230
<4>[  191.570792] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e mei_me mei prime_numbers
<4>[  191.570838] CPU: 3 PID: 1370 Comm: kms_vblank Tainted: G     U  W        4.16.0-rc7-ge023242a3eba-drmtip_14+ #1
<4>[  191.570851] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X122.B01.1801151045 01/15/2018
<4>[  191.570868] RIP: 0010:rb_get_reader_page+0x1eb/0x230
<4>[  191.570875] RSP: 0018:ffffbad080b67a48 EFLAGS: 00010006
<4>[  191.570883] RAX: ffff9941a681c358 RBX: ffff9941a6259548 RCX: 0000000000000ff0
<4>[  191.570893] RDX: 0000000000000ff0 RSI: ffffffff906c0c78 RDI: ffffffff8e14e581
<4>[  191.570902] RBP: 0000000000000086 R08: ffffffff8e14f48b R09: 0000000000000000
<4>[  191.570912] R10: ffffbad080b67a00 R11: ffff994196c0c040 R12: ffff9941a5fc84b8
<4>[  191.570921] R13: 0000000000000003 R14: 0000000000000003 R15: ffff9941a681c358
<4>[  191.570931] FS:  00007f4702517980(0000) GS:ffff9941af980000(0000) knlGS:0000000000000000
<4>[  191.570941] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  191.570949] CR2: 00000000ffffffff CR3: 000000025afb2002 CR4: 0000000000760ee0
<4>[  191.570959] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  191.570968] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  191.570977] PKRU: 55555554
<4>[  191.570982] Call Trace:
<4>[  191.570989]  rb_advance_reader+0x9/0xd0
<4>[  191.570996]  ring_buffer_consume+0xd8/0x170
<4>[  191.571005]  ftrace_dump+0x16a/0x260
<4>[  191.571013]  trace_die_handler+0x1b/0x30
<4>[  191.571019]  notifier_call_chain+0x2f/0x90
<4>[  191.571027]  __atomic_notifier_call_chain+0x71/0x110
<4>[  191.571036]  notify_die+0x5a/0xa0
<4>[  191.571043]  __die+0x8f/0xd0
<4>[  191.571050]  die+0x25/0x40
<4>[  191.571056]  general_protection+0x25/0x50
<4>[  191.571093] RIP: 0010:i915_request_retire+0x28f/0x980 [i915]
<4>[  191.571101] RSP: 0018:ffffbad080b67cd0 EFLAGS: 00010286
<4>[  191.571109] RAX: 0000000080000000 RBX: ffff99416a9d0040 RCX: 0000000000000001
<4>[  191.571119] RDX: 0000000080000001 RSI: 00000000df20c708 RDI: 00000000ffffffff
<4>[  191.571128] RBP: ffffbad080b67d08 R08: ffff994196c0c948 R09: 0000000082a2cc13
<4>[  191.571137] R10: 0000000000000000 R11: 0000000000000001 R12: ffff994195d72158
<4>[  191.571147] R13: ffff99416a9d02a8 R14: ffff99416a9d02a0 R15: ffff994188f0bb48
<4>[  191.571192]  i915_retire_requests+0x180/0x210 [i915]
<4>[  191.571232]  i915_gem_wait_for_idle+0x73/0x170 [i915]
<4>[  191.571269]  i915_drop_caches_set+0x17b/0x190 [i915]
<4>[  191.571279]  simple_attr_write+0xab/0xc0
<4>[  191.571287]  full_proxy_write+0x4b/0x70
<4>[  191.571295]  __vfs_write+0x2e/0x150
<4>[  191.571302]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[  191.571310]  ? rcu_sync_lockdep_assert+0x25/0x50
<4>[  191.571318]  ? __sb_start_write+0xd9/0x1f0
<4>[  191.571324]  ? __sb_start_write+0xf3/0x1f0
<4>[  191.571332]  vfs_write+0xbd/0x1b0
<4>[  191.571340]  SyS_write+0x50/0xc0
<4>[  191.571346]  ? do_syscall_64+0x19/0x1a0
<4>[  191.571353]  do_syscall_64+0x65/0x1a0
<4>[  191.571360]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
<4>[  191.571368] RIP: 0033:0x7f470179f154
<4>[  191.571374] RSP: 002b:00007ffe5b5bf928 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[  191.571385] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f470179f154
<4>[  191.571394] RDX: 0000000000000004 RSI: 0000565529475160 RDI: 0000000000000006
<4>[  191.571404] RBP: 0000565529475160 R08: 00007f4702517980 R09: 0000000000000000
<4>[  191.571413] R10: 0000000000000000 R11: 0000000000000246 R12: 000056552945e3f0
<4>[  191.571422] R13: 0000000000000004 R14: 00007f4701a772a0 R15: 00007f4701a76760
<4>[  191.571435] Code: 31 e4 f7 c5 00 02 00 00 c6 43 48 00 0f 85 79 ff ff ff 55 9d e8 47 05 f9 ff 4c 89 e0 5b 5d 41 5c 41 5e c3 48 8b 43 08 f0 ff 40 08 <0f> 0b 4d 85 e4 0f 84 42 ff ff ff 41 8b 44 24 18 85 c0 0f 85 38 
<4>[  191.571511] WARNING: CPU: 3 PID: 1370 at kernel/trace/ring_buffer.c:3516 rb_get_reader_page+0x1eb/0x230
<4>[  191.571523] ---[ end trace 4592c359c8977ce5 ]---
Comment 31 Marta Löfstedt 2018-04-11 05:56:07 UTC
A BAT one with temperature warnings in dmesg:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4041/fi-cnl-y3/igt@gem_exec_flush@basic-uc-pro-default.html
Comment 32 Marta Löfstedt 2018-04-11 07:57:48 UTC
I will pull out the proven temperature-related fi-cnl-y3 incompletes from this bug to bug 105985. Basically all fi-cnl-y3 links in this bug should be disregarded.
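
Disregarding the fi-cnl-y3 links amounts to splitting each CI result URL into run, machine and test and filtering on the machine name. The following is a minimal sketch that assumes the path layout seen in the links in this bug (tree/drm-tip/<run>/<machine>/igt@<test>.html); the function name `parse_ci_url` is hypothetical.

```python
from urllib.parse import urlparse


def parse_ci_url(url):
    """Split a CI result link into run, machine and test.

    Assumes the five-part path layout used by the links in this bug and
    returns None for anything else.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 5 or parts[0] != "tree":
        return None
    page = parts[4]
    test = page[: -len(".html")] if page.endswith(".html") else page
    return {"run": parts[2], "machine": parts[3], "test": test}


links = [
    "https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3903/fi-cnl-y3/igt@gem_ctx_isolation@vecs0-reset.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3903/fi-cnl-drrs/igt@tools_test@tools_test.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_11/fi-cnl-psr/igt@kms_frontbuffer_tracking@fbcpsr-farfromfence.html",
]
parsed = [p for p in map(parse_ci_url, links) if p is not None]
remaining = [p for p in parsed if p["machine"] != "fi-cnl-y3"]
for entry in remaining:
    print(entry["machine"], entry["test"])
```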
Comment 36 Martin Peres 2018-11-01 16:58:27 UTC
These machines are not in CI anymore. Closing!

