Bug 103713

Summary: [CI][BAT] igt@* - incomplete - Softdog | 18 minutes timeout - [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:37:pipe [A|B]] flip_done timed out
Product: DRI Reporter: Marta Löfstedt <marta.lofstedt>
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: elizabethx.de.la.torre.mena, intel-gfx-bugs, james.ausmus, thsnielsen
Version: DRI git   
Hardware: Other   
OS: All   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=104573
https://bugs.freedesktop.org/show_bug.cgi?id=106127
Whiteboard: ReadyForDev
i915 platform: SNB i915 features: display/Other

Description Marta Löfstedt 2017-11-13 11:33:06 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3336/fi-snb-2520m/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

runlog: 
[245/289] skip: 33, pass: 212 /                        
owatch: TIMEOUT!
owatch: timeout for /dev/watchdog0 set to 10 (requested 10)
FATAL: command execution failed
Comment 1 Marta Löfstedt 2017-11-14 08:39:13 UTC
Also, here:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7104/fi-snb-2520m/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

last dmesg:
<6>[  513.335237] ata1.00: configured for UDMA/100
<6>[  514.172258] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
<3>[  522.730353] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:46:pipe B] flip_done timed out

run.log:
[245/289] skip: 33, pass: 212 /                        
owatch: TIMEOUT!
owatch: timeout for /dev/watchdog0 set to 10 (requested 10)
FATAL: command execution failed
java.io.EOFException
Comment 2 Elizabeth 2017-11-14 19:54:35 UTC
Rising priority since it is BAT.
Comment 3 Marta Löfstedt 2017-12-14 07:37:36 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3511/fi-snb-2520m/igt@debugfs_test@read_all_entries.html

run.log:
running: igt/debugfs_test/read_all_entries

[002/288] pass: 2 -                       
owatch: TIMEOUT!
owatch: timeout for /dev/watchdog0 set to 10 (requested 10)
FATAL: command execution failed
java.io.EOFException

from dmesg:
<7>[   51.899994] [IGT] debugfs_test: starting subtest read_all_entries
<7>[   51.900302] [drm:drm_mode_addfb2] [FB:75]
<7>[   52.975295] [IGT] debugfs_test: exiting, ret=0
<3>[   63.455267] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:37:pipe A] flip_done timed out
Comment 4 Marta Löfstedt 2017-12-14 07:41:44 UTC
Also, here again on igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3510/fi-snb-2520m/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

running: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-b

[244/288] skip: 33, pass: 211 |                        
owatch: TIMEOUT!
owatch: timeout for /dev/watchdog0 set to 10 (requested 10)
FATAL: command execution failed
java.io.EOFException

from dmesg:
<4>[  532.122277] vblank wait timed out on crtc 0
<4>[  532.122323] WARNING: CPU: 1 PID: 4056 at drivers/gpu/drm/drm_vblank.c:1073 drm_wait_one_vblank+0x179/0x180
<4>[  532.122328] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp snd_hda_intel snd_hda_codec snd_hwdep coretemp snd_hda_core mei_me crct10dif_pclmul crc32_pclmul snd_pcm e1000e mei lpc_ich ghash_clmulni_intel ptp pps_core prime_numbers
<4>[  532.122412] CPU: 1 PID: 4056 Comm: kms_pipe_crc_ba Tainted: G     U           4.15.0-rc3-CI-CI_DRM_3510+ #1
<4>[  532.122416] Hardware name: LENOVO 42962WU/42962WU, BIOS 8DET56WW (1.26 ) 12/01/2011
<4>[  532.122424] RIP: 0010:drm_wait_one_vblank+0x179/0x180
<4>[  532.122429] RSP: 0018:ffffc900010ef9b0 EFLAGS: 00010286
<4>[  532.122437] RAX: 000000000000001f RBX: ffff88010dd40000 RCX: 0000000000000006
<4>[  532.122441] RDX: 000000000000159f RSI: ffffffff81d05b21 RDI: ffffffff81cb6de6
<4>[  532.122445] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
<4>[  532.122449] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[  532.122453] R13: 0000000000000000 R14: 00000000000068a3 R15: ffff8801063d2ea0
<4>[  532.122459] FS:  00007fdc6fb83a40(0000) GS:ffff88011e240000(0000) knlGS:0000000000000000
<4>[  532.122463] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  532.122467] CR2: 00007f56a785f000 CR3: 00000001175fb004 CR4: 00000000000606e0
<4>[  532.122471] Call Trace:
<4>[  532.122493]  ? wait_woken+0x90/0x90
<4>[  532.122655]  ironlake_crtc_enable+0x7a3/0xc10 [i915]
<4>[  532.122808]  intel_update_crtc+0x29/0x80 [i915]
<4>[  532.122935]  intel_update_crtcs+0x52/0x70 [i915]
<4>[  532.123055]  intel_atomic_commit_tail+0x2f7/0xce0 [i915]
<4>[  532.123166]  ? __i915_sw_fence_complete+0x175/0x230 [i915]
<4>[  532.123278]  intel_atomic_commit+0x22a/0x2e0 [i915]
<4>[  532.123305]  restore_fbdev_mode_atomic+0x199/0x1f0
<4>[  532.123361]  drm_fb_helper_restore_fbdev_mode_unlocked+0x42/0x90
<4>[  532.123374]  drm_fb_helper_set_par+0x24/0x50
<4>[  532.123464]  intel_fbdev_set_par+0x11/0x60 [i915]
<4>[  532.123476]  fbcon_init+0x597/0x620
<4>[  532.123505]  visual_init+0xca/0x120
<4>[  532.123517]  do_bind_con_driver+0x1bf/0x3a0
<4>[  532.123546]  do_unbind_con_driver+0x1a1/0x210
<4>[  532.123578]  store_bind+0xb8/0x1b0
<4>[  532.123601]  kernfs_fop_write+0x10b/0x1a0
<4>[  532.123622]  __vfs_write+0x1e/0x130
<4>[  532.123633]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[  532.123639]  ? rcu_sync_lockdep_assert+0x25/0x50
<4>[  532.123647]  ? __sb_start_write+0xd5/0x1f0
<4>[  532.123653]  ? __sb_start_write+0xef/0x1f0
<4>[  532.123673]  vfs_write+0xc0/0x1b0
<4>[  532.123690]  SyS_write+0x40/0xa0
<4>[  532.123709]  entry_SYSCALL_64_fastpath+0x1c/0x89
<4>[  532.123715] RIP: 0033:0x7fdc6e0594a0
<4>[  532.123719] RSP: 002b:00007fff97d75818 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[  532.123727] RAX: ffffffffffffffda RBX: 000056403f43c060 RCX: 00007fdc6e0594a0
<4>[  532.123731] RDX: 0000000000000002 RSI: 000056403dfc1728 RDI: 0000000000000007
<4>[  532.123735] RBP: 00007fff97d75840 R08: 0000000000000000 R09: 0000000000000020
<4>[  532.123739] R10: 0000000000000006 R11: 0000000000000246 R12: 000056403dfc170c
<4>[  532.123743] R13: 000056403dfc16d6 R14: 000056403f43c0d3 R15: 000056403dfc1712
<4>[  532.123788] Code: 0f ff 31 c0 e9 32 ff ff ff 48 89 e6 4c 89 ff e8 2e c1 af ff 45 85 ed 0f 85 23 ff ff ff 89 ee 48 c7 c7 60 e6 cd 81 e8 f7 fb aa ff <0f> ff e9 0e ff ff ff 8b b7 08 01 00 00 48 8b 3f e9 72 fe ff ff 
<4>[  532.123985] ---[ end trace f873adf2e4d1d75e ]---
...
<3>[  542.186329] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:37:pipe A] flip_done timed out
Comment 5 Marta Löfstedt 2018-01-16 08:14:31 UTC
Note the: "[drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:46:pipe B] flip_done timed out" error print is almost always associated with this Softdog incomplete.
Comment 6 Marta Löfstedt 2018-03-12 11:21:40 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3910/fi-ctg-p8600/igt@debugfs_test@read_all_entries.html

from run.log:
running: igt/debugfs_test/read_all_entries

[002/288] pass: 2 -                       
owatch: TIMEOUT!
owatch: timeout for /dev/watchdog0 set to 10 (requested 10)
FATAL: command execution failed
Comment 8 Marta Löfstedt 2018-03-20 07:50:34 UTC
*** Bug 105598 has been marked as a duplicate of this bug. ***
Comment 9 Marta Löfstedt 2018-03-23 08:19:56 UTC
Here are some new incompletes with flip_done timeout previously in dmesg for drmtip_6:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_6/fi-snb-2520m/igt@kms_draw_crc@draw-method-xrgb8888-pwrite-untiled.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_6/fi-snb-2520m/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-cur-indfb-draw-pwrite.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_6/fi-snb-2520m/igt@kms_flip@2x-flip-vs-dpms-off-vs-modeset-interruptible.html
also has:

<4>[  431.504965] ------------[ cut here ]------------
<4>[  431.504979] vblank wait timed out on crtc 0
<4>[  431.505028] WARNING: CPU: 1 PID: 2089 at drivers/gpu/drm/drm_vblank.c:1084 drm_wait_one_vblank+0x154/0x160
<4>[  431.505036] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_generic i915 btusb btrtl btbcm btintel bluetooth ecdh_generic x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec crct10dif_pclmul crc32_pclmul snd_hwdep ghash_clmulni_intel e1000e snd_hda_core mei_me mei lpc_ich snd_pcm prime_numbers
<4>[  431.505209] CPU: 1 PID: 2089 Comm: kms_atomic_tran Tainted: G     U           4.16.0-rc6-gdff9ece60048-drmtip_6+ #1
<4>[  431.505216] Hardware name: LENOVO 42962WU/42962WU, BIOS 8DET56WW (1.26 ) 12/01/2011
<4>[  431.505229] RIP: 0010:drm_wait_one_vblank+0x154/0x160
<4>[  431.505237] RSP: 0018:ffffb831405cfa78 EFLAGS: 00010296
<4>[  431.505251] RAX: 000000000000001f RBX: ffff9bfc8ab50000 RCX: 0000000000000001
<4>[  431.505258] RDX: 0000000080000001 RSI: ffffffff8e076cae RDI: 00000000ffffffff
<4>[  431.505265] RBP: 0000000000000000 R08: 00000000fabce84d R09: 0000000000000000
<4>[  431.505272] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[  431.505280] R13: 000000000000564a R14: ffff9bfc54f20010 R15: ffff9bfc8ab50000
<4>[  431.505289] FS:  00007f8579f09980(0000) GS:ffff9bfc9e240000(0000) knlGS:0000000000000000
<4>[  431.505297] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  431.505304] CR2: 00007f64dfe042b8 CR3: 0000000102b2c006 CR4: 00000000000606e0
<4>[  431.505311] Call Trace:
<4>[  431.505333]  ? wait_woken+0x90/0x90
<4>[  431.505527]  ironlake_crtc_enable+0x4b6/0xcd0 [i915]
<4>[  431.505705]  intel_update_crtc+0x2f/0x80 [i915]
<4>[  431.505862]  intel_update_crtcs+0x42/0x60 [i915]
<4>[  431.506004]  intel_atomic_commit_tail+0x1b9/0xc50 [i915]
<4>[  431.506152]  intel_atomic_commit+0x240/0x320 [i915]
<4>[  431.506173]  drm_mode_atomic_ioctl+0x7bd/0x940
<4>[  431.506224]  ? drm_atomic_set_property+0x4d0/0x4d0
<4>[  431.506238]  drm_ioctl_kernel+0x56/0xb0
<4>[  431.506257]  drm_ioctl+0x28f/0x330
<4>[  431.506274]  ? drm_atomic_set_property+0x4d0/0x4d0
<4>[  431.506316]  do_vfs_ioctl+0x8c/0x690
<4>[  431.506333]  ? __task_pid_nr_ns+0xbd/0x1d0
<4>[  431.506354]  SyS_ioctl+0x36/0x70
<4>[  431.506365]  ? do_syscall_64+0x19/0x1b0
<4>[  431.506380]  do_syscall_64+0x6b/0x1b0
<4>[  431.506400]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
<4>[  431.506410] RIP: 0033:0x7f85793a45d7
<4>[  431.506417] RSP: 002b:00007fff38ca9ed8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  431.506432] RAX: ffffffffffffffda RBX: 000055a861152900 RCX: 00007f85793a45d7
<4>[  431.506439] RDX: 00007fff38ca9f30 RSI: 00000000c03864bc RDI: 0000000000000003
<4>[  431.506447] RBP: 00007fff38ca9f30 R08: 000055a861153a00 R09: 000055a861153890
<4>[  431.506454] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000c03864bc
<4>[  431.506461] R13: 0000000000000003 R14: 000055a8611539c0 R15: 000055a8611539e0
<4>[  431.506494] Code: aa ff 0f 0b e9 3b ff ff ff 48 89 e6 4c 89 f7 e8 93 9c af ff 45 85 e4 0f 85 1d ff ff ff 89 ee 48 c7 c7 a8 0e 0c 8e e8 cc c5 aa ff <0f> 0b e9 08 ff ff ff 0f 1f 44 00 00 8b b7 f8 00 00 00 48 8b 3f 
<4>[  431.506887] WARNING: CPU: 1 PID: 2089 at drivers/gpu/drm/drm_vblank.c:1084 drm_wait_one_vblank+0x154/0x160
<4>[  431.506894] ---[ end trace 51dc0030db8196ae ]---
Comment 12 Marta Löfstedt 2018-04-09 11:49:56 UTC
*** Bug 105552 has been marked as a duplicate of this bug. ***
Comment 14 Martin Peres 2018-04-17 14:52:04 UTC
Setting to high because it affects BAT super reliably!
Comment 15 Marta Löfstedt 2018-04-18 09:22:29 UTC
reproduced on BAT runs incompletes:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4061/fi-snb-2520m/igt@debugfs_test@read_all_entries.html
Last dmesg:
<3>[   82.912315] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:39:pipe A] flip_done timed out

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4060/fi-snb-2520m/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

<3>[  560.110427] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:51:pipe B] flip_done timed out
Comment 16 thsnielsen 2018-06-28 20:14:31 UTC
i have this on 3 machines thinkpad x220
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* CRTC:39:pipe A] flip_done timed out
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:52:LVDS-1] flip_done timed out
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out
[drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:39:pipe A] flip_done timed out

both opensuse leap 15 and opensuse tumbleweed :
Linux linux-mkb6 4.17.2-1-default #1 SMP PREEMPT Sat Jun 16 10:58:03 UTC 2018 (ddde22d) x86_64 x86_64 x86_64 GNU/Linux

reproduces every time at shutdown and imposes delays.
Anything i can provide to help (supposing i am really onto the right problem :-)
Comment 17 Daniel Vetter 2018-07-04 19:30:05 UTC
Limiting to Sandybridge, the
Comment 18 Daniel Vetter 2018-07-04 19:30:53 UTC
Limiting to SNB. The ilk one isn't tracked in cibuglog anymore afaict.

And the i965/g4x version of the flip_done timeout is tracked in bug #105255
Comment 19 Martin Peres 2018-09-04 12:17:01 UTC
*** Bug 104573 has been marked as a duplicate of this bug. ***
Comment 20 Martin Peres 2018-10-24 13:56:26 UTC
Also seen on ICL: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_132/fi-icl-u2/igt@kms_plane_multiple@atomic-pipe-c-tiling-y.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_132/fi-icl-u/igt@kms_chv_cursor_fail@pipe-a-128x128-bottom-edge.html

<3> [191.798620] [drm:intel_pipe_update_start [i915]] *ERROR* PSR idle timed out 0xc4030209, atomic update may fail
<3> [191.801937] [drm:intel_pipe_update_start [i915]] *ERROR* Potential atomic update failure on pipe A
<3> [201.995909] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:45:pipe A] flip_done timed out
Comment 21 Maarten Lankhorst 2018-10-26 10:04:00 UTC
Different bug on ICL. More related to PSR, the other to CRTC enable.
Comment 22 Jani Saarinen 2018-10-26 14:23:38 UTC
Need also wait for 2392 BIOS results.
Comment 23 Martin Peres 2018-10-26 14:33:00 UTC
(In reply to Maarten Lankhorst from comment #21)
> Different bug on ICL. More related to PSR, the other to CRTC enable.

Thanks for the info, here it is: https://bugs.freedesktop.org/show_bug.cgi?id=108567
Comment 24 James Ausmus 2018-10-26 17:18:24 UTC
Moving back to Jani and SNB-only, as the ICL/PSR related bug got split to 108567
Comment 25 Ville Syrjala 2018-11-19 14:30:38 UTC
commit 03981c6ebec4fc7056b9b45f847393aeac90d060
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Wed Nov 14 19:34:40 2018 +0200

    drm/i915: Disable LP3 watermarks on all SNB machines
Comment 26 Francesco Balestrieri 2018-12-28 09:38:05 UTC
Looks like it's gone, closing.
Comment 27 Ville Syrjala 2019-02-25 16:10:41 UTC
For future reference:
bug 101269 shows a Thinkpad T420 being affected
bug 104573 comment 5 suggests that Lifebook T901 is also affected

Seems to be a rather widespread issue.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.