Summary: | [CI] igt@kms_flip@* - fail - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 | ||
---|---|---|---|
Product: | DRI | Reporter: | Marta Löfstedt <marta.lofstedt> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | RESOLVED WORKSFORME | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | arkadiusz.hiler, intel-gfx-bugs |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | ALL | i915 features: | display/Other |
Description
Marta Löfstedt
2018-03-23 08:54:52 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_7/fi-cnl-y3/igt@kms_flip@wf_vblank-ts-check-interruptible.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_7/fi-kbl-r/igt@kms_flip@wf_vblank-ts-check-interruptible.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_7/fi-skl-6600u/igt@kms_flip@wf_vblank-ts-check-interruptible.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_8/fi-ctg-p8600/igt@kms_flip@blocking-wf_vblank.html (kms_flip:1268) CRITICAL: Test assertion failure function calibrate_ts, file ../tests/kms_flip.c:1126: (kms_flip:1268) CRITICAL: Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 (kms_flip:1268) CRITICAL: Last errno: 16, Device or resource busy Subtest blocking-wf_vblank failed. We also got the following WARNING in dmesg: [ 300.104209] vblank wait timed out on crtc 1 https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_8/fi-ctg-p8600/igt@kms_flip@basic-flip-vs-wf_vblank.html (kms_flip:1554) CRITICAL: Test assertion failure function calibrate_ts, file ../tests/kms_flip.c:1126: (kms_flip:1554) CRITICAL: Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 (kms_flip:1554) CRITICAL: Last errno: 16, Device or resource busy Subtest basic-flip-vs-wf_vblank failed. https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_21/fi-kbl-r/igt@kms_flip@flip-vs-blocking-wf-vblank.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_21/fi-skl-6600u/igt@kms_flip@flip-vs-blocking-wf-vblank.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_23/fi-ctg-p8600/igt@kms_flip@2x-flip-vs-wf_vblank.html (kms_flip:1332) CRITICAL: Test assertion failure function calibrate_ts, file ../tests/kms_flip.c:1126: (kms_flip:1332) CRITICAL: Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 (kms_flip:1332) CRITICAL: Last errno: 16, Device or resource busy Subtest 2x-flip-vs-wf_vblank failed. [ 408.544218] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:39:pipe A] flip_done timed out [ 418.784228] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CONNECTOR:52:LVDS-1] flip_done timed out [ 429.024211] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:28:primary A] flip_done timed out [ 429.128199] ------------[ cut here ]------------ [ 429.128211] vblank wait timed out on crtc 0 [ 429.128268] WARNING: CPU: 0 PID: 1332 at drivers/gpu/drm/drm_vblank.c:1084 drm_wait_one_vblank+0x175/0x180 [ 429.128276] Modules linked in: vgem i915 coretemp snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core lpc_ich snd_pcm mei_me e1000e mei prime_numbers [ 429.128400] CPU: 0 PID: 1332 Comm: kms_flip Tainted: G U 4.17.0-rc1-gc1645edc253f-drmtip_23+ #1 [ 429.128409] Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008 [ 429.128421] RIP: 0010:drm_wait_one_vblank+0x175/0x180 [ 429.128429] RSP: 0018:ffffa4618075ba38 EFLAGS: 00010282 [ 429.128446] RAX: 0000000000000000 RBX: ffff939725320000 RCX: 0000000000000001 [ 429.128454] RDX: 0000000080000001 RSI: ffffffffb6084f36 RDI: 00000000ffffffff [ 429.128463] RBP: 0000000000000000 R08: 0000000033a4da69 R09: 0000000000000000 [ 429.128471] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 429.128480] R13: 0000000000004a85 R14: ffff93971ed05d30 R15: ffff93972dc52e98 [ 429.128490] FS: 00007ff258c87980(0000) GS:ffff93973bc00000(0000) knlGS:0000000000000000 [ 429.128499] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 429.128508] CR2: 00007f77a19f0068 CR3: 000000012cdf2000 CR4: 00000000000406f0 [ 429.128516] Call Trace: [ 429.128537] ? wait_woken+0x90/0x90 [ 429.128714] intel_pre_plane_update+0x17c/0x220 [i915] [ 429.128883] intel_atomic_commit_tail+0xce/0xd20 [i915] [ 429.129067] intel_atomic_commit+0x240/0x320 [i915] [ 429.129087] ? drm_mode_getcrtc+0x170/0x170 [ 429.129102] drm_atomic_helper_set_config+0x7b/0x90 [ 429.129109] __drm_mode_set_config_internal+0x62/0x120 [ 429.129109] drm_mode_setcrtc+0x41b/0x5c0 [ 429.129109] ? drm_mode_getcrtc+0x170/0x170 [ 429.129109] drm_ioctl_kernel+0x7c/0xf0 [ 429.129109] drm_ioctl+0x2e6/0x3a0 [ 429.129109] ? drm_mode_getcrtc+0x170/0x170 [ 429.129109] do_vfs_ioctl+0xa0/0x6c0 [ 429.129109] ? lock_acquire+0xa6/0x210 [ 429.129109] ksys_ioctl+0x35/0x60 [ 429.129109] __x64_sys_ioctl+0x11/0x20 [ 429.129109] do_syscall_64+0x55/0x190 [ 429.129109] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 429.129109] RIP: 0033:0x7ff2579d85d7 [ 429.129109] RSP: 002b:00007ffdb09af698 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 429.129109] RAX: ffffffffffffffda RBX: 00007ffdb09af6d0 RCX: 00007ff2579d85d7 [ 429.129109] RDX: 00007ffdb09af6d0 RSI: 00000000c06864a2 RDI: 0000000000000003 [ 429.129109] RBP: 00007ffdb09af6d0 R08: 0000000000000000 R09: 0000000000000000 [ 429.129109] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c06864a2 [ 429.129109] R13: 0000000000000003 R14: 00007ff2588856d1 R15: 00007ff258886c34 [ 429.129109] Code: 2f ff ff ff e8 1d 71 a7 ff 48 89 e6 4c 89 f7 e8 c2 83 ac ff 45 85 e4 0f 85 0c ff ff ff 89 ee 48 c7 c7 38 fa 0c b6 e8 1b 6e a7 ff <0f> 0b e9 f7 fe ff ff 0f 1f 40 00 8b b7 f8 00 00 00 48 8b 3f e9 [ 429.129109] WARNING: CPU: 0 PID: 1332 at drivers/gpu/drm/drm_vblank.c:1084 drm_wait_one_vblank+0x175/0x180 [ 429.129109] irq event stamp: 111740 [ 429.129109] hardirqs last enabled at (111739): [<ffffffffb50fbf06>] console_unlock+0x426/0x640 [ 429.129109] hardirqs last disabled at (111740): [<ffffffffb5a0111c>] error_entry+0x7c/0x100 [ 429.129109] softirqs last enabled at (111456): [<ffffffffb5c0032b>] __do_softirq+0x32b/0x4e1 [ 429.129109] softirqs last disabled at (111449): [<ffffffffb508f594>] irq_exit+0xa4/0xb0 [ 429.129109] ---[ end trace 89784d36fb754821 ]--- Used to happen every roughly every 3 runs, not it has not happened since drmtip_48 (3 weeks, 5 days / 18 runs ago). Closing! I closed it too soon... https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-cfl-s3/igt@kms_flip@wf_vblank-ts-check.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-skl-6600u/igt@kms_flip@plain-flip-fb-recreate-interruptible.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-kbl-r/igt@kms_flip@plain-flip-fb-recreate-interruptible.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-kbl-7560u/igt@kms_flip@plain-flip-fb-recreate-interruptible.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-cfl-s3/igt@kms_flip@plain-flip-fb-recreate-interruptible.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_92/fi-kbl-r/igt@kms_flip@wf_vblank-ts-check.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-skl-6600u/igt@kms_flip@wf_vblank-ts-check.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-kbl-r/igt@kms_flip@wf_vblank-ts-check.html https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_90/fi-kbl-7560u/igt@kms_flip@wf_vblank-ts-check.html (kms_flip:1226) CRITICAL: Test assertion failure function calibrate_ts, file ../tests/kms_flip.c:1129: (kms_flip:1226) CRITICAL: Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 (kms_flip:1226) CRITICAL: Last errno: 16, Device or resource busy Subtest wf_vblank-ts-check failed.(km Update: Last seen drmtip_104 (1 month, 1 week / 590 runs ago). The frequency of this bug occurrence is in between 1 week to 2 months. A CI Bug Log filter associated to this bug has been updated: {- all machines: igt@kms_flip@* - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 -} {+ all machines: igt@kms_flip@* - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 +} New failures caught by the filter: * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_192/fi-skl-6600u/igt@kms_flip@plain-flip-fb-recreate.html A CI Bug Log filter associated to this bug has been updated: {- all machines: igt@kms_flip@* - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 -} {+ all machines: igt@kms_flip@* - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0 +} New failures caught by the filter: * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_240/fi-kbl-7560u/igt@kms_flip@2x-plain-flip-fb-recreate.html This is an old one that we haven't seen for 3.5 months. Prior to that the reproduction rate varied quite a bit. Shortly before disappearance it was seen about once every two weeks. This means we can close the bug after 20 weeks ~= 5 months, so around mid August 2019. As of what happens, code explains it pretty well (from kms_flip.c/calibrate_ts()): memset(&wait, 0, sizeof(wait)); wait.request.type = kmstest_get_vbl_flag(crtc_idx); wait.request.type |= DRM_VBLANK_RELATIVE | DRM_VBLANK_NEXTONMISS; do_or_die(drmWaitVBlank(drm_fd, &wait)); last_seq = wait.reply.sequence; last_timestamp = wait.reply.tval_sec; last_timestamp *= 1000000; last_timestamp += wait.reply.tval_usec; memset(&wait, 0, sizeof(wait)); wait.request.type = kmstest_get_vbl_flag(crtc_idx); wait.request.type |= DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT; wait.request.sequence = last_seq; for (n = 0; n < CALIBRATE_TS_STEPS; n++) { drmVBlank check = {}; ++wait.request.sequence; do_or_die(drmWaitVBlank(drm_fd, &wait)); /* Double check that haven't already missed the vblank */ check.request.type = kmstest_get_vbl_flag(crtc_idx); check.request.type |= DRM_VBLANK_RELATIVE; do_or_die(drmWaitVBlank(drm_fd, &check)); igt_assert(!igt_vblank_after(check.reply.sequence, wait.request.sequence)); } So we are waiting for a begining of next vblank to get its seq number (NEXTONMISS), then for the next CALIBRATE_TS_STEPS we are waiting for the very next vblank (DRM_VBLANK_ABSOLUTE, ++wait.request.sequence) and double checking. drmWaitVBlank(drm_fd, &wait) seems to be failing with -EBUSY: do { ret = ioctl(fd, DRM_IOCTL_WAIT_VBLANK, vbl); vbl->request.type &= ~DRM_VBLANK_RELATIVE; if (ret && errno == EINTR) { clock_gettime(CLOCK_MONOTONIC, &cur); /* Timeout after 1s */ if (cur.tv_sec > timeout.tv_sec + 1 || (cur.tv_sec == timeout.tv_sec && cur.tv_nsec >= timeout.tv_nsec)) { errno = EBUSY; ret = -1; break; } } } while (ret && errno == EINTR); out: return ret; Seems like we are missing the vblank with given seq and then bailing out after 1s with EBUSY, which sounds rather serious and may point to a bug in the kernel, as this amount of code should not take to long even with unfavorable scheduling. Let's see whether it reappears, but hopefully it was "accidentally fixed." It's past mid-August already and the bug haven't reappeared :-) Let's close it. The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.