(kms_flip:7832) CRITICAL: Test assertion failure function run_test_step, file kms_flip.c:709: (kms_flip:7832) CRITICAL: Failed assertion: gettime_us() - start < 500 (kms_flip:7832) CRITICAL: Last errno: 25, Inappropriate ioctl for device Subtest flip-vs-expired-vblank failed. https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3874/shard-snb3/igt@kms_flip@flip-vs-expired-vblank.html
Platform is set as ALL as this failure occurs in ALL machines.
Also happening in shards, bumping the priority!
It seems that we exceed the timeout when switching to test another pipe i.e. from pipe A to pipe B flip-vs-expired-vblank-interruptible on pipe A, connector HDMI-A-1: PASSED (kms_flip:7192) INFO: Beginning flip-vs-expired-vblank-interruptible on pipe B, connector HDMI-A-1 (kms_flip:7192) igt_fb-DEBUG: igt_create_fb_with_bo_size(width=1920, height=1080, format=XR24(0x34325258), modifier=0x0, size=0) (kms_flip:7192) igt_fb-DEBUG: igt_create_fb_with_bo_size(handle=2, pitch=7680) (kms_flip:7192) ioctl_wrappers-DEBUG: Test requirement passed: igt_has_fb_modifiers(fd) (kms_flip:7192) igt_fb-DEBUG: igt_create_fb_with_bo_size(width=1920, height=1080, format=XR24(0x34325258), modifier=0x0, size=0) (kms_flip:7192) igt_fb-DEBUG: igt_create_fb_with_bo_size(handle=3, pitch=7680) (kms_flip:7192) ioctl_wrappers-DEBUG: Test requirement passed: igt_has_fb_modifiers(fd) (kms_flip:7192) igt_fb-DEBUG: Test requirement passed: cairo_surface_status(fb->cairo_surface) == CAIRO_STATUS_SUCCESS (kms_flip:7192) igt_fb-DEBUG: Test requirement passed: cairo_surface_status(fb->cairo_surface) == CAIRO_STATUS_SUCCESS (kms_flip:7192) igt_kms-INFO: 1920x1080 60 1920 2008 2052 2200 1080 1083 1088 1125 0x9 0x48 148500 (kms_flip:7192) DEBUG: name = flip last_ts = 0.000000 last_received_ts = 0.000000 last_seq = 0 current_ts = 1611.736816 current_received_ts = 1611.736206 current_seq = 65053 count = 0 seq_step = 1 (kms_flip:7192) DEBUG: name = flip The last_ts is initialized as 0. We grab the current_ts which is within normal range. However, the difference between these two values exceed the test limit of 500.
The test is making sure that blocking wait on a expired vblank (that has already happened, so the current_seq == requested_seq) should be "immediate", i.e. taking no longer than 500us. This tests fails in all its variants, 2x, interruptible, non-interruptible. 500us seems like a plenty of time, but I have especially bad intuition when it comes to time scales. I would advise extending logging to that we igt_debug() the time it takes for a successful and failed wait, so we can inspect the time scale, and confirm that 500us is good enough. Another culprit may be the method of getting the current time and type casting done along the way. Reading both the kernel handler for the IOCTL and libdrm wrapper I have no idea where we could waste time other than some wonky OS scheduling / context switching. User impact: hard to tell, without more information on the failed waits, but should be no more than "medium", assuming they are not taking forever.
Posted a patch to improve logging: https://patchwork.freedesktop.org/patch/313695/
The CI Bug Log issue associated to this bug has been updated. ### New filters associated * all machines: igt@kms_flip@(2x-)flip-vs-expired-vblank(-interruptible) - Failed assertion: end - start < 500 (No new failures associated)
commit 15ad664534413628f06c0f172aac11598bfdb895 Author: Simon Ser <simon.ser@intel.com> Date: Fri Jun 14 13:52:23 2019 +0300 tests/kms_flip: improve logging when timing out Signed-off-by: Simon Ser <simon.ser@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105363 Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
(kms_flip:1523) DEBUG: Vblank took 78us (kms_flip:1523) DEBUG: Vblank took 225us (kms_flip:1523) DEBUG: Vblank took 95us (kms_flip:1523) DEBUG: Vblank took 202us (kms_flip:1523) DEBUG: Vblank took 78us (kms_flip:1523) DEBUG: Vblank took 77us (kms_flip:1523) DEBUG: Vblank took 47us (kms_flip:1523) DEBUG: Vblank took 78us (kms_flip:1523) DEBUG: Vblank took 77us (kms_flip:1523) DEBUG: Vblank took 80us (kms_flip:1523) DEBUG: Vblank took 79us (kms_flip:1523) DEBUG: Vblank took 78us (kms_flip:1523) DEBUG: Vblank took 61us (kms_flip:1523) DEBUG: Vblank took 78us (kms_flip:1523) DEBUG: Vblank took 80us (kms_flip:1523) DEBUG: Vblank took 525us (kms_flip:1523) CRITICAL: Test assertion failure function run_test_step, file ../tests/kms_flip.c:698: (kms_flip:1523) CRITICAL: Failed assertion: end - start < 500
Few notes: 1. The message could be improved. It seems to suggest that this is interval between vblanks instead of the time it takes doing a wait on expired one 2. Going through recent logs the time is usually ~25us with occasional spikes to ~150us. And there are the outliers which we fail on: (kms_flip:1816) DEBUG: Vblank took 1456us (kms_flip:5124) DEBUG: Vblank took 1088us (kms_flip:1373) DEBUG: Vblank took 501us (kms_flip:2150) DEBUG: Vblank took 842us (kms_flip:5198) DEBUG: Vblank took 3301us Still no idea what may cause this. It seems to be too much of a lag even for very unfavourable scheduling. The only locking that happens on the call path is in drm_vblank_get() which takes on a few spinlocks. I believe it's worth investigating. 3ms of delay for 16ms frames (60fps) is a lot.
Based on my free time, would try to look into this issue
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/79.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.