A lot of tests generate the following dmesg warning on the machine fi-bxt-dsi (Broxton with a DSI panel attached) when running IGT:

[ 522.912705] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=21 end=22) time 186 us, min 1192, max 1199, scanline start 1, end 1

Full logs:
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@pm_rps@basic-api.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@prime_vgem@basic-busy-default.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_exec_flush@basic-wb-ro-before-default.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_ctx_switch@basic-default.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_exec_reloc@basic-write-read-noreloc.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_pread@basic.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_wait@basic-wait-all.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_busy@basic-busy-default.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_ringfill@basic-default-fd.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@gem_exec_reloc@basic-gtt-read-noreloc.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@drv_module_reload@basic-no-display.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3001/fi-bxt-dsi/igt@kms_pipe_crc_basic@read-crc-pipe-a.html
This bug happens randomly on all tests, which makes it a blocker for CI.
The suggestion was that PIPEDSL() doesn't work on DSI, and always returns 0 (bumped to 1 by crtc->scanline_offset). That means that vblank avoidance is non-existent.
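To illustrate why a stuck scanline defeats the evasion, here is a minimal sketch. It is not the actual i915 code: the helper name must_wait_for_vblank() is hypothetical, and it assumes the evasion only delays the update while the reported scanline lies inside the [min, max] window printed in the warning.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from i915: returns true when the
 * update should be delayed because the reported scanline lies inside
 * the evasion window [min, max] just before vblank.
 */
bool must_wait_for_vblank(int scanline, int min, int max)
{
	return scanline >= min && scanline <= max;
}
```

With the values from the log (min 1192, max 1199) and a scanline that always reads 1, this check never asks for a wait, so the update can start at any point of the frame, including right before vblank.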
Bspec topic 7652 on PIPEDSL (or PIPE_SCANLINE) says, "Not supported with MIPI DSI".
For gen9 platforms, DSI timings are driven from the port instead of the pipe (unlike DDI). Thus, we can't rely on pipe registers to get the timing information, and even the scanline register read is not functional. This causes the vblank evasion logic to fail, since it relies on the scanline, which results in the atomic update failure warnings.

We created a patch which uses the pipe framestamp and current timestamp registers to calculate the scanline. This is an indirect way to get the scanline. It helps resolve the atomic update failure for gen9 DSI platforms.

https://patchwork.kernel.org/patch/9944249/
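The arithmetic behind such a timestamp-based scanline can be sketched roughly as follows. This is an illustration only, not the code from the patch: the function name and parameters are hypothetical, and it assumes both timestamps are 32-bit microsecond counters, that the frame timestamp is latched at the start of vblank, and that the pixel clock is given in kHz.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: estimate the current scanline from the time
 * elapsed since the last frame timestamp.
 *
 * frame_start_us:  timestamp latched at the start of vblank (assumed)
 * now_us:          current timestamp, same 32-bit microsecond counter
 * pixel_clock_khz: pixel clock in kHz
 * htotal, vtotal:  total pixels per line and lines per frame
 * vblank_start:    first scanline of the vertical blanking period
 */
uint32_t scanline_from_timestamps(uint32_t frame_start_us, uint32_t now_us,
				  uint32_t pixel_clock_khz, uint32_t htotal,
				  uint32_t vtotal, uint32_t vblank_start)
{
	uint32_t elapsed_us;
	uint64_t lines;

	if (htotal == 0 || vtotal == 0)
		return 0;

	/* Unsigned subtraction stays correct across counter wraparound. */
	elapsed_us = now_us - frame_start_us;

	/* us * kHz = pixels * 1000, so divide by 1000 * htotal for lines. */
	lines = ((uint64_t)elapsed_us * pixel_clock_khz) /
		(1000ULL * htotal);

	/* Never report more than one frame's worth of lines. */
	if (lines > vtotal - 1)
		lines = vtotal - 1;

	/* The count starts at vblank start, so offset and wrap. */
	return (uint32_t)((lines + vblank_start) % vtotal);
}
```

For example, with a 150000 kHz pixel clock, htotal 2000, vtotal 1250 and vblank_start 1200, an elapsed time of 1000 us corresponds to 75 lines, which maps to scanline 25.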
The patches also fix the issues when run on trybot: https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_1167/
(In reply to Uma Shankar from comment #4)
> https://patchwork.kernel.org/patch/9944249/

Side note, please use the patchwork instance at https://patchwork.freedesktop.org/project/intel-gfx only. We have all the CI results etc. there. Plus you can reference patches using the message-id: http://patchwork.freedesktop.org/patch/msgid/1506336810-3706-1-git-send-email-vidya.srinivas@intel.com
Patch merged: https://cgit.freedesktop.org/drm-tip/commit/?id=aec0246f3e3882065b5c29916a84b539afe4e4af CI also green on that run. Resolving and whitelisting machine on CI.
The issue is back but this time only on one test. https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/fi-bxt-dsi/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
We executed the test in a loop for around 4 hours with a delay of 5 sec between each iteration. We couldn't reproduce this issue on our end. Is there any specific scenario (BKM) to reproduce it? Does it happen consistently, or is it a one-off/random kind of issue?

We tested with the configuration below:
DSI APL with AUO panel (1920x1200p)
branch: drm-tip
top commit:
commit 247cb84af034b8e90ecd22cd69adb13a7a305350
Author: Jani Nikula <jani.nikula@intel.com>
Date: Thu Oct 5 09:44:50 2017 +0300
drm-tip: 2017y-10m-05d-06h-44m-09s UTC integration manifest

As per our understanding, this particular issue is not related to the evasion issues caused by scanline reporting and is more of a modeset/flip sequence/locking problem related to the cursor. So, though the error message is similar, the cause is different. Hence it will require a separate debug.

Note: There have been some fixes related to cursor updates by Maarten. Can Maarten also have a look in parallel at why this cursor test is failing? Any pointers based on the past debug would be helpful.
Could you try with the debug kernel then? It is more likely to cause timeouts because lockdep makes it slow. https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/kernel.config.bz2
We tested with the drm.debug=0x2e command line parameter (all logs were enabled). We are not able to reproduce the issue.
*** Bug 103020 has been marked as a duplicate of this bug. ***
Lowering, as this is not really a blocker anymore.
The issue hasn't been seen since CI_DRM_3180: 2017-10-05