Starting from drmtip_50, we got over a thousand "*ERROR* Potential atomic update failure on pipe *" in dmesg. This is a pretty serious regression :s

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_50/fi-cfl-s3/igt@kms_cursor_crc@cursor-128x42-offscreen.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_50/fi-cfl-u/igt@kms_cursor_crc@cursor-256x256-suspend.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_50/fi-kbl-7560u/igt@perf_pmu@idle-no-semaphores-bcs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_50/fi-kbl-r/igt@perf_pmu@idle-no-semaphores-bcs0.html

[ 158.366252] [drm:intel_pipe_update_start [i915]] *ERROR* Potential atomic update failure on pipe A
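For anyone who wants to reproduce the count from a saved dmesg, here is a minimal sketch. The pattern is built from the error line quoted in this report; the helper name `count_failures` is made up for illustration and is not part of any CI tooling.

```python
import re
from collections import Counter

# Pattern derived from the dmesg line quoted in this report.
PATTERN = re.compile(r"\*ERROR\* Potential atomic update failure on pipe (\w)")

def count_failures(dmesg_text):
    """Return a Counter mapping pipe letter to the number of matching errors."""
    return Counter(m.group(1) for m in PATTERN.finditer(dmesg_text))

# The only sample used here is the line from the report itself.
sample = (
    "[ 158.366252] [drm:intel_pipe_update_start [i915]] "
    "*ERROR* Potential atomic update failure on pipe A\n"
)
print(count_failures(sample))
```

Feeding it the full dmesg text of one of the runs linked in this report should give the per-pipe totals.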
Differences in drm-misc-fixes:

$ git shortlog 2b6207291b7b277a5df9d1aab44b56815a292dba...2bc5ff0bdc00d81d719dad74589317a260d583ed
Dhinakaran Pandiyan (1):
      drm/psr: Fix missed entry in PSR setup time table.

Tomi Valkeinen (1):
      drm/omap: fix NULL deref crash with SDI displays

Differences in drm-intel-fixes:

$ git shortlog 771c577c23bac90597c685971d7297ea00f99d11...57ebdafc306af9decd893b4cb11bd834a7e27ed1
Chris Wilson (2):
      drm/i915/lvds: Move acpi lid notification registration to registration phase
      drm/i915/query: Protect tainted function pointer lookup

Ondrej Zary (1):
      drm/i915: Disable LVDS on Radiant P845

Ville Syrjälä (1):
      drm/i915: Restore planes after load detection

Differences in drm-misc-next:

$ git shortlog 3c5f134ac9d0e405a15af652c3ce8cbaa9bf1bc7...2edd4e698dc8a0c497a502c75561c87be0e8a9a6
Chris Wilson (4):
      drm/mm: Reject over-sized allocation requests early
      drm/mm: Add a search-by-address variant to only inspect a single hole
      drm/i915: Limit searching for PIN_HIGH
      drm/i915: Pin the ring high

Souptick Joarder (1):
      gpu: drm: vgem: Change return type to vm_fault_t

Differences in drm-intel-next-queued:

$ git shortlog c894d63c6b36de20f0248d88801be5ace8e6bee8...09a4c02e58c1b3d9748f78242962b7f63c68477e
Chris Wilson (1):
      drm/i915: Look for an active kernel context before switching

Dhinakaran Pandiyan (6):
      drm/i915/psr: Nuke PSR support for VLV and CHV
      drm/i915/psr: Avoid DPCD reads when panel does not support PSR
      drm/i915/psr: Check for SET_POWER_CAPABLE bit at PSR init time.
      drm/i915/psr: Avoid unnecessary DPCD read of DP_PSR_CAPS
      drm/i915/psr: Fall back to max. synchronization latency if DPCD read fails
      drm/i915/psr: Fix ALPM cap check for PSR2

Vathsala Nagaraju (1):
      drm/i915/psr: vbt change for psr

Yunwei Zhang (3):
      drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads
      drm/i915/icl: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads
      drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads

Are the failures on PSR capable systems by any chance?
Seems to be the case: all the systems have PSR in common. Does the issue trigger on drm-intel-next-queued runs as well?
We don't really know, because the shards don't have PSR panels, and there is no "shard-run" for anything other than drm-tip.
Trimmed list of suspects based on gut feeling.

(In reply to Maarten Lankhorst from comment #1)
> Dhinakaran Pandiyan (1):
>       drm/psr: Fix missed entry in PSR setup time table.
>
> Ville Syrjälä (1):
>       drm/i915: Restore planes after load detection
>
> Dhinakaran Pandiyan (6):
>       drm/i915/psr: Nuke PSR support for VLV and CHV
>       drm/i915/psr: Avoid DPCD reads when panel does not support PSR
>       drm/i915/psr: Check for SET_POWER_CAPABLE bit at PSR init time.
>       drm/i915/psr: Avoid unnecessary DPCD read of DP_PSR_CAPS
>       drm/i915/psr: Fall back to max. synchronization latency if DPCD read fails
>       drm/i915/psr: Fix ALPM cap check for PSR2
>
> Vathsala Nagaraju (1):
>       drm/i915/psr: vbt change for psr
(In reply to Martin Peres from comment #0)
> Starting from drmtip_50, we got over a thousand "*ERROR* Potential atomic
> update failure on pipe *" in dmesg. This is a pretty serious regression :s

@martin,
What kind of runs are these? I don't see frontbuffer_tracking: fbcpsr* tests being part of fastfeedback. And isn't the full suite executed only on shards?
(In reply to Dhinakaran Pandiyan from comment #5)
> (In reply to Martin Peres from comment #0)
> > Starting from drmtip_50,
>
> @martin,
> What kind of runs are these? I don't see frontbuffer_tracking: fbcpsr* tests
> being part of fastfeedback. And isn't the full suite executed only on shards?

These are the runs with the shards machine's testlist (CI-Full), but executed during the idle time of all the other machines. We get about 4 to 6 of these runs per week.
"drm/i915/psr: vbt change for psr" changed the exit link training time from 500 us to 2.5 ms on these machines. The frame counter is possibly stuck for a longer duration now and pipe_update_start() is not aware that the counter is stuck and warns.

We've been discussing this problem for some time now and the VBT change appears to have made it more likely to occur.

Related discussion can be found in the April email archives under:
"[Intel-gfx] [RFC] drm/i915: Rework "Potential atomic update error" to handle PSR exit"

I wish this was caught in pre-merge instead of these drm-tip runs.
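To make the timing argument concrete, here is a toy model, not the i915 code. It assumes the wait in pipe_update_start() gives up after a fixed timeout and that the counter stays frozen for the whole PSR exit. The 1000 us timeout, the 50 us poll step and the function name are assumptions chosen for illustration.

```python
def wait_for_running_counter(stuck_us, timeout_us=1000, poll_us=50):
    """Toy model of the wait that precedes a plane update.

    stuck_us:   how long the counter stays frozen (e.g. during PSR exit).
    timeout_us: assumed upper bound on the wait (hypothetical value).
    poll_us:    assumed polling step (hypothetical value).

    Returns True if the counter starts moving before the timeout,
    False if the wait times out (the case that would log the error).
    """
    waited = 0
    while waited < stuck_us:          # counter is still frozen
        if waited >= timeout_us:      # gave up waiting: would warn
            return False
        waited += poll_us
    return True

# Exit link training times mentioned in this comment.
print(wait_for_running_counter(500))    # old VBT value
print(wait_for_running_counter(2500))   # new VBT value
```

Under these assumptions the 500 us case finishes inside the timeout and the 2.5 ms case does not, which would match the jump in error count.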
There were no PSR panels on the shards.
(In reply to Dhinakaran Pandiyan from comment #7)
> "drm/i915/psr: vbt change for psr" changed the exit link training time from
> 500 us to 2.5 ms on these machines. The frame counter is possibly stuck for
> a longer duration now and pipe_update_start() is not aware that the counter
> is stuck and warns.
>
> We've been discussing this problem for some time now and the VBT change
> appears to have made it more likely to occur.
>
> Related discussion can be found in the April email archives under:
> "[Intel-gfx] [RFC] drm/i915: Rework "Potential atomic update error" to
> handle PSR exit"
>
> I wish this was caught in pre-merge instead of these drm-tip runs.

The failure is still happening... If making a patch to fix this issue is taking too long, why has this patch not been reverted yet?

We need to be more aggressive at keeping the bug count low...
Fixes merged to drm-tip:

c3d433617d20 drm/i915: Use crtc_state->has_psr instead of CAN_PSR for pipe update
a608987970b9 drm/i915: Wait for PSR exit before checking for vblank evasion

Marking the bug resolved, please re-open if there are "Potential atomic update failure on pipe A" errors on *PSR* machines.
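Going only by the two commit subjects, the idea of the fix can be sketched as a toy model. This is not the merged code: the function name, the parameters and the 1000 us timeout are all hypothetical, and it assumes the counter is frozen for the whole PSR exit.

```python
def model_pipe_update_start(psr_exit_us, has_psr, wait_for_psr_exit,
                            timeout_us=1000):
    """Toy model: True if the evasion wait completes, False if it would warn.

    psr_exit_us:       time PSR needs to exit (counter assumed frozen meanwhile).
    has_psr:           whether PSR is actually enabled on this pipe's state.
    wait_for_psr_exit: whether the update first blocks until PSR has exited.
    timeout_us:        assumed timeout of the evasion wait (hypothetical value).
    """
    # Without PSR enabled on this pipe the counter is never frozen.
    stuck_us = psr_exit_us if has_psr else 0
    if has_psr and wait_for_psr_exit:
        # PSR exit completes before the evasion check starts,
        # so the counter is already running when the timed wait begins.
        stuck_us = 0
    return stuck_us < timeout_us

print(model_pipe_update_start(2500, has_psr=True, wait_for_psr_exit=False))
print(model_pipe_update_start(2500, has_psr=True, wait_for_psr_exit=True))
print(model_pipe_update_start(2500, has_psr=False, wait_for_psr_exit=False))
```

In this model only the first call, PSR enabled and no wait for exit, ends in the warning case. Keying the wait on `has_psr` mirrors the first commit subject: only pipes that really have PSR enabled pay for the extra wait.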
Martin, OK to close?
Not seen in cibuglogger, closing.