Starting subtest: fbcpsr-2p-rte
(kms_frontbuffer_tracking:968) igt_kms-CRITICAL: Test assertion failure function igt_drm_plane_commit, file ../lib/igt_kms.c:2669:
(kms_frontbuffer_tracking:968) igt_kms-CRITICAL: Failed assertion: ret == 0
(kms_frontbuffer_tracking:968) igt_kms-CRITICAL: Last errno: 22, Invalid argument
(kms_frontbuffer_tracking:968) igt_kms-CRITICAL: error: -22 != 0
The CI Bug Log issue associated with this bug has been updated.
### New filters associated
* TGL: igt@kms_frontbuffer_tracking@*2p* - fail - Failed assertion: ret == 0
<7> [575.504041] [drm:intel_bw_atomic_check [i915]] Bandwidth 12002 MB/s exceeds max available 9452 MB/s (4 active planes)
The failure specifically happens in the "2p" variants (i.e., the ones that use two pipes) of the "rte" subtests (i.e., the basic sanity tests that make sure general features like FBC/PSR are operating properly). These subtests don't do anything fancy: they turn everything off, turn on both pipes, turn on the relevant planes, enable the FBC/PSR feature, and make sure nothing fails.
It looks like the test is driving a 3840x2160 mode on pipe A (eDP) and a 5120x2880 mode on pipe B (DP). In addition to the primary plane, the cursor (64x64) and one sprite (64x64) are enabled at the point where we fail due to memory bandwidth. Our driver ignores cursor planes for the purpose of memory bandwidth calculations, so we can focus on just the primary and sprite planes. The CRTC clock for pipe A is 533250 kHz, so
Pipe A primary data rate: 533250 kHz * 4 Bpp = 2,133,000 kB/s
Pipe A sprite data rate: 533250 kHz * 4 Bpp = 2,133,000 kB/s
and the CRTC clock for pipe B is 967000 kHz, so
Pipe B primary data rate: 967000 kHz * 4 Bpp = 3,868,000 kB/s
Pipe B sprite data rate: 967000 kHz * 4 Bpp = 3,868,000 kB/s
for a total data rate of 12,002,000 kB/s, i.e., the "12002 MB/s" in the dmesg message above.
According to the bspec, plane data rates are based on the CRTC clock and are independent of the size of the plane itself, so a tiny 64x64 plane is charged as much as a full-screen plane. That computation seems questionable (especially since we get to ignore cursor planes of the same size), but it's what the bspec tells us to do. So the total required data rate here looks reasonable if we trust the bspec algorithm.
As for the platform max bandwidth available, that's based on QGV data we retrieve from the pcode. I know Stanislav has been working on the SAGV stuff, so it's possible that some of our current driver calculations are off for TGL and his series will fix that (adding him to the Cc list in case he has additional insight).
Assuming the QGV calculations in the driver are already correct even without Stanislav's in-flight series, the failure here looks legitimate: our IGT test is simply asking for a configuration that exceeds the platform's capabilities. Using a lower display mode would avoid the problem. We could also add a debugfs interface that tells IGT when a commit failed because of memory bandwidth (rather than because of the feature a test is actually trying to exercise) and have IGT fall back to smaller modes when it detects that condition. Similar debugfs feedback plus IGT fallback could also be applied to other failure classes, such as watermark/DDB failures, link training failures, etc.