Bug 108303 - [CI][DRMTIP] igt@kms_flip_tiling@flip-(y-tiled|changes-tiling(-yf?)?) - fail - Failed assertion: !mismatch
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 107831
Depends on:
Blocks:
 
Reported: 2018-10-09 14:44 UTC by Martin Peres
Modified: 2019-11-12 07:09 UTC

See Also:
i915 platform: BXT, BYT, CFL, GLK, ICL, KBL, SKL, TGL
i915 features: display/Other


Comment 1 Martin Peres 2018-10-16 08:04:55 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_128/fi-icl-u2/igt@kms_flip_tiling@flip-y-tiled.html

Starting subtest: flip-Y-tiled
(kms_flip_tiling:1143) igt_debugfs-CRITICAL: Test assertion failure function igt_assert_crc_equal, file ../lib/igt_debugfs.c:392:
(kms_flip_tiling:1143) igt_debugfs-CRITICAL: Failed assertion: !mismatch
Subtest flip-Y-tiled failed.
Comment 2 Martin Peres 2018-11-15 14:10:07 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_142/fi-bxt-dsi/igt@kms_flip_tiling@flip-changes-tiling-y.html

Starting subtest: flip-changes-tiling-Y
(kms_flip_tiling:1504) igt_debugfs-CRITICAL: Test assertion failure function igt_assert_crc_equal, file ../lib/igt_debugfs.c:419:
(kms_flip_tiling:1504) igt_debugfs-CRITICAL: Failed assertion: !mismatch
Subtest flip-changes-tiling-Y failed.
Comment 3 Martin Peres 2018-11-16 15:45:29 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5145/shard-skl6/igt@kms_flip_tiling@flip-changes-tiling.html

Starting subtest: flip-changes-tiling
(kms_flip_tiling:1260) igt_debugfs-CRITICAL: Test assertion failure function igt_assert_crc_equal, file ../lib/igt_debugfs.c:419:
(kms_flip_tiling:1260) igt_debugfs-CRITICAL: Failed assertion: !mismatch
Subtest flip-changes-tiling failed.
Comment 4 Jani Saarinen 2018-12-11 12:42:59 UTC
According to JP, this might be a test issue. Could someone confirm?
Comment 5 Juha-Pekka Heikkilä 2018-12-17 19:17:23 UTC
(In reply to Jani Saarinen from comment #4)
> According to JP, this might be a test issue. Could someone confirm?

I looked at this test briefly and noticed that it was written against an old version of IGT; many things are done differently nowadays. I was pointing out to Jani earlier that IGT has evolved quite a lot in four years.

That said, if this test suddenly started failing on virtually every gen>9 platform on 2018-10-09, maybe a kernel commit from around that date that touches something SKL_* is responsible?
Comment 6 CI Bug Log 2019-01-29 16:32:56 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SKL BXT KBL WHL ICL: igt@kms_flip_tiling@flip-(y-tiled|changes-tiling(-yf?)?) - fail - Failed assertion: !mismatch -}
{+ SKL BXT KBL WHL ICL: igt@kms_flip_tiling@flip-(y-tiled|changes-tiling(-yf?)?) - fail - Failed assertion: !mismatch +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_189/fi-glk-j4005/igt@kms_flip_tiling@flip-y-tiled.html
Comment 7 CI Bug Log 2019-01-29 16:34:34 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SKL BXT KBL WHL ICL: igt@kms_flip_tiling@flip-(y-tiled|changes-tiling(-yf?)?) - fail - Failed assertion: !mismatch -}
{+ SKL BXT KBL GLK WHL ICL: igt@kms_flip_tiling@flip-(y-tiled|changes-tiling(-yf?)?) - fail - Failed assertion: !mismatch +}

 No new failures caught with the new filter
Comment 8 CI Bug Log 2019-02-14 17:38:09 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SKL BXT KBL GLK WHL ICL: igt@kms_flip_tiling@flip-(y-tiled|changes-tiling(-yf?)?) - fail - Failed assertion: !mismatch -}
{+ SKL BXT KBL GLK WHL ICL: igt@kms_flip_tiling@flip-([xy]-tiled|changes-tiling(-yf?)?) - fail - Failed assertion: !mismatch +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_218/fi-icl-u2/igt@kms_flip_tiling@flip-x-tiled.html
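
To illustrate the effect of the filter change above, here is a minimal Python sketch. It assumes the filter text is an ordinary regular expression that must match the whole test name; the thread does not say how CI Bug Log actually applies it. The subtest names are the ones seen in this bug, plus `flip-changes-tiling-yf`, which is only implied by the pattern.

```python
import re

# Patterns copied from the old and new filter lines (without the "igt@" prefix).
# Using re.fullmatch on the test name is an assumption made for this sketch.
OLD_FILTER = re.compile(r"kms_flip_tiling@flip-(y-tiled|changes-tiling(-yf?)?)")
NEW_FILTER = re.compile(r"kms_flip_tiling@flip-([xy]-tiled|changes-tiling(-yf?)?)")

SUBTESTS = [
    "kms_flip_tiling@flip-x-tiled",
    "kms_flip_tiling@flip-y-tiled",
    "kms_flip_tiling@flip-changes-tiling",
    "kms_flip_tiling@flip-changes-tiling-y",
    "kms_flip_tiling@flip-changes-tiling-yf",  # implied by the pattern only
]


def matched(pattern):
    """Return the subtests whose full name matches the given filter."""
    return [name for name in SUBTESTS if pattern.fullmatch(name)]


# Subtests that only the new filter catches.
newly_caught = sorted(set(matched(NEW_FILTER)) - set(matched(OLD_FILTER)))
print(newly_caught)
```

Under that assumption, the only subtest the new filter adds is `flip-x-tiled`, which agrees with the new failure listed above.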
Comment 9 Neel 2019-04-01 21:00:04 UTC
(In reply to Juha-Pekka Heikkilä from comment #5)
> (In reply to Jani Saarinen from comment #4)
> > According to JP, this might be a test issue. Could someone confirm?
> 
> I looked at this test briefly and noticed that it was written against an old
> version of IGT; many things are done differently nowadays. I was pointing out
> to Jani earlier that IGT has evolved quite a lot in four years.
> 
> That said, if this test suddenly started failing on virtually every gen>9
> platform on 2018-10-09, maybe a kernel commit from around that date that
> touches something SKL_* is responsible?

JP, what is your IRC nick?
Comment 10 Jani Saarinen 2019-04-02 06:16:25 UTC
Hi, that is Juha reversed: ahuj.
Comment 11 Lakshmi 2019-04-25 13:19:04 UTC
Bumping the priority to high as we see the issue on shards as well.
Comment 12 Swati Sharma 2019-04-26 12:12:54 UTC
Ville's patch series https://patchwork.freedesktop.org/series/59419/ fixes this issue. Tested on ICL.
Comment 13 Jani Saarinen 2019-05-06 08:25:56 UTC
This series was merged:

drm/i915: Enable pipe HDR mode on ICL if only HDR planes are used

Did this fix the issue?

author	Ville Syrjälä <ville.syrjala@linux.intel.com>	2019-04-12 21:30:09 +0300
committer	Ville Syrjälä <ville.syrjala@linux.intel.com>	2019-04-30 22:14:43 +0300
commit	09b25812db10fcbd7937c1b7ca279c5c0d77ba9d
tree	98ec25ffe9c765c844753a9dec4956a9d1604f56
parent	9b11215e40c5a0aefba9b026543fb0799f61bf6f
Comment 14 Swati Sharma 2019-05-06 11:36:23 UTC
(In reply to Jani Saarinen from comment #13)
> This series was merged:
> 
> drm/i915: Enable pipe HDR mode on ICL if only HDR planes are used
> 
> Did this fix the issue?
> 
> author	Ville Syrjälä <ville.syrjala@linux.intel.com>	2019-04-12 21:30:09 +0300
> committer	Ville Syrjälä <ville.syrjala@linux.intel.com>	2019-04-30 22:14:43 +0300
> commit	09b25812db10fcbd7937c1b7ca279c5c0d77ba9d
> tree	98ec25ffe9c765c844753a9dec4956a9d1604f56
> parent	9b11215e40c5a0aefba9b026543fb0799f61bf6f

Tested with kernel 5.1.0-rc7+ including Ville's fix; all kms_flip_tiling subtests pass.
Comment 15 Jani Saarinen 2019-05-07 12:54:53 UTC
So, can we remove ICL from the list of platforms?
Comment 16 Lakshmi 2019-05-08 06:30:25 UTC
(In reply to Jani Saarinen from comment #15)
> So, can we remove ICL from the list of platforms?

On ICL, this failure occurs on average once every 4.4 drmtip runs. It was last seen on drmtip_271, so we can remove ICL after drmtip_315 if no new failures are seen on ICL.
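
The drmtip_315 figure is consistent with waiting about ten times the average failure interval after the last sighting (271 + 10 × 4.4 = 315). The factor of ten is inferred from the numbers in this comment and is not stated anywhere in the thread; the helper below is a hypothetical sketch of that arithmetic only.

```python
def removal_threshold(last_seen_run, mean_runs_between_failures, factor=10):
    """Run number after which a platform could be dropped from the filter.

    `factor` is an assumed multiplier; the thread does not name the rule used.
    """
    return last_seen_run + round(factor * mean_runs_between_failures)


print(removal_threshold(271, 4.4))  # 315
```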
Comment 17 Jani Saarinen 2019-05-08 06:34:22 UTC
Lowering priority ok?
Comment 18 Lakshmi 2019-05-13 15:02:07 UTC
(In reply to Jani Saarinen from comment #17)
> Lowering priority ok?
Still occurring on ICL:

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4983/shard-iclb7/igt@kms_flip_tiling@flip-x-tiled.html
Comment 19 Lakshmi 2019-05-31 11:30:17 UTC
This issue is still occurring on SKL and ICL (shards). Over the last month, the failure has reproduced about once in 17 runs (shards).

This test flips from one tiling format to another (e.g. X to Y and vice versa). A failure could mean that a corrupted image is displayed.

The impact of this bug is minimal, since there are few customer use cases for this outside of IGT.
Comment 20 Matt Roper 2019-09-04 20:33:10 UTC
*** Bug 107831 has been marked as a duplicate of this bug. ***
Comment 21 Matt Roper 2019-09-04 20:57:29 UTC
igt@kms_flip_tiling@flip-changes-tiling is also failing on BYT; that was previously tracked separately in bug 107831, but based on the output it looks like it's probably the same underlying issue; adding BYT to the platform list here and marking the other one as a duplicate.

As Lakshmi mentioned, this test is trying to test pageflip transitions from one tiling format to another during legacy pageflips.  It does so by allocating two framebuffers, fb0 and fb1.  The test first does a modeset to place fb1 on the screen and takes a reference CRC.  It then does a modeset to put fb0 on the screen (which contains a different image than fb1, so the CRC is unlikely to match at this point).  Finally, it performs a legacy pageflip to switch back to fb1 again and takes another CRC.  This CRC should match the original reference CRC, but the test is failing because it does not.
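
The sequence described above can be sketched as a small self-contained Python model. `FakePipe` and `run_flip_check` are hypothetical names invented for this illustration; this is not IGT or DRM code, and `zlib.crc32` merely stands in for the hardware pipe CRC.

```python
import zlib


class FakePipe:
    """Hypothetical stand-in for a display pipe; not IGT or DRM API."""

    def __init__(self):
        self.scanout = None

    def modeset(self, fb):
        self.scanout = fb

    def page_flip(self, fb):
        # Idealised: the flip has completed by the time the CRC is read.
        self.scanout = fb

    def crc(self):
        return zlib.crc32(self.scanout)


def run_flip_check(pipe, fb0, fb1):
    """Return True if the CRC after flipping back to fb1 matches the reference."""
    pipe.modeset(fb1)
    reference = pipe.crc()   # reference CRC with fb1 on screen
    pipe.modeset(fb0)        # different image, CRC expected to differ
    pipe.page_flip(fb1)      # legacy pageflip back to fb1
    return pipe.crc() == reference


fb0 = bytes([0x00] * 64)     # made-up framebuffer contents
fb1 = bytes([0xFF] * 64)
print(run_flip_check(FakePipe(), fb0, fb1))  # True
```

In this idealised model the check always passes; the CI failures correspond to the final comparison returning False on real hardware.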

One interesting thing to note is that this test uses oversized framebuffers and primary planes (e.g., 2048 wide when running on a 1366 width display) to make it easy to find framebuffer sizes with compatible strides.  That should be perfectly legal and not cause problems (the primary plane will just be clipped back to the size of the pipe), but since it is an unusual situation, that might be a good place to start investigating to see whether we're doing anything wrong in the driver's clipping that might lead to a CRC mismatch.
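
As a rough illustration of the clipping mentioned above, the following Python sketch intersects a plane rectangle with the pipe. The function name is hypothetical and this is not the driver's clipping code; the 2048 and 1366 widths come from the paragraph above, while the heights (1024 and 768) are assumed purely for the example.

```python
def clip_to_pipe(x, y, w, h, pipe_w, pipe_h):
    """Intersect a plane rectangle with the pipe; returns (x, y, w, h)."""
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, pipe_w), min(y + h, pipe_h)
    # A plane entirely off-screen collapses to zero width or height.
    return x0, y0, max(x1 - x0, 0), max(y1 - y0, 0)


# 2048-wide plane on a 1366-wide pipe; heights are assumed values.
print(clip_to_pipe(0, 0, 2048, 1024, 1366, 768))  # (0, 0, 1366, 768)
```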

Another thing to double-check, if someone has a platform to test this on directly, is that we truly are waiting for flip completion properly and not just assuming that fb1 will be back on the screen at the next vblank after we submit the flip.  The code for that looks correct to me (we don't ask for vblank events, so pageflip completions should be the only thing we can read from the DRM filehandle), but if there's something I'm overlooking and we're actually taking the CRC after a vblank event rather than a pageflip completion event, then we could run into problems if the pageflip were submitted during the vblank evasion window (i.e., the driver would wait for a vblank to complete before actually updating the plane registers, and the new framebuffer wouldn't actually be visible on screen until the next vblank after that).  Again, the code looks correct to me, but it would be good for someone with actual hardware to confirm that this definitely isn't the source of the problem.
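
The timing hazard described above can be modelled with a toy Python simulation. `LatchedPipe` and `crc_matches_after` are hypothetical names, and the model assumes, as the paragraph does, that a flip submitted inside the evasion window is written to the plane registers only after the next vblank and becomes visible at the vblank after that. It does not reflect actual i915 code.

```python
import zlib


class LatchedPipe:
    """Toy pipe where a new framebuffer only becomes visible at a vblank."""

    def __init__(self, fb):
        self.visible = fb
        self.armed = None      # in plane registers, latches at next vblank
        self.deferred = None   # held back by vblank evasion

    def page_flip(self, fb, in_evasion_window):
        if in_evasion_window:
            self.deferred = fb
        else:
            self.armed = fb

    def vblank(self):
        if self.armed is not None:
            self.visible, self.armed = self.armed, None
        if self.deferred is not None:
            # Registers are written only after this vblank has passed.
            self.armed, self.deferred = self.deferred, None

    def crc(self):
        return zlib.crc32(self.visible)


def crc_matches_after(vblanks, in_evasion_window):
    """Flip fb0 -> fb1, wait `vblanks` vblanks, compare with the fb1 CRC."""
    fb0, fb1 = b"\x00" * 64, b"\xff" * 64
    reference = zlib.crc32(fb1)
    pipe = LatchedPipe(fb0)
    pipe.page_flip(fb1, in_evasion_window)
    for _ in range(vblanks):
        pipe.vblank()
    return pipe.crc() == reference


print(crc_matches_after(1, in_evasion_window=False))  # True
print(crc_matches_after(1, in_evasion_window=True))   # False
print(crc_matches_after(2, in_evasion_window=True))   # True
```

In this model, sampling the CRC one vblank after a flip submitted in the evasion window still sees fb0, which is the kind of mismatch that waiting for the pageflip completion event is meant to rule out.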

It would also probably be good to extend this test with atomic variants of these subtests.  It would be a useful data point to know whether we get the same kind of failures for flips submitted via the atomic API as through the legacy pageflip API.

As Lakshmi mentioned, the impact of this failure should be relatively low on real-world use cases.  This test is more focused on ensuring we're covering the corner cases of the ABI rather than modeling a situation that comes up frequently in real-world usage.  That's especially true given that this test is still using the legacy pageflip ioctl rather than performing atomic updates like a lot of modern userspace would.
Comment 22 CI Bug Log 2019-11-12 07:09:44 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* TGL: igt@kms_flip_tiling@flip-x-tiled - fail - Failed assertion: !mismatch || igt_skip_crc_compare
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5268/shard-tglb5/igt@kms_flip_tiling@flip-x-tiled.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7300/shard-tglb5/igt@kms_flip_tiling@flip-x-tiled.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5305/shard-tglb7/igt@kms_flip_tiling@flip-x-tiled.html

