Bug 102373

Summary: [CI][SNB] *ERROR* CPU pipe [A|B] FIFO underrun
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Maarten Lankhorst <bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs, mahesh1.kumar, pedrib
Version: XOrg git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: SNB
i915 features: display/Other

Description Martin Peres 2017-08-23 13:22:45 UTC
On Sandybridge, we see CPU pipe B FIFO underruns when running the following tests:
 - igt@kms_flip@plain-flip-ts-check-interruptible
 - igt@kms_flip@dpms-vs-vblank-race
 - igt@kms_cursor_crc@cursor-128x128-random

[   73.318726] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[   73.318873] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder B
[   73.318912] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder B FIFO underrun

Full logs:
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_2994/shard-snb6/igt@kms_flip@plain-flip-ts-check-interruptible.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_2992/shard-snb1/igt@kms_flip@dpms-vs-vblank-race.html
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_2994/shard-snb5/igt@kms_cursor_crc@cursor-128x128-random.html
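
For anyone triaging similar logs, the three dmesg lines quoted above can be picked out mechanically. The following is only a minimal sketch: the regular expression is written against the exact message format shown in this report, and the names UNDERRUN_RE, find_underruns and SAMPLE are hypothetical helpers, not part of IGT or the CI tooling.

```python
import re

# Matches i915 *ERROR* lines that mention a FIFO underrun, as quoted in this
# report. The pattern is an assumption based on those three lines only.
UNDERRUN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:(?P<func>\w+) \[i915\]\] \*ERROR\* "
    r"(?P<msg>.*(?:FIFO|fifo) underrun.*)"
)


def find_underruns(log_text):
    """Return a list of (timestamp, function, message) for each underrun line."""
    hits = []
    for line in log_text.splitlines():
        m = UNDERRUN_RE.search(line)
        if m:
            hits.append((float(m.group("ts")), m.group("func"), m.group("msg")))
    return hits


# The log excerpt from the description, used as sample input.
SAMPLE = """\
[   73.318726] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[   73.318873] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder B
[   73.318912] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder B FIFO underrun
"""

if __name__ == "__main__":
    for ts, func, msg in find_underruns(SAMPLE):
        print(f"{ts:.6f} {func}: {msg}")
```

Messages with a different prefix or wording would need the pattern adjusted.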
Comment 2 Marta Löfstedt 2017-09-19 13:33:41 UTC
*** Bug 102371 has been marked as a duplicate of this bug. ***
Comment 3 Jani Saarinen 2017-09-27 12:31:51 UTC
According to Maarten, the same fix should help here too:
https://bugs.freedesktop.org/show_bug.cgi?id=102675
Comment 4 Maarten Lankhorst 2017-10-04 11:22:15 UTC
Is this fixed by commit 3cf50c63a76177e0bbe0e46e1abe4eb263128ba4 ?
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Sep 19 14:14:18 2017 +0200

    drm/i915: Unset legacy_cursor_update early in intel_atomic_commit, v3.
Comment 5 Marta Löfstedt 2017-10-05 10:08:02 UTC
Maarten's fix doesn't cover the full FIFO underrun issue on the SNB shards.
See for example:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3179/shard-snb4/igt@kms_properties@crtc-properties-legacy.html
Comment 6 Marta Löfstedt 2017-10-09 11:53:52 UTC
Note:
for these tests:
igt@kms_busy@extended-modeset-hang-newfb-render-A
igt@kms_busy@extended-modeset-hang-newfb-render-B
igt@kms_busy@extended-modeset-hang-newfb-with-reset-render-A
igt@kms_busy@extended-modeset-hang-newfb-with-reset-render-B

the reproduction rate of FIFO underruns appears to be 100% on the SNB shards.
Comment 7 Marta Löfstedt 2017-10-16 12:34:35 UTC
There is also a trend, starting at CI_DRM_3227, where:
igt@kms_flip@vblank-vs-hang-interruptible
igt@kms_mmio_vs_cs_flip@setplane_vs_cs_flip

have failed every time:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3227/shard-snb2/igt@kms_flip@vblank-vs-hang-interruptible.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3227/shard-snb4/igt@kms_mmio_vs_cs_flip@setplane_vs_cs_flip.html
Comment 8 Marta Löfstedt 2017-10-17 12:40:16 UTC
(In reply to Marta Löfstedt from comment #7)
> There is also a trend that started at CI_DRM_3227 from that:
> igt@kms_flip@vblank-vs-hang-interruptible
> igt@kms_mmio_vs_cs_flip@setplane_vs_cs_flip
> 
> have failed all times:
> 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3227/shard-snb2/igt@kms_flip@vblank-vs-hang-interruptible.html
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3227/shard-snb4/igt@kms_mmio_vs_cs_flip@setplane_vs_cs_flip.html

This trend appears to have stopped as of CI_DRM_3239.
Comment 9 Marta Löfstedt 2017-10-17 13:14:25 UTC
A new trend of 100% reproducibility on:
igt@kms_busy@extended-pageflip-hang-oldfb-render-B and 
igt@kms_cursor_crc@cursor-128x128-dpms

appears to have started at CI_DRM_3242:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3242/shard-snb5/igt@kms_busy@extended-pageflip-hang-oldfb-render-B.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3242/shard-snb4/igt@kms_cursor_crc@cursor-128x128-dpms.html

and has consistently hit FIFO underruns through at least CI_DRM_3249.

I just noticed that igt@kms_busy@extended-pageflip-hang-oldfb-render-B and igt@kms_cursor_crc@cursor-128x128-dpms are scheduled in the same shard for all those runs. The assignment of tests to shards is only re-randomized when IGT is rebuilt.

Note: for each CI_DRM_NNNN there is a shards.html, where you can easily see which tests are in which shards, for example: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3246/shards.html

So there is potentially some context leak that makes these FIFO underruns more likely when the tests are run in the same shard.
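
The co-scheduling observation could be checked across runs with something like the sketch below. It is only an illustration under assumptions: the run-to-shard-to-tests mapping is a hypothetical in-memory layout (shards.html is not parsed here), and the run, shard and test names in EXAMPLE_RUNS are placeholders, not real CI data.

```python
from itertools import combinations


def always_coscheduled(runs, failing):
    """Return the pairs of failing tests that share a shard in every run.

    runs:    {run_id: {shard_id: set of test names}}  (hypothetical layout)
    failing: iterable of test names that hit the underrun
    """
    failing = set(failing)
    # Start from every possible pair and keep only those seen together
    # in each run. With no runs at all, every pair is returned.
    pairs = set(combinations(sorted(failing), 2))
    for shards in runs.values():
        together = set()
        for tests in shards.values():
            present = sorted(failing & set(tests))
            together.update(combinations(present, 2))
        pairs &= together
    return pairs


# Placeholder data for illustration only.
EXAMPLE_RUNS = {
    "run-1": {1: {"test-a", "test-b"}, 2: {"test-c"}},
    "run-2": {1: {"test-c"}, 2: {"test-a", "test-b"}},
}

if __name__ == "__main__":
    print(always_coscheduled(EXAMPLE_RUNS, ["test-a", "test-b", "test-c"]))
```

A pair that survives every run is only a hint of interaction between the tests; it does not by itself show a context leak.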
Comment 10 Maarten Lankhorst 2017-10-18 09:33:32 UTC
https://patchwork.freedesktop.org/series/32197/
Comment 11 Maarten Lankhorst 2017-10-25 13:15:44 UTC
*** Bug 103376 has been marked as a duplicate of this bug. ***
Comment 12 Jani Saarinen 2017-10-27 05:38:04 UTC
Both patches are now rb'd.
Comment 13 Maarten Lankhorst 2017-10-27 07:24:14 UTC
commit b6b178a77210055b153dbc175e4468bd3c7122df (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Thu Oct 19 17:13:41 2017 +0200

    drm/i915: Calculate ironlake intermediate watermarks correctly, v2.

commit 28283f4f359cd7cfa9e65457bb98c507a2cd0cd0
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Thu Oct 19 17:13:40 2017 +0200

    drm/i915: Do not rely on wm preservation for ILK watermarks
Comment 14 Marta Löfstedt 2017-10-27 12:17:57 UTC
The patches are integrated in CI_DRM_3289.

This will be interesting to follow. For as long as I have been filing cibuglog, I have averaged about one new occurrence of FIFO underrun on SNB per day, so I will have to wait a while before closing.

Anyway:
igt@kms_busy@extended-modeset-hang-newfb-render-A
igt@kms_busy@extended-modeset-hang-newfb-render-B
igt@kms_busy@extended-modeset-hang-newfb-with-reset-render-A
igt@kms_busy@extended-modeset-hang-newfb-with-reset-render-B

are now green on CI_DRM_3289.
Comment 15 Marta Löfstedt 2017-11-01 07:35:45 UTC
There have been no FIFO underruns on SNB from CI_DRM_3289 through CI_DRM_3304, so I am closing this bug. Thanks for the fix, Maarten.
