Bug 105312 - [CI][SHARDS] igt@kms_flip@*(flip|wf_vblank_ts_check)* - fail - Failed assertion: elapsed >= min && elapsed <= max
Summary: [CI][SHARDS] igt@kms_flip@*(flip|wf_vblank_ts_check)* - fail - Failed asserti...
Status: RESOLVED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: XOrg git
Hardware: Other All
: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-01 13:37 UTC by Martin Peres
Modified: 2019-07-31 06:46 UTC (History)
1 user (show)

See Also:
i915 platform: ALL
i915 features: display/Other


Attachments

Description Martin Peres 2018-03-01 13:37:22 UTC
(kms_flip:1511) CRITICAL: Test assertion failure function check_final_state, file kms_flip.c:1022:
(kms_flip:1511) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:1511) CRITICAL: Last errno: 25, Inappropriate ioctl for device
(kms_flip:1511) CRITICAL: dropped frames, expected 600, counted 598, encoder type 2
Subtest plain-flip-fb-recreate failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3845/shard-kbl2/igt@kms_flip@plain-flip-fb-recreate.html
Comment 1 Martin Peres 2018-04-13 10:11:39 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_8/fi-bdw-gvtdvm/igt@kms_flip@plain-flip-ts-check-interruptible.html

(kms_flip:2328) CRITICAL: Test assertion failure function check_final_state, file ../tests/kms_flip.c:1022:
(kms_flip:2328) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:2328) CRITICAL: Last errno: 4, Interrupted system call
(kms_flip:2328) CRITICAL: dropped frames, expected 601, counted 600, encoder type 2
Subtest plain-flip-ts-check-interruptible failed.
Comment 2 Martin Peres 2018-04-18 09:15:05 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4060/shard-glk3/igt@kms_flip@flip-vs-wf_vblank-interruptible.html

(kms_flip:1201) CRITICAL: Test assertion failure function check_final_state, file ../tests/kms_flip.c:1022:
(kms_flip:1201) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:1201) CRITICAL: Last errno: 4, Interrupted system call
(kms_flip:1201) CRITICAL: dropped frames, expected 201, counted 170, encoder type 2
Subtest flip-vs-wf_vblank-interruptible failed.
Comment 3 Martin Peres 2018-04-18 09:37:48 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4060/shard-glk3/igt@kms_flip@flip-vs-wf_vblank-interruptible.html

(kms_flip:1201) CRITICAL: Test assertion failure function check_final_state, file ../tests/kms_flip.c:1022:
(kms_flip:1201) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:1201) CRITICAL: Last errno: 4, Interrupted system call
(kms_flip:1201) CRITICAL: dropped frames, expected 201, counted 170, encoder type 2
Subtest flip-vs-wf_vblank-interruptible failed.
Comment 4 Martin Peres 2018-04-20 12:31:49 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_23/fi-byt-j1900/igt@kms_flip@flip-vs-wf_vblank-interruptible.html

(kms_flip:1295) CRITICAL: Test assertion failure function check_final_state, file ../tests/kms_flip.c:1022:
(kms_flip:1295) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:1295) CRITICAL: Last errno: 4, Interrupted system call
(kms_flip:1295) CRITICAL: dropped frames, expected 157, counted 140, encoder type 2
Subtest flip-vs-wf_vblank-interruptible failed.
Comment 7 Martin Peres 2018-04-25 09:49:17 UTC
Also seen on APL: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4090/shard-apl2/igt@kms_flip@flip-vs-wf_vblank-interruptible.html

(kms_flip:3138) CRITICAL: Test assertion failure function check_final_state, file ../tests/kms_flip.c:1022:
(kms_flip:3138) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:3138) CRITICAL: Last errno: 4, Interrupted system call
(kms_flip:3138) CRITICAL: dropped frames, expected 226, counted 140, encoder type 2
Subtest flip-vs-wf_vblank-interruptible failed.
Comment 8 Martin Peres 2018-06-19 21:57:23 UTC
It was super reproducible, and nothing since CI_DRM_4176_full (1 month / 651 runs ago). Closing now!
Comment 9 Francesco Balestrieri 2018-07-02 06:42:04 UTC
Closing for real
Comment 10 Martin Peres 2018-07-13 12:59:58 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4441/shard-glk6/igt@kms_flip@2x-plain-flip-ts-check.html

(kms_flip:1765) CRITICAL: Test assertion failure function check_final_state, file ../tests/kms_flip.c:1022:
(kms_flip:1765) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:1765) CRITICAL: Last errno: 25, Inappropriate ioctl for device
(kms_flip:1765) CRITICAL: dropped frames, expected 601, counted 600, encoder type 2
Subtest 2x-plain-flip-ts-check failed.
Comment 11 Lakshmi 2018-10-19 15:26:31 UTC
Update: Last seen CI_DRM_4600_full (2 months, 2 weeks / 1045 runs ago).
The frequency of this bug is between 1 week to 2 months.
Comment 12 Martin Peres 2018-11-01 16:44:27 UTC
(In reply to Lakshmi from comment #11)
> Update: Last seen CI_DRM_4600_full (2 months, 2 weeks / 1045 runs ago).
> The frequency of this bug is between 1 week to 2 months.

The average was once every 4 days over a period of 5 months. It's been 3 months without an occurrence, let's close it!
Comment 13 Martin Peres 2019-01-28 15:22:11 UTC
It's back!

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4792/shard-skl1/igt@kms_flip@plain-flip-fb-recreate.html

Starting subtest: plain-flip-fb-recreate
(kms_flip:1196) CRITICAL: Test assertion failure function check_final_state, file ../tests/kms_flip.c:1025:
(kms_flip:1196) CRITICAL: Failed assertion: elapsed >= min && elapsed <= max
(kms_flip:1196) CRITICAL: Last errno: 25, Inappropriate ioctl for device
(kms_flip:1196) CRITICAL: dropped frames, expected 600, counted 599, encoder type 2
Comment 14 Martin Peres 2019-04-23 11:50:39 UTC
Also seen on shards. Bumping the priority!
Comment 15 Mika Kahola 2019-05-31 10:10:43 UTC
It seems that we are still seeing these issues every now and then.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6162/shard-skl10/igt@kms_flip@plain-flip-fb-recreate-interruptible.html
Comment 16 Arek Hiler 2019-06-17 14:01:37 UTC
This bug has extremely low rate of reproduction - we see it about once every
quarter. From the historical diggings it seems that in the last year or so we
missed a single frame each failure.

This family of failing tests makes sure that we managed to squeeze given number
of frames in a given interval of time.

The test does vblank interval measurement/calibration in calibrate_ts(). Some
statistical checks on the timing samples are performed to make sure that 99.7%
samples withing 0.5% of the means and that 94% of samples are within 0.5% of the
expected value.

Every so often we will an event where we hit more intervals on the bad side and
the rounding done on the timeouts (we don't have any leeway here) won't be in
our favour.

Statistical character of this test and non-deterministic timings means we will
fail from time to time.

Impact on the users: low (a frame a quater on the whole farm in delays), as long
as the reproduction rate and number of missed frames stays the same.
Comment 17 Arek Hiler 2019-06-18 06:39:07 UTC
Having the history a bit de-cluttered by ICLs being seemingly fixed we can take a look
at the new landscape of failures. There are two machines that are the most consistent
with failing (every [[idle run]]).


[[fi-hsw-4770r]]:
connector 76: type DP-1, status: connected
        physical dimensions: 0x0mm
        subpixel order: Unknown
        CEA rev: 0
        DPCD rev: 11
        audio support: no
        DP branch device present: yes
                Type: VGA
                ID: DpVga
                HW: 5.6
                SW: 1.48
        modes:
                "1024x768": 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
                "800x600": 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5
                "800x600": 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5
                "848x480": 60 33750 848 864 976 1088 480 486 494 517 0x40 0x5
                "640x480": 60 25175 640 656 752 800 480 490 492 525 0x40 0xa

This one seems to have VGA dongle connected but it does not have native VGA.
Because of that the test fails to realize its VGA and does not ignore it.
The dongle is the most naive one possible, just a few pins connected through
resistors, so the EDID that DRM sees is faked by the DpVga HW to reflect,
the default VGA modes. There is nothing on the other side when using i2c
directly. The test has to be fixed so that it is aware of DpVga.


[[fi-skl-lmem]]:
connector 86: type DP-1, status: connected
        physical dimensions: 0x0mm
        subpixel order: Unknown
        CEA rev: 0
        DPCD rev: 12
        audio support: no
        DP branch device present: yes
                Type: HDMI
                ID: 175IB0
                HW: 1.0
                SW: 7.32
                Max TMDS clock: 600000 kHz
                Max bpc: 12
        modes:
                "1024x768": 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
                "800x600": 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5
                "800x600": 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5
                "848x480": 60 33750 848 864 976 1088 480 486 494 517 0x40 0x5
                "640x480": 60 25175 640 656 752 800 480 490 492 525 0x40 0xa

This one has a HDMI dummy connected to non-native (LSPcon) HDMI on the board.
The suspicious thing here are the modes - they are the default VGA ones.
The HDMI dummy may be faulty one and fails to talk i2c and the LSPcon HW is
faking an EDID. I would advice replacing the dongle.


[idle run]:     https://intel-gfx-ci.01.org/#idle-runs
[fi-hsw-4770r]: https://intel-gfx-ci.01.org/hardware.html#fi-hsw-4770r
[fi-skl-lmem]:  https://intel-gfx-ci.01.org/hardware.html#fi-skl-lmem
Comment 18 Arek Hiler 2019-06-18 06:41:47 UTC
Please disregard the last comment (#17). I have posted it in a wrong bug.
Comment 19 Lakshmi 2019-07-31 06:46:36 UTC
The reproduction rate of this issue is once in 69.2 CI_DRM runs. Last seen on CI_DRM is CI_DRM_5609_full and current run is CI_DRM_6583.

Closing this issue as WORKSFORME.
Comment 20 CI Bug Log 2019-07-31 06:46:47 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.