Summary: | [CI][DRMTIP] igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Last errno: 28, No space left on device | ||
---|---|---|---|
Product: | DRI | Reporter: | Martin Peres <martin.peres> |
Component: | IGT | Assignee: | Stanislav Lisovskiy <stanislav.lisovskiy> |
Status: | CLOSED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | intel-gfx-bugs |
Version: | XOrg git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | HSW, ICL, KBL, SKL | i915 features: | display/Other |
Description
Martin Peres
2019-01-04 14:11:59 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL: igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Last errno: 28, No space left on device
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_11112/shard-kbl7/igt@kms_atomic_transition@plane-all-transition-nonblocking.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_174/fi-skl-6700k2/igt@kms_atomic_transition@plane-all-transition-nonblocking.html

A CI Bug Log filter associated to this bug has been updated:

{- SKL: igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Last errno: 28, No space left on device -}
{+ SKL: igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Last errno: 28, No space left on device +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_177/fi-skl-6700k2/igt@kms_atomic_transition@plane-use-after-nonblocking-unbind-fencing.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_191/fi-skl-6700k2/igt@kms_atomic_transition@plane-all-modeset-transition-fencing.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_194/fi-skl-6700k2/igt@kms_atomic_transition@plane-all-modeset-transition-fencing.html

After some struggles with kms_atomic_transition, I have a feeling I know what this can be related to.

All the recent issues seem to be either skips or dmesg-warns due to FIFO underruns:

http://gfx-ci.fi.intel.com/cibuglog-ng/issue/1015/history

For example:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_209/fi-icl-u3/igt@kms_atomic_transition@plane-all-modeset-transition-fencing.html

It looks like the filter is wrong, since in most of those there was no -ENOSPC error, for example:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_209/fi-icl-u2/igt@kms_atomic_transition@plane-all-transition-nonblocking.html

The actual reason for the "No space left on device" issue is in IGT itself.
I've noticed there are "EDID is invalid" messages in the attached dmesg, always when the issue happens:

i915 0000:00:02.0: DP-2: EDID is invalid:
<4>[ 46.742040] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[ 46.742041] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[ 46.742043] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[ 46.742044] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[ 46.742045] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[ 46.742046] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[ 46.742047] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<4>[ 46.742049] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<7>[ 46.742399] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:107:DP-2] probed modes :
<7>[ 46.742403] [drm:drm_mode_debug_printmodeline] Modeline 165:"1024x768" 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa

This causes drm to fall back to the fixed 1024x768 mode, as it could not read the display EDID (see intel_dp.c and drm_edid.c). This resolution is lower than the typical resolution used in the test case.

Then there is the problem that run_transition_test doesn't clean up the plane size property which it sets during the test run. In the next test case we create a framebuffer sized to the current output mode (mode->hdisplay x mode->vdisplay). Usually it is the same size; however, due to this EDID it happens to be smaller (1024x768). So whenever proper cleanup wasn't done after the previous run_transition_test call, we attempt to set a plane size bigger than the framebuffer size, which causes -ENOSPC to be returned from drm_atomic_plane_check.

To fix that, we need to clean up all the plane properties associated with this output before we proceed with the next test case; otherwise IGT seems to commit them in the first commit, when we associate the output with the pipe.
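The failure mode described above can be modeled in a few lines of C. This is only a simplified sketch, not the kernel or IGT code: the sketch_* structs and functions are hypothetical stand-ins, the 1920x1080 size left over from the previous test case is an assumed example value, and the real check in drm_atomic_plane_check operates on 16.16 fixed-point source coordinates rather than plain pixels.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's or IGT's real structs. */
struct sketch_fb {
	uint32_t width;
	uint32_t height;
};

struct sketch_plane {
	uint32_t src_x, src_y;	/* source rectangle origin, in pixels */
	uint32_t src_w, src_h;	/* source rectangle size, in pixels */
};

/*
 * Simplified model of the bounds check: the plane's source rectangle
 * must lie entirely inside the framebuffer, otherwise -ENOSPC.
 */
static int sketch_plane_check(const struct sketch_plane *plane,
			      const struct sketch_fb *fb)
{
	assert(plane && fb);

	if (plane->src_w > fb->width ||
	    plane->src_x > fb->width - plane->src_w ||
	    plane->src_h > fb->height ||
	    plane->src_y > fb->height - plane->src_h)
		return -ENOSPC;

	return 0;
}

/* Hypothetical cleanup step: drop the size set by the previous test case. */
static void sketch_plane_reset(struct sketch_plane *plane)
{
	plane->src_x = plane->src_y = 0;
	plane->src_w = plane->src_h = 0;
}

/*
 * Replays the sequence from the comment: the previous test case left the
 * plane sized for a larger mode (1920x1080 is an assumed example), and the
 * next one only has a 1024x768 framebuffer because of the EDID fallback.
 * Returns the result of the check for the next commit.
 */
int sketch_second_commit(int do_cleanup)
{
	struct sketch_plane plane = {
		.src_x = 0, .src_y = 0, .src_w = 1920, .src_h = 1080,
	};
	struct sketch_fb small_fb = { .width = 1024, .height = 768 };

	if (do_cleanup) {
		sketch_plane_reset(&plane);
		/* The new test case sizes the plane to its own framebuffer. */
		plane.src_w = small_fb.width;
		plane.src_h = small_fb.height;
	}

	return sketch_plane_check(&plane, &small_fb);
}
```

In this model, sketch_second_commit(0) leaves the stale size in place and the check rejects it with -ENOSPC, while sketch_second_commit(1) resets the plane first and the check passes.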
I've simulated this either by making drm intentionally return a wrong EDID, so that the fixed mode is used, or by simply decreasing mode->hdisplay/vdisplay the second time the function is called. This always results in ENOSPC. A simple cure is to add igt_plane_set_size to the cleanup, so that the plane size is reduced to 0 or the plane is disabled.

So this issue is actually an IGT issue. Another problem is the "Invalid EDID" being read from the display; however, drm seems to act as expected here.

A CI Bug Log filter associated to this bug has been updated:

{- SKL: igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Last errno: 28, No space left on device -}
{+ All machines: igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Failed assertion: ret == 0, Last errno: 28, No space left on device +}

No new failures caught with the new filter

A CI Bug Log filter associated to this bug has been updated:

{- All machines: igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Failed assertion: ret == 0, Last errno: 28, No space left on device -}
{+ All machines: igt@kms_atomic_transition@plane-all-modeset-transition* - fail - Failed assertion: ret == 0, Last errno: 28, No space left on device +}

No new failures caught with the new filter

(In reply to Stanislav Lisovskiy from comment #4)
> All the recent issues seem to be either skips or dmesg-warn due to FIFO
> underrun:
>
> http://gfx-ci.fi.intel.com/cibuglog-ng/issue/1015/history
>
> For example:
>
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_209/fi-icl-u3/
> igt@kms_atomic_transition@plane-all-modeset-transition-fencing.html
>
> Looks like it is a wrong filter or something as for example in most of
> those, there were no -ENOSPC error:
>
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_209/fi-icl-u2/
> igt@kms_atomic_transition@plane-all-transition-nonblocking.html

Sorry, the filter was definitely not filed properly...
It was matching every failure happening on any of the igt@kms_atomic_transition@plane-all-modeset-transition* tests... Sorry about that!

commit 91908d36d0d5c90eea86e29736d2748d5ec55335
Author: Stanislav Lisovskiy <stanislav.lisovskiy@gmail.com>
Date: Tue Feb 19 11:38:00 2019 +0200

igt/tests: Fix error checking in kms_atomic_transition

(In reply to Petri Latvala from comment #9)
> commit 91908d36d0d5c90eea86e29736d2748d5ec55335
> Author: Stanislav Lisovskiy <stanislav.lisovskiy@gmail.com>
> Date: Tue Feb 19 11:38:00 2019 +0200
>
> igt/tests: Fix error checking in kms_atomic_transition

Still happening pretty much every single drmtip run:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_236/fi-skl-6700k2/igt@kms_atomic_transition@plane-all-transition-nonblocking.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_236/fi-hsw-4770r/igt@kms_atomic_transition@plane-all-transition-nonblocking.html

This happens only on machines which have this "i915 0000:00:02.0: HDMI-A-1: EDID is invalid:" message, which makes it fall back to the 1024x768 mode (which means an fb smaller than usual). Probably there is still a bug in the tests with the same root cause (-ENOSPC is returned only when the plane size happens to be larger than the fb).

(In reply to Stanislav Lisovskiy from comment #11)
> This happens only on machines which have this "i915 0000:00:02.0: HDMI-A-1:
> EDID is invalid:" message, which makes it fallback to 1024x768 mode(which
> means fb smaller than usual). Probably there is still a bug in the tests,
> with the same root cause(-ENOSPC is returned only in case when plane size
> happens to be more than fb).
(In reply to Martin Peres from comment #10)
> (In reply to Petri Latvala from comment #9)
> > commit 91908d36d0d5c90eea86e29736d2748d5ec55335
> > Author: Stanislav Lisovskiy <stanislav.lisovskiy@gmail.com>
> > Date: Tue Feb 19 11:38:00 2019 +0200
> >
> > igt/tests: Fix error checking in kms_atomic_transition
>
> Still happening pretty much every single drmtip run:
>
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_236/fi-skl-6700k2/
> igt@kms_atomic_transition@plane-all-transition-nonblocking.html
>
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_236/fi-hsw-4770r/
> igt@kms_atomic_transition@plane-all-transition-nonblocking.html

Looks like this was without that change still, I checked the latest runs are not failing on this machine and the line number where commit fails, doesn't seem to

(In reply to Martin Peres from comment #10)
> (In reply to Petri Latvala from comment #9)
> > commit 91908d36d0d5c90eea86e29736d2748d5ec55335
> > Author: Stanislav Lisovskiy <stanislav.lisovskiy@gmail.com>
> > Date: Tue Feb 19 11:38:00 2019 +0200
> >
> > igt/tests: Fix error checking in kms_atomic_transition
>
> Still happening pretty much every single drmtip run:
>
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_236/fi-skl-6700k2/
> igt@kms_atomic_transition@plane-all-transition-nonblocking.html
>
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_236/fi-hsw-4770r/
> igt@kms_atomic_transition@plane-all-transition-nonblocking.html

I think those are not happening anymore. Also, in those failures it looks like my IGT patch was not applied, because line 501 in the stack trace (the igt_display_commit2 call which fails with ENOSPC) corresponds to an older code version, i.e. it can't be line 501 with the latest change (line 501 in the last commit corresponds to completely different code).

Ping. See the above message once again - proposing to close this bug.

This issue used to appear in every drmtip run. Last seen drmtip_236 (1 week, 6 days / 187 runs ago).
Let's wait for 2 more runs of drmtip and close if no failures are seen.

I think it stopped appearing right after my patch; in the link which Martin posted, the stack trace is still from the old IGT code (with my changes, there is no igt_display_commit at line 501). So I'm afraid there was no reason to reopen it :D

(In reply to Stanislav Lisovskiy from comment #15)
> I think it stopped appearing right after my patch, in the link which Martin
> posted stack trace is still with old igt code(with my changes, there is no
> igt_display_commit at line 501). So I'm afraid, there was no reason to
> reopen it :D

Here is the patch from Stan:

commit 91908d36d0d5c90eea86e29736d2748d5ec55335
Author: Stanislav Lisovskiy <stanislav.lisovskiy@gmail.com>
AuthorDate: Tue Feb 19 11:38:00 2019 +0200
Commit: Petri Latvala <petri.latvala@intel.com>
CommitDate: Wed Mar 6 14:53:51 2019 +0200

igt/tests: Fix error checking in kms_atomic_transition

There is no guarantee that error return value will be always EINVAL, made a check more general as it can be ERANGE, ENOSPC, EINVAL and probably others, which all mean the same in context of this test case: i.e this sprite size is not valid.

v2: Added macro to make check look a bit nicer.
v3: Removed redundant debug line.
v4: Added assertion if error is not EINVAL as expected, other errors except EINVAL are considered now a failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109225
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>

And the last failure was:

commit 27027cf078e5e8c4ced3b7d941890659e4adf1cd
Author: Nischala Yelchuri <nischala.yelchuri@intel.com>
AuthorDate: Fri Mar 1 11:49:00 2019 -0800
Commit: Chris Wilson <chris@chris-wilson.co.uk>
CommitDate: Sat Mar 2 20:25:43 2019 +0000

tests/kms_cursor_legacy: Add missing munmap

Added munmap and replaced hard-coded values with PAGE_SIZE macro.
Cc: Easwar Hariharan <easwar.hariharan@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Nischala Yelchuri <nischala.yelchuri@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Given that we had a 100% reproduction rate on drmtip and that the last failure was seen 10 runs ago (drmtip_236 and we are now at drmtip_246), I think it is safe to close it again!

Sorry for the noise, Stan!

The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.