Bug 103183

Summary: [CI] igt@perf@short-reads -fail - Failed assertion: ret == -1
Product: DRI
Reporter: Marta Löfstedt <marta.lofstedt>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED MOVED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: minor
Priority: medium
CC: don.hiatt, intel-gfx-bugs, martin.peres
Version: DRI git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: BXT, GLK, HSW, KBL, SKL
i915 features: Perf/OA

Description Marta Löfstedt 2017-10-10 08:07:20 UTC
CI_DRM_3199 APL-shards igt@perf@short-reads

Fail:
(perf:2494) CRITICAL: Test assertion failure function test_short_reads, file perf.c:3125:
(perf:2494) CRITICAL: Failed assertion: ret == -1
(perf:2494) CRITICAL: error: 8 != -1
Subtest short-reads failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3199/shard-apl5/igt@perf@short-reads.html
Comment 1 mwa 2017-10-18 21:50:07 UTC
8 bytes is exactly the size of the report header, so this is probably just a case of being unlucky: the read() gives us a REPORT_LOST record, which just so happens to fit into the non-protected region, so we don't get an EFAULT.

First thought would just be:

diff --git a/tests/perf.c b/tests/perf.c
index ad62319e..fe8e32bf 100644
--- a/tests/perf.c
+++ b/tests/perf.c
@@ -3121,7 +3121,13 @@ test_short_reads(void)
        /* A read that can't return a single record because it would result
         * in a fault on buffer overrun should result in an EFAULT error...
         */
-       ret = read(stream_fd, pages + page_size - 16, page_size);
+       do {
+               header = (void *)(pages + page_size - 16);
+               ret = read(stream_fd,
+                          header,
+                          page_size);
+       } while (header->type == DRM_I915_PERF_RECORD_OA_REPORT_LOST);
+
        igt_assert_eq(ret, -1);
        igt_assert_eq(errno, EFAULT);
Comment 2 Marta Löfstedt 2017-11-17 09:18:05 UTC
Last seen: shard-kbl: CI_DRM_3236: 2017-10-14 / 114 runs ago
shard-apl: CI_DRM_3206: 2017-10-11 / 145 runs ago
Comment 3 Marta Löfstedt 2017-12-04 08:15:05 UTC
Reproduced GLK-shards:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3445/shard-glkb1/igt@perf@short-reads.html

(perf:12770) CRITICAL: Test assertion failure function test_short_reads, file perf.c:3125:
(perf:12770) CRITICAL: Failed assertion: ret == -1
(perf:12770) CRITICAL: error: 8 != -1
Subtest short-reads failed.
Comment 4 Elizabeth 2018-03-05 20:07:05 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/igt@perf@short-reads.html
All green in CI and no new reports since December, can this be closed? Thank you.
Comment 5 Marta Löfstedt 2018-03-06 06:25:36 UTC
(In reply to Elizabeth from comment #4)
> https://intel-gfx-ci.01.org/tree/drm-tip/igt@perf@short-reads.html
> All green in CI and no new reports since December, can this be closed? Thank
> you.

The cut-off time for a CI-generated bug is 1 month, unless a real fix has been merged.
Comment 6 Lionel Landwerlin 2018-03-14 17:36:37 UTC
Sure, feel free to close.
One kernel patch that might have fixed this is:

commit 41d3fdcd15d5ecf29cc73e8b79c2327ebb54b960
Author: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Date:   Thu Mar 1 11:06:13 2018 +0000

    drm/i915/perf: fix perf stream opening lock
Comment 7 Marta Löfstedt 2018-04-16 07:13:20 UTC
Re-opened due to SKL:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_19/fi-skl-6260u/igt@perf@short-reads.html

(perf:1333) CRITICAL: Test assertion failure function test_short_reads, file ../tests/perf.c:2693:
(perf:1333) CRITICAL: Failed assertion: ret == -1
(perf:1333) CRITICAL: error: 8 != -1
Subtest short-reads failed.
Comment 8 Lakshmi 2018-09-04 09:36:06 UTC
This issue was last seen two weeks ago. It occurs once every week or two in drmtip/igt/CI DRM FULL runs. Do not close this issue until we get consistent results; we need to wait a few weeks or months before closing it.
Comment 9 Martin Peres 2018-09-07 07:37:25 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_105/fi-kbl-r/igt@perf@short-reads.html
	
(perf:1577) CRITICAL: Test assertion failure function test_short_reads, file ../tests/perf.c:2697:
(perf:1577) CRITICAL: Failed assertion: ret == -1
(perf:1577) CRITICAL: error: 8 != -1
Subtest short-reads failed.
Comment 10 Martin Peres 2018-09-07 07:37:59 UTC
*** Bug 106917 has been marked as a duplicate of this bug. ***
Comment 11 Martin Peres 2018-12-20 15:17:01 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_157/fi-bxt-j4205/igt@perf@short-reads.html

Starting subtest: short-reads
(perf:2171) CRITICAL: Test assertion failure function test_short_reads, file ../tests/perf.c:2704:
(perf:2171) CRITICAL: Failed assertion: ret == -1
(perf:2171) CRITICAL: error: 8 != -1
Subtest short-reads failed.
Comment 12 Don Hiatt 2019-09-03 16:45:07 UTC
test_short_reads() in tests/perf.c reads the OA perf records. The assert triggers because the test issues a read that can't return a single record and expects it to fail with EFAULT. However, instead of returning -1 with errno set to EFAULT, the read returns 8 (the log shows "error: 8 != -1"), meaning 8 bytes were read successfully, which is the size of the record header discussed in Comment #1.

In Comment #1, mwa suggested a fix that has still not been added to igt.

The failure is still occurring on KBL (CI_DRM_6743_full, 1 week, 6 days old) and SKL (drmtip_356, 2 days, 3 hours old).

As this is a test harness issue, it doesn't seem to have any user impact, so I am lowering the priority/severity from the current high/critical to medium/minor. The fix in Comment #1 should be given a try.
Comment 13 CI Bug Log 2019-10-18 08:01:04 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SKL BXT APL KBL GLK: igt@perf@short-reads - Failed assertion: ret == -1 -}
{+ SKL BXT APL KBL GLK: igt@perf@short-reads - fail / timeout - Failed assertion: ret == -1 +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7092/shard-apl5/igt@perf@short-reads.html
Comment 14 Don Hiatt 2019-10-18 17:17:21 UTC
Submitted mwa's suggested fix (Comment #1) to try-bot.
Comment 15 CI Bug Log 2019-11-04 12:34:37 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SKL BXT APL KBL GLK: igt@perf@short-reads - fail / timeout - Failed assertion: ret == -1 -}
{+ HSW BXT APL SKL KBL GLK: igt@perf@short-reads - fail / timeout - Failed assertion: ret == -1 +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7238/shard-hsw6/igt@perf@short-reads.html
Comment 16 Martin Peres 2019-11-29 17:28:16 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/51.