Bug 102242

Summary: [CI] assert(level < 128) failed for igt@pm_rpm@sysfs-read
Product: DRI Reporter: Martin Peres <martin.peres>
Component: IGT Assignee: Default DRI bug account <dri-devel>
Status: CLOSED FIXED QA Contact:
Severity: critical    
Priority: high CC: intel-gfx-bugs
Version: DRI git   
Hardware: Other   
OS: All   
Whiteboard: ReadyForDev
i915 platform: HSW i915 features: power/Other

Description Martin Peres 2017-08-16 09:17:52 UTC
The test igt@pm_rpm@sysfs-read hits the following assertion failure when running on Haswell:

(pm_rpm:1605) CRITICAL: Test assertion failure function read_files_from_dir, file pm_rpm.c:856:
(pm_rpm:1605) CRITICAL: Failed assertion: level < 128
(pm_rpm:1605) CRITICAL: error: 128 >= 128

Full logs: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_2968/shard-hsw2/igt@pm_rpm@sysfs-read.html
Comment 1 Chris Wilson 2017-08-16 09:30:09 UTC
It can be as simple as

diff --git a/tests/pm_rpm.c b/tests/pm_rpm.c
index 47c9f114..f59f71d0 100644
--- a/tests/pm_rpm.c
+++ b/tests/pm_rpm.c
@@ -850,11 +850,12 @@ static void read_files_from_dir(int path, int level)
        struct dirent *dirent;
        int rc;
 
+       if (level >= 128)
+               return; /* enough! we could chose to detect the recursion... */
+
        dir = fdopendir(path);
        igt_assert(dir);
 
-       igt_assert_lt(level, 128);
-
        while ((dirent = readdir(dir))) {
                struct stat stat_buf;
                int de;
Comment 2 Chris Wilson 2017-08-18 15:09:48 UTC
Or we can try: man 3 nftw.
Comment 3 Jari Tahvanainen 2017-08-22 12:50:46 UTC
This failure is visible in SKL too.

I did some bisecting and found that the change from pass to fail was actually caused by igt commit ced87bd9.
commit ced87bd913bfbfb8ecbe6352d87d133e5e4c81ff
Author:     Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Sat Apr 8 13:15:18 2017 +0100
Commit:     Chris Wilson <chris@chris-wilson.co.uk>
CommitDate: Sat Apr 8 13:47:53 2017 +0100
    igt/pm_rpm: Use directory fd to track and read entire directories
    Rather than compute the temporary full path name, remember it via the dir
    fd we already have.
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Here are the actual outputs for this particular test on the latest kernel:
    drm-tip: 2017y-08m-21d-08h-13m-34s UTC integration manifest

dev-skl-i5-6600k:/opt/source/intel-gpu-tools/tests$ sudo ./pm_rpm --r sysfs-read
IGT-Version: 1.18-gced87bd9 (x86_64) (Linux: 4.13.0-rc5-ezbench_dbfb2f62576e+ x86_64)
Runtime PM support: 1
PC8 residency support: 0
(pm_rpm:15426) CRITICAL: Test assertion failure function read_files_from_dir, file pm_rpm.c:856:
(pm_rpm:15426) CRITICAL: Failed assertion: level < 128
(pm_rpm:15426) CRITICAL: Last errno: 5, Input/output error
(pm_rpm:15426) CRITICAL: error: 128 >= 128
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [read_files_from_dir+0x222]
  #2 [<unknown>+0x222]
  #3 [<unknown>+0x222]
Subtest sysfs-read failed.
**** DEBUG ****
(pm_rpm:15426) DEBUG: Test requirement passed: dir != -1
(pm_rpm:15426) CRITICAL: Test assertion failure function read_files_from_dir, file pm_rpm.c:856:
(pm_rpm:15426) CRITICAL: Failed assertion: level < 128
(pm_rpm:15426) CRITICAL: Last errno: 5, Input/output error
(pm_rpm:15426) CRITICAL: error: 128 >= 128
****  END  ****
Subtest sysfs-read: FAIL (0,301s)

testrunner@dev-skl-i5-6600k:/opt/source/intel-gpu-tools/tests$ sudo ./pm_rpm --r sysfs-read
IGT-Version: 1.18-gc3268d9a (x86_64) (Linux: 4.13.0-rc5-ezbench_dbfb2f62576e+ x86_64)
Runtime PM support: 1
PC8 residency support: 0
Subtest sysfs-read: SUCCESS (0,402s)
Comment 4 Jari Tahvanainen 2017-08-28 12:36:22 UTC
From bug 100717: "https://patchwork.freedesktop.org/series/29141/ in particular note the improvements for both sysfs-read and debugfs-read in https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_90/shards.html"
Comment 5 Martin Peres 2017-08-28 13:05:13 UTC
(In reply to Jari Tahvanainen from comment #4)
> From bug 100717: "https://patchwork.freedesktop.org/series/29141/ in
> particular note the improvements for both sysfs-read and debugfs-read in
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_90/shards.html"

Yes, this seems like a good patch. It is still waiting for review, though...
Comment 6 Jari Tahvanainen 2017-09-01 11:05:22 UTC
Tested-by/SKL: I see the same improvement (FAIL -> SUCCESS) with igt+series (https://patchwork.freedesktop.org/series/29141/) used on latest drm-tip kernel drm-tip: "2017y-08m-30d-08h-12m-34s UTC integration manifest".
Looping ./pm_rpm --r sysfs-read a thousand times resulted in every run reporting
Subtest sysfs-read: SUCCESS (0,1..s).

Moving to the IGT component since the fix is being done there...
Comment 7 Chris Wilson 2017-09-06 18:07:18 UTC
commit e56ab79711b3fb248bf165d1601acd25a2b7529d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Aug 22 13:47:33 2017 +0100

    igt/pm_rpm: Use libc 'ftw' rather than opencoding our own filetree walk
    
    By using ftw, we avoid the issue of having to handle directory recursion
    ourselves and can focus on the test of checking the reading a
    sysfs/debugfs does not break runtime suspend. In the process, disregard
    errors when opening the individual files as they may fail for other
    reasons.
    
    v2: Bracket the file open/close with the wait_for_suspended() tests.
    Whilst the fd is open, it may be keeping the device awake, e.g.
    i915_forcewake_user.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Jari Tahvanainen <jari.tahvanainen@intel.com>
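
As a rough sketch of the approach the commit message describes (not the actual patch), a libc tree walk can open and drain each file while ignoring open failures. The names read_and_discard and read_tree are hypothetical, nftw() with FTW_PHYS is an assumption standing in for whichever ftw variant and flags the patch uses, and the runtime-suspend checks are only marked by comments because they depend on IGT helpers.

```c
#define _XOPEN_SOURCE 700 /* expose nftw() and FTW_PHYS in <ftw.h> */
#include <assert.h>
#include <fcntl.h>
#include <ftw.h>
#include <sys/stat.h>
#include <unistd.h>

/* Number of files successfully opened and drained by the current walk. */
static int files_read;

/* nftw() callback: read and discard the contents of each regular file. */
static int read_and_discard(const char *path, const struct stat *sb,
			    int typeflag, struct FTW *ftwbuf)
{
	char buf[4096];
	int fd;

	(void)ftwbuf;

	if (typeflag != FTW_F || !S_ISREG(sb->st_mode))
		return 0; /* skip directories, symlinks, device nodes */

	/* A real test would check that the device is suspended here... */
	fd = open(path, O_RDONLY | O_NONBLOCK);
	if (fd < 0)
		return 0; /* open may fail for unrelated reasons: ignore it */

	while (read(fd, buf, sizeof(buf)) > 0)
		; /* discard the data, we only care that reading works */

	close(fd);
	/* ...and again here, since an open fd may keep the device awake. */

	files_read++;
	return 0;
}

/* Returns the number of files read below root, or -1 on walk failure. */
static int read_tree(const char *root)
{
	assert(root != NULL);

	files_read = 0;
	if (nftw(root, read_and_discard, 16, FTW_PHYS) != 0)
		return -1;

	return files_read;
}
```

Returning 0 from the callback on an open failure is what keeps a single unreadable entry from aborting the whole walk.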
Comment 8 Martin Peres 2017-09-07 07:48:38 UTC
Verified fixed! Thanks Chris!
