Bug 108891

Summary: Test enumeration becomes funky on some tests due to fd FILE* stream change
Product: DRI Reporter: Casey Bowman <casey.g.bowman>
Component: IGTAssignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium CC: casey.g.bowman
Version: XOrg git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:

Description Casey Bowman 2018-11-28 21:13:05 UTC
Testing on ChromeOS KBL device with a 4.4 kernel

IGT SHA used was:
dbe1fc1e13c4e4eebaa9e1bbc5d188b5109c09d3

Running the igt@gem_busy@basic-busy-default test resulted in the following failure:
	
(gem_busy:13397) igt-gt-CRITICAL: Test assertion failure function igt_allow_hang, file ../../intel-gpu-tools-9999/lib/igt_gt.c:174:
(gem_busy:13397) igt-gt-CRITICAL: Failed assertion: igt_sysfs_set_parameter (fd, "reset", "%d", INT_MAX )
(gem_busy:13397) igt-gt-CRITICAL: Last errno: 22, Invalid argument
Test gem_busy failed.
**** DEBUG ****
(gem_busy:13397) DEBUG: Test requirement passed: gem_has_ring(fd, e->exec_id | e->flags)
(gem_busy:13397) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_busy:13397) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(gem_busy:13397) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: dir >= 0
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: err == 0
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
(gem_busy:13397) igt-dummyload-DEBUG: Test requirement passed: nengine
(gem_busy:13397) igt-core-INFO: Subtest basic-busy-default: SUCCESS (0.013s)
(gem_busy:13397) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(gem_busy:13397) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: dir >= 0
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: err == 0
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
(gem_busy:13397) igt-dummyload-DEBUG: Test requirement passed: nengine
(gem_busy:13397) DEBUG: Test requirement passed: has_extended_busy_ioctl(fd)
(gem_busy:13397) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(gem_busy:13397) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: dir >= 0
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: err == 0
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
(gem_busy:13397) igt-dummyload-DEBUG: Test requirement passed: nengine
(gem_busy:13397) DEBUG: Test requirement passed: has_extended_busy_ioctl(fd)
(gem_busy:13397) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(gem_busy:13397) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: dir >= 0
(gem_busy:13397) ioctl-wrappers-DEBUG: Test requirement passed: err == 0
(gem_busy:13397) igt-gt-CRITICAL: Test assertion failure function igt_allow_hang, file ../../intel-gpu-tools-9999/lib/igt_gt.c:174:
(gem_busy:13397) igt-gt-CRITICAL: Failed assertion: igt_sysfs_set_parameter (fd, "reset", "%d", INT_MAX )
(gem_busy:13397) igt-gt-CRITICAL: Last errno: 22, Invalid argument
(gem_busy:13397) igt-core-INFO: Stack trace:
****  END  ****

The Stderr output clearly shows that the test passed with:
(gem_busy:13397) igt-core-INFO: Subtest basic-busy-default: SUCCESS (0.013s)

But the test continues on.

Doing a bisect resulted in the offending commit:

commit 3f89d7b02dcf662e994c7135b13d52bc8e09a4ea
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Aug 30 09:38:55 2018 +0100

    lib/sysfs: Avoid using FILE* temporary for igt_sysfs_[v]printf
    
    Currently we wrap our fd inside a FILE* stream to make use of vfprintf,
    but the man page leaves the question of errno and signal handling in
    doubt. It is documented as returning a negative value and setting
    ferror(), but we have been interpreting errno to handle signal
    restarting. As that is in doubt, reduce it to a sprintf and reuse our
    common interrupt handling write() that already returns -errno.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Katarzyna Dec <katarzyna.dec@intel.com>
    Reviewed-by: Katarzyna Dec <katarzyna.dec@intel.com>

Reverting the commit fixed the behavior and 30+ subtests on the fast-feedback.testlist changed back to passing.
Comment 1 Casey Bowman 2018-11-28 21:14:15 UTC
Note that the invalid argument error is most likely an unrelated error, but most likely attributed to this bug being found.
Comment 2 Chris Wilson 2018-11-28 21:53:11 UTC
Different subgroup, not affecting basic-busy-default, but rightfully preventing you from running any of the hang tests.
Comment 3 Chris Wilson 2018-11-28 22:19:25 UTC
diff --git a/lib/igt_core.c b/lib/igt_core.c
index e0989f53..bd600762 100644
--- a/lib/igt_core.c
+++ b/lib/igt_core.c
@@ -1116,30 +1116,28 @@ void igt_fail(int exitcode)
        if (in_atexit_handler)
                _exit(IGT_EXIT_FAILURE);
 
-       if (!failed_one)
-               igt_exitcode = exitcode;
-
-       failed_one = true;
-
        /* Silent exit, parent will do the yelling. */
        if (test_child)
                exit(exitcode);
 
+       if (in_fixture) {
+               skip_subtests_henceforth = FAIL;
+               __igt_fixture_end();
+       }
+
        _igt_log_buffer_dump();
 
+       if (!failed_one) {
+               igt_exitcode = exitcode;
+               failed_one = true;
+       }
+
        if (in_subtest) {
                if (exitcode == IGT_EXIT_TIMEOUT)
                        exit_subtest("TIMEOUT");
                else
                        exit_subtest("FAIL");
        } else {
-               assert(igt_can_fail());
-
-               if (in_fixture) {
-                       skip_subtests_henceforth = FAIL;
-                       __igt_fixture_end();
-               }
-
                igt_exit();
        }
 }
Comment 4 Chris Wilson 2018-11-28 22:19:47 UTC
Just a very misleading bisect.
Comment 5 Casey Bowman 2018-11-29 19:24:36 UTC
(In reply to Chris Wilson from comment #4)
> Just a very misleading bisect.

Agreed, the commit found would probably just cause the gem -hang tests to fail in the kernel we're using as you mentioned earlier.

However, for the cases in which a basic subtest is run (for example, igt@gem_busy@basic-busy-default or igt@gem_exec_reloc@basic-cpu) there shouldn't be any hang checks (igt_allow_hang) for the requirements of those subtests, right?

If so, should we change the bug title to reflect that some requirement checks are being run on subtests that don't reflect the subtest names, like how igt@gem_busy@basic-busy-default doesn't have 'hang' in it?
Comment 6 Martin Peres 2019-11-12 07:39:24 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/11.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.