Bug 110255 - [CI][SHARDS] igt@gem_exec_nop@basic-sequential - incomplete - No logs
Summary: [CI][SHARDS] igt@gem_exec_nop@basic-sequential - incomplete - No logs
Status: RESOLVED DUPLICATE of bug 107936
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: highest normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2019-03-27 09:18 UTC by Martin Peres
Modified: 2019-06-03 14:56 UTC

i915 platform: ICL
i915 features: GEM/Other


Description Martin Peres 2019-03-27 09:18:48 UTC
And no HTML files either...
Comment 2 Daniel Vetter 2019-04-01 13:57:59 UTC
Possibly caused by

commit 2631c4e31e58e2667cd03573d77f694ef704fde6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jan 17 12:17:42 2019 +0000

    i915/gem_exec_nop: poll-sequential requires ordering between rings

seems to be fixed again since

commit a350b9f9f606296b1599c3617c8530a8985709e2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Mar 26 13:26:26 2019 +0000

    tools/intel-gpu-top: Only link against igt_perf.la

but that doesn't make a lot of sense.
Comment 3 Martin Peres 2019-04-01 13:59:50 UTC
(In reply to Daniel Vetter from comment #2)
> Possibly caused by
> 
> commit 2631c4e31e58e2667cd03573d77f694ef704fde6
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Thu Jan 17 12:17:42 2019 +0000
> 
>     i915/gem_exec_nop: poll-sequential requires ordering between rings
> 
> seems to be fixed again since
> 
> commit a350b9f9f606296b1599c3617c8530a8985709e2
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Mar 26 13:26:26 2019 +0000
> 
>     tools/intel-gpu-top: Only link against igt_perf.la
> 
> but that doesn't make a lot of sense.

Bumping to highest because of the customer impact, even though this seems to be fixed.

GEM team, please review and see if this makes any sense. If it does not, then keep it open and hope for better logs!
Comment 4 Chris Wilson 2019-04-01 14:20:41 UTC
It's icl. What's new and surprising here?
Comment 5 Francesco Balestrieri 2019-04-02 05:05:45 UTC
I may be totally mistaken here, but I can see that for every occurrence, in the same run, OOM is reported due to gem_exec_big. Is it possible that this is causing the incomplete on this test? If that's the case, the disappearance of the issue would be explained by this patch:

https://lists.freedesktop.org/archives/igt-dev/2019-March/011173.html

[PATCH i-g-t] i915/gem_exec_big: 128MiB not enough slack? Let out the rope!

Even with 128MiB reserved for other use, a single pass of gem_exec_big
runs out of memory. Give in and halve our batch size, that has to be
enough slack! As to why it keeps on failing, is left as an exercise to
the reader -- we have to solve the mm/ mystery one day, as eventually it
will be our only remaining source of bugs!

Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
---
 tests/i915/gem_exec_big.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/i915/gem_exec_big.c b/tests/i915/gem_exec_big.c
index 015f59e29..440136ee8 100644
--- a/tests/i915/gem_exec_big.c
+++ b/tests/i915/gem_exec_big.c
@@ -260,7 +260,7 @@ static void single(int i915)
 	uint32_t handle;
 	void *ptr;
 
-	batch_size = (intel_get_avail_ram_mb() - 128) << 20; /* CI slack */
+	batch_size = (intel_get_avail_ram_mb() / 2) << 20; /* XXX CI slack? */
 	limit = gem_aperture_size(i915) - (256 << 10); /* low pages reserved */
 	if (!gem_uses_full_ppgtt(i915))
 		limit = 3 * limit / 4;
-- 
2.20.1
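
To make the effect of the one-line change above concrete, here is a minimal sketch of the two batch-size expressions from the diff. The helper names `batch_size_old` and `batch_size_new` are hypothetical and not part of IGT; they take the available RAM in MiB as a parameter instead of calling `intel_get_avail_ram_mb()`, and the 64-bit type is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not IGT code): each mirrors one side of the diff.
 * avail_mb stands in for the value returned by intel_get_avail_ram_mb(). */

/* Before the patch: keep a fixed 128 MiB of slack, use the rest. */
static uint64_t batch_size_old(uint64_t avail_mb)
{
	assert(avail_mb >= 128); /* the subtraction would wrap below 128 MiB */
	return (avail_mb - 128) << 20; /* MiB -> bytes */
}

/* After the patch: use only half of the available RAM. */
static uint64_t batch_size_new(uint64_t avail_mb)
{
	return (avail_mb / 2) << 20; /* MiB -> bytes */
}
```

With 4096 MiB available, the old expression asks for a 3968 MiB batch while the new one asks for 2048 MiB, so the slack left for everything else grows from 128 MiB to 2048 MiB.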
Comment 6 Francesco Balestrieri 2019-04-02 11:34:56 UTC
From the logs:

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4904/shard-iclb8/dmesg24.log

gem_exec_big triggered the OOM killer, which killed Java and the IGT runner.
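
A dmesg log such as the one linked above can be checked for OOM-killer activity with a simple substring scan. The sketch below is only an illustration: `line_reports_oom` is a hypothetical helper, not part of IGT or the CI tooling, and the marker strings are assumptions based on commonly seen kernel messages whose exact wording varies between kernel versions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a single dmesg line looks like
 * OOM-killer output, 0 otherwise. The marker strings are assumptions;
 * adjust them for the kernel that produced the log. */
static int line_reports_oom(const char *line)
{
	static const char *const markers[] = {
		"invoked oom-killer",
		"Out of memory:",
	};
	size_t i;

	assert(line != NULL);
	for (i = 0; i < sizeof(markers) / sizeof(markers[0]); i++) {
		if (strstr(line, markers[i]) != NULL)
			return 1;
	}
	return 0;
}
```

Feeding each line of the log through this check and printing the matches would show which process triggered the killer and which processes were killed, without reading the whole log by hand.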

I'm resolving this as a duplicate of Bug 107936 and will file a separate bug against CI for the misreported test.

*** This bug has been marked as a duplicate of bug 107936 ***
Comment 7 Francesco Balestrieri 2019-04-02 11:40:31 UTC
Filed Bug 110306 for the misreporting issue.
Comment 8 Jani Saarinen 2019-06-03 14:56:10 UTC
This is now also seen on https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6179/re-icl-u/igt@gem_exec_nop@basic-sequential.html. Is this the same issue or a new one?

