https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_110/fi-byt-clapper/igt@gem_exec_parse@basic-allocation.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_110/fi-hsw-peppy/igt@gem_exec_parse@basic-allocation.html

It seems the test runs for too long without producing any output, so the new runner kills it.
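For readers unfamiliar with this failure mode, here is a minimal sketch of an inactivity watchdog: a wrapper that kills a child process once it has printed nothing for a given number of seconds. It only illustrates the general idea and is not the CI runner's code. The function name `run_with_inactivity_timeout` and its parameters are hypothetical.

```python
import subprocess
import sys
import threading
import time


def run_with_inactivity_timeout(argv, inactivity_timeout):
    """Run argv and kill it if it stays silent for inactivity_timeout seconds.

    Hypothetical helper for illustration only.
    Returns (status, output), where status is "completed" or "killed".
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    last_output = time.monotonic()
    lines = []

    def reader():
        # Every line of output resets the inactivity clock.
        nonlocal last_output
        for line in proc.stdout:
            lines.append(line)
            last_output = time.monotonic()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    status = "completed"
    while proc.poll() is None:
        if time.monotonic() - last_output > inactivity_timeout:
            proc.kill()          # silent for too long: give up on the test
            status = "killed"
            break
        time.sleep(0.05)
    proc.wait()
    thread.join()
    return status, "".join(lines)
```

A test that does a lot of work but prints nothing looks the same to such a watchdog as a hung test, which matches the symptom reported here.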
These links are 404 by now, but when I look at https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_exec_parse@basic-allocation.html I see the test consistently passing on HSW in around 45-70 seconds (I haven't checked all runs for execution time).
According to CI this is happening frequently. Latest log on HSW: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_214/fi-hsw-peppy/igt@gem_exec_parse@basic-allocation.html
Why is the test history page (the link I put in #1) all green then? I go to CI -> drm-tip -> shards all -> find the test in the list and click on it to get its history. What am I doing wrong there? Also, Chris seems to have noticed the mailing list activity on this bug and has sent a proposed time cap for the test, so the issue might get resolved quickly.
Mmm, good question. The machine fi-hsw-peppy doesn't appear in that page, not sure why. Martin?
(In reply to Francesco Balestrieri from comment #4)
> Mmm, good question. The machine fi-hsw-peppy doesn't appear in that page,
> not sure why. Martin?

Only shard machines are shown here: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_exec_parse@basic-allocation.html
*** Bug 105555 has been marked as a duplicate of this bug. ***
Hmm, I have

commit b7120a04360ddbd8166657187599e2a0a3b1f12e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Sep 15 17:03:22 2018 +0100

    drm/i915: Recover batch pool caches from shrinker

    Discard all of our batch pools under mempressure to make their pages
    available to the shrinker. We will quickly reacquire them when necessary
    for more GPU relocations or for the command parser.

    v2: Init the lists for mock_engine
    v3: Return a strong ref from i915_gem_batch_pool_get() and convert it
    into an active reference to protect ourselves against all allocations
    while the object is in play.
    v4: Couple shadow batch to active request early.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107936
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Matthew Auld <matthew.william.auld@gmail.com>

with an interesting tagline. Memory says that it was blowing up... But it will shrink the batch pool caches and so speed up basic-allocation.
But that is presupposing that it's mempressure; limiting my ivb to under 2G (like hsw-peppy) only doubles the runtime (kswapd is barely being invoked). The test predicts it needs 1G, and it doesn't seem far off. Fwiw, the patch does as it claims though: with the patch applied, the runtime under the same memory limit is the same as when the machine has sufficient memory.
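To make the mechanism under discussion concrete, here is a toy model of a buffer pool that caches idle buffers by size and can drop them all when a shrinker asks for memory back. This is only a sketch of the general technique, not the i915 batch pool. The class `BatchPool`, its methods, and the power-of-two bucketing are assumptions made for illustration.

```python
from collections import defaultdict


class BatchPool:
    """Toy cache of idle buffers, bucketed by power-of-two size (hypothetical)."""

    def __init__(self):
        self._free = defaultdict(list)   # bucket size -> list of idle buffers
        self.allocations = 0             # number of fresh allocations made

    @staticmethod
    def _bucket(size):
        # Round up to the next power of two so similar sizes share a bucket.
        return 1 << max(size - 1, 0).bit_length()

    def get(self, size):
        """Reuse an idle buffer from the matching bucket, else allocate one."""
        bucket = self._bucket(size)
        if self._free[bucket]:
            return self._free[bucket].pop()
        self.allocations += 1
        return bytearray(bucket)

    def put(self, buf):
        """Return a buffer to the pool once it is no longer in use."""
        self._free[len(buf)].append(buf)

    def cached_bytes(self):
        return sum(len(b) for bufs in self._free.values() for b in bufs)

    def shrink(self):
        """Drop every idle buffer, as a shrinker would under memory pressure."""
        freed = self.cached_bytes()
        self._free.clear()
        return freed


pool = BatchPool()
first = pool.get(3000)      # allocates a 4096-byte buffer
pool.put(first)
second = pool.get(4000)     # same bucket, so the idle buffer is reused
pool.put(second)
freed = pool.shrink()       # idle buffers are released; later get() reallocates
```

In this model the cost of shrinking is one extra allocation the next time a bucket is used, which is the trade-off the commit message describes as "we will quickly reacquire them when necessary".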
Test clamped to

commit a382aeec489a187591677644cc3b98e34322b474 (HEAD, upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Feb 11 14:29:29 2019 +0000

    i915/gem_exec_parse: Switch to a fixed timeout for basic-allocations

    basic-allocations was written to demonstrate a flaw in our continual
    reallocation of cmdparser shadow bo, largely fixed by keeping a small
    cache of bo of different lengths (to speed up the search for the correct
    sized bo). We only care enough to exercise the slowdown by submitting
    lots of execbufs, and can see the effect of bo caching on the rate, so
    replace the fixed number of iterations with a timeout and count how many
    batches we could submit instead.

    Similarly, we now do not need to wait for all of our queue to complete
    as we can tell the kernel to drop the queue instead.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=107936
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Must remember to finish off the shrinker patches.
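The change described in this commit message, replacing a fixed iteration count with a time budget and reporting how much work fit into it, can be sketched roughly as follows. This is an illustration of the pattern only, not the IGT code. The name `submit_for` and the injectable `clock` parameter are hypothetical.

```python
import time


def submit_for(duration, submit, clock=time.monotonic):
    """Call submit() repeatedly until duration seconds elapse; return the count.

    Hypothetical helper: the runtime is bounded by the time budget rather
    than by how slow each submission happens to be on a given machine.
    """
    deadline = clock() + duration
    count = 0
    while clock() < deadline:
        submit()
        count += 1
    return count


# Example: measure a submission rate over a short, fixed window.
window = 0.1
done = submit_for(window, lambda: None)
rate = done / window
```

With this shape a slow machine reports a lower count instead of running longer, so the test's duration no longer depends on the hardware.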
(In reply to Chris Wilson from comment #9)
> Test clamped to
>
> commit a382aeec489a187591677644cc3b98e34322b474 (HEAD, upstream/master)
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Mon Feb 11 14:29:29 2019 +0000
>
> i915/gem_exec_parse: Switch to a fixed timeout for basic-allocations
>
> basic-allocations was written to demonstrate a flaw in our continual
> reallocation of cmdparser shadow bo, largely fixed by keeping a small
> cache of bo of different lengths (to speed up the search for the correct
> sized bo). We only care enough to exercise the slowdown by submitting
> lots of execbufs, and can see the effect of bo caching on the rate, so
> replace the fixed number of iterations with a timeout and count how many
> batches we could submit instead.
>
> Similarly, we now do not need to wait for all of our queue to complete
> as we can tell the kernel to drop the queue instead.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=107936
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
>
> Must remember to finish off the shrinker patches.

This still does not fix all the failures; the test fails on pretty much every single drmtip run on the chromebooks:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_237/fi-hsw-peppy/igt@gem_exec_parse@basic-allocation.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_237/fi-byt-clapper/igt@gem_exec_parse@basic-allocation.html
A CI Bug Log filter associated to this bug has been updated:

{- BYT HSW: igt@gem_exec_parse@basic-allocation - timeout -}
{+ BYT HSW BSW: igt@gem_exec_parse@basic-allocation / igt@gem_exec_big - timeout +}

No new failures caught with the new filter.
commit abffc52b0ec74c8498f2197760199a54e29c8a6a (HEAD, upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Feb 12 18:40:37 2019 +0000

    i915/gem_exec_big: Add a single shot test

    CI complains that the exhaustive test of trying every size up to the
    limit is too slow, so add a simple test that tries to submit one extreme
    batch buffer and check all the relocations land.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105555
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
*** Bug 110255 has been marked as a duplicate of this bug. ***
The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.
Closing and archiving the issue. The reproduction rate was 100% until drmtip 247, and there have been no new occurrences since.