Bug 107788 - [CI][DRMTIP] igt@gem_exec_await@wide-context - fail - __gem_context_create(fd, &ctx_id) == 0
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Whiteboard: ReadyForDev
Depends on:
Reported: 2018-09-03 07:43 UTC by Martin Peres
Modified: 2018-10-14 15:21 UTC

See Also:
i915 platform: ICL
i915 features: GEM/Other


Description Martin Peres 2018-09-03 07:43:26 UTC

(gem_exec_await:1460) i915/gem_context-CRITICAL: Test assertion failure function gem_context_create, file ../lib/i915/gem_context.c:106:
(gem_exec_await:1460) i915/gem_context-CRITICAL: Failed assertion: __gem_context_create(fd, &ctx_id) == 0
(gem_exec_await:1460) i915/gem_context-CRITICAL: error: -28 != 0
Comment 1 Mika Kuoppala 2018-09-03 14:23:22 UTC
Seems to be transient out of memory on context allocation.
Comment 2 Chris Wilson 2018-09-03 14:29:27 UTC
So close, but no: we just ran out of contexts on ICL (max number is 2048). Fixing the ENOSPC here requires https://patchwork.freedesktop.org/series/44134/, which only works in this instance because the contexts are short lived. Other tests we will have to adjust to keep within the limit for long-lived contexts.
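The distinction between the two readings above comes down to the error number in the assertion log: -28 is ENOSPC, whereas an out-of-memory failure would normally surface as -12 (ENOMEM). A minimal sketch for decoding such return codes on a Linux host follows; the helper name `describe_errno` is hypothetical and is not part of IGT or the kernel.

```python
import errno
import os


def describe_errno(ret: int) -> str:
    """Return 'NAME (message)' for a kernel-style negative return code.

    Hypothetical helper for illustration only; it relies on the host's
    errno table, which matches the kernel's numbering on Linux.
    """
    code = -ret if ret < 0 else ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# The value reported by the failed assertion in the description.
print(describe_errno(-28))
# What a genuine out-of-memory failure would usually look like.
print(describe_errno(-12))
```

Decoding the number first makes it clear that the failure is about exhausting an ID space, not about memory pressure.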
Comment 3 Chris Wilson 2018-09-05 11:03:08 UTC
commit 288f1ced5e24abe3e768224f701a205c3a7e16f9 (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Sep 4 16:31:17 2018 +0100

    drm/i915: Reduce context HW ID lifetime

    Future gens reduce the number of bits we will have available to
    differentiate between contexts, so reduce the lifetime of the ID
    assignment from that of the context to its current active cycle (i.e.
    only while it is pinned for use by the HW will it have a constant ID).
    This means that instead of a max of 2k allocated contexts (worst case
    before fun with bit twiddling), we instead have a limit of 2k in-flight
    contexts (minus a few that have been pinned by the kernel or by perf).

    To reduce the number of context ids we require, we allocate a context id
    on first use and mark it as pinned for as long as the GEM context itself
    is, that is, we keep it pinned while active on each engine. If we exhaust
    our context id space, then we try to reclaim an id from an idle context.
    In the extreme case where all context ids are pinned by active contexts,
    we force the system to idle in order to recover ids.

    We cannot reduce the scope of an HW-ID to an engine (allowing the same
    gem_context to have different ids on each engine) as in the future we
    will need to preassign an id before we know which engine the
    context is being executed on.

    v2: Improved commentary (Tvrtko) [I tried at least]

    References: https://bugs.freedesktop.org/show_bug.cgi?id=107788
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Cc: Michel Thierry <michel.thierry@intel.com>
    Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
    Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180904153117.3907-1-chris@chris-wilson.co.uk
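The scheme described in the commit message (assign an ID on first pin, keep it while the context is pinned, reclaim from idle contexts when the ID space runs out) can be illustrated with a toy model. This is only a sketch of the idea, not the i915 implementation: the class `HwIdAllocator`, its methods, and the use of an exception in place of forcing the GPU idle are all assumptions made for illustration.

```python
class HwIdAllocator:
    """Toy model of pin-scoped HW IDs (hypothetical, not the i915 code)."""

    def __init__(self, max_ids):
        self.free_ids = list(range(max_ids))
        self.hw_id = {}      # context name -> currently assigned id
        self.pin_count = {}  # context name -> number of active pins

    def pin(self, ctx):
        """Pin a context, assigning an id on first use."""
        if ctx not in self.hw_id:
            if not self.free_ids:
                self._steal_from_idle()
            self.hw_id[ctx] = self.free_ids.pop()
        self.pin_count[ctx] = self.pin_count.get(ctx, 0) + 1
        return self.hw_id[ctx]

    def unpin(self, ctx):
        """Drop one pin; the id is kept until another context needs it."""
        self.pin_count[ctx] -= 1

    def _steal_from_idle(self):
        """Reclaim the id of any context that holds one but is not pinned."""
        for victim, count in self.pin_count.items():
            if count == 0 and victim in self.hw_id:
                self.free_ids.append(self.hw_id.pop(victim))
                return
        # The commit message says the driver forces the system idle here;
        # this model only reports the condition.
        raise RuntimeError("all ids are pinned by active contexts")


# Five short-lived contexts share an id space of two.
alloc = HwIdAllocator(max_ids=2)
for i in range(5):
    name = f"ctx{i}"
    alloc.pin(name)
    alloc.unpin(name)
print(len(alloc.hw_id))
```

With short-lived contexts, as in the wide-context test, the number of contexts created can exceed the ID space because idle IDs are recycled. Contexts that stay pinned still count against the limit, which matches the remark in Comment 2 about long-lived contexts.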
Comment 4 Lakshmi 2018-10-14 15:21:12 UTC
This issue occurred twice, 1 month 1 week ago (31 rounds in between). It has not been seen for more than 550 rounds since. Closing this bug.
