Bug 110618 - [SKL,GLK]gem_persistent_relocs@forked-interruptible-thrashing hangs
Summary: [SKL,GLK]gem_persistent_relocs@forked-interruptible-thrashing hangs
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 110646
Depends on:
Blocks:
 
Reported: 2019-05-06 09:33 UTC by Imre Deak
Modified: 2019-05-08 13:46 UTC

See Also:
i915 platform: BDW, GLK, SKL
i915 features: GEM/Other


Comment 1 CI Bug Log 2019-05-07 06:41:26 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL GLK: igt@gem_persistent_relocs@forked-interruptible-thrashing -timeout - Received signal SIGQUIT
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4242/shard-skl9/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
Comment 2 Chris Wilson 2019-05-07 16:44:39 UTC
My guess is that the symptom here is the uninterruptible wait for flush_work in i915_drop_caches_set(DROP_IDLE). Exactly why it wasn't able to idle is unknown, as we have no debug logs. However, I've been poking at the same code to avoid the uninterruptible lockup, and so present:

commit 3970564940ba0322bcefce7fd8fd35c2b85846bf
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue May 7 13:11:08 2019 +0100

    drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches)
    
    If the user is racing a call to debugfs/i915_drop_caches with ongoing
    submission from another thread/process, we may never end up idling the
    GPU and be uninterruptibly spinning in debugfs/i915_drop_caches trying
    to catch an idle moment.
    
    Just flush the work once, that should be enough to park the system under
    correct conditions. Outside of those we either have a driver bug or the
    user is racing themselves. Sadly, because the user may be provoking the
    unwanted situation we can't put a warn here to attract attention to a
    probable bug.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190507121108.18377-4-chris@chris-wilson.co.uk
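
For illustration only, here is a minimal userspace sketch of the behaviour the commit message describes: retrying until the GPU is idle can keep going for as long as another submitter keeps adding work, while a single flush always returns. This is not the i915 code. Every name below (`model_gpu`, `model_flush`, `drop_idle_spin`, `drop_idle_once`) is hypothetical, and the `max_iters` cap exists only so the model terminates.

```c
#include <stdbool.h>

/* Hypothetical model of a GPU with a racing submitter; not an i915 type. */
struct model_gpu {
    int pending;          /* requests currently in flight */
    int resubmit_budget;  /* times a racing submitter will queue work again */
};

/* One flush retires all in-flight work; a racing submitter may then
 * immediately queue one more request. */
static void model_flush(struct model_gpu *gpu)
{
    gpu->pending = 0;
    if (gpu->resubmit_budget > 0) {
        gpu->resubmit_budget--;
        gpu->pending = 1;
    }
}

static bool model_is_idle(const struct model_gpu *gpu)
{
    return gpu->pending == 0;
}

/* Retry-until-idle shape: keeps flushing while work remains.
 * Returns the number of flushes performed; max_iters is a safety cap
 * for the model only. */
static int drop_idle_spin(struct model_gpu *gpu, int max_iters)
{
    int flushes = 0;

    do {
        model_flush(gpu);
        flushes++;
    } while (!model_is_idle(gpu) && flushes < max_iters);

    return flushes;
}

/* Flush-once shape: a single flush, whether or not the GPU ends up idle. */
static int drop_idle_once(struct model_gpu *gpu)
{
    model_flush(gpu);
    return 1;
}
```

With no racing submitter, both shapes leave the model idle after one flush. With a submitter that keeps resubmitting, `drop_idle_spin` runs until the cap, whereas `drop_idle_once` returns after one flush and leaves the remaining work in place.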
Comment 3 Chris Wilson 2019-05-08 13:38:54 UTC
*** Bug 110646 has been marked as a duplicate of this bug. ***
Comment 4 CI Bug Log 2019-05-08 13:46:04 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SKL GLK: igt@gem_persistent_relocs@forked-interruptible-thrashing -timeout - Received signal SIGQUIT -}
{+ BDW SKL GLK: igt@gem_persistent_relocs@forked-*-thrashing -timeout - Received signal SIGQUIT +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_279/fi-bdw-gvtdvm/igt@gem_persistent_relocs@forked-thrashing.html

