Bug 111035 - [CI][SHARDS] igt@gem_create@create-clear - dmesg-fail - gem_create: page allocation failure
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-01 10:57 UTC by Martin Peres
Modified: 2019-07-03 19:51 UTC

See Also:
i915 platform: SNB
i915 features: GEM/Other


Description Martin Peres 2019-07-01 10:57:56 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5071/shard-snb1/igt@gem_create@create-clear.html

<6> [923.583591] [IGT] gem_create: executing
<6> [923.590047] [IGT] gem_create: starting subtest create-clear
<4> [927.627455] gem_create: page allocation failure: order:0, mode:0x104cd2(GFP_HIGHUSER|__GFP_RETRY_MAYFAIL|__GFP_RECLAIMABLE), nodemask=(null)
<4> [927.627472] CPU: 6 PID: 7843 Comm: gem_create Tainted: G     U            5.2.0-rc6-CI-CI_DRM_6375+ #1
<4> [927.627474] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
<4> [927.627475] Call Trace:
<4> [927.627481]  dump_stack+0x67/0x9b
<4> [927.627485]  warn_alloc+0xfa/0x180
<4> [927.627490]  ? __mutex_unlock_slowpath+0x46/0x2b0
<4> [927.627495]  __alloc_pages_nodemask+0xd5c/0x1130
<4> [927.627506]  shmem_alloc_and_acct_page+0x72/0x1e0
<4> [927.627510]  shmem_getpage_gfp.isra.8+0x156/0x860
<4> [927.627517]  shmem_read_mapping_page_gfp+0x3e/0x70
<4> [927.627573]  shmem_get_pages+0x212/0x6d0 [i915]
<4> [927.627608]  ? __i915_gem_object_get_pages+0x18/0xb0 [i915]
<4> [927.627614]  ? lock_acquire+0xa6/0x1c0
<4> [927.627647]  ____i915_gem_object_get_pages+0x1d/0xa0 [i915]
<4> [927.627677]  __i915_gem_object_get_pages+0x59/0xb0 [i915]
<4> [927.627710]  i915_gem_pread_ioctl+0x3ea/0x7d0 [i915]
<4> [927.627713]  ? drm_dev_exit+0x8/0x40
<4> [927.627746]  ? i915_gem_gtt_pread+0x7a0/0x7a0 [i915]
<4> [927.627749]  drm_ioctl_kernel+0x83/0xf0
<4> [927.627753]  drm_ioctl+0x2f3/0x3b0
<4> [927.627786]  ? i915_gem_gtt_pread+0x7a0/0x7a0 [i915]
<4> [927.627792]  ? lock_acquire+0xa6/0x1c0
<4> [927.627796]  do_vfs_ioctl+0xa0/0x6e0
<4> [927.627800]  ? __fget+0x10f/0x200
<4> [927.627803]  ksys_ioctl+0x35/0x60
<4> [927.627807]  __x64_sys_ioctl+0x11/0x20
<4> [927.627809]  do_syscall_64+0x55/0x1c0
<4> [927.627812]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [927.627814] RIP: 0033:0x7f8acc5305d7
<4> [927.627818] Code: Bad RIP value.
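When triaging a trace like the one above, it can help to separate the frames that belong to the i915 module from the core mm and syscall frames. The following is a minimal sketch, assuming the `<level> [timestamp] message` line shape seen in this log; `parse_frames` and both regular expressions are hypothetical helpers written for this note, not part of IGT or the CI tooling. Frames prefixed with `?` are treated as unverified stack entries.

```python
import re

# Assumed line shape, taken from the log in this report:
#   <level> [seconds.micros] message
LINE_RE = re.compile(r"^<(\d)> \[\s*(\d+\.\d+)\] (.*)$")

# Assumed frame shape: optional "? " marker, symbol+0xoffset/0xsize,
# optional trailing [module].
FRAME_RE = re.compile(
    r"^\s*(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?$"
)


def parse_frames(lines):
    """Yield (symbol, module, verified) for each call-trace frame.

    module is None for frames in the core kernel; verified is False
    for frames that carry the "?" marker.
    """
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        f = FRAME_RE.match(m.group(3))
        if f:
            yield f.group(2), f.group(3), f.group(1) is None


sample = [
    "<4> [927.627475] Call Trace:",
    "<4> [927.627495]  __alloc_pages_nodemask+0xd5c/0x1130",
    "<4> [927.627510]  shmem_getpage_gfp.isra.8+0x156/0x860",
    "<4> [927.627573]  shmem_get_pages+0x212/0x6d0 [i915]",
    "<4> [927.627608]  ? __i915_gem_object_get_pages+0x18/0xb0 [i915]",
    "<4> [927.627710]  i915_gem_pread_ioctl+0x3ea/0x7d0 [i915]",
]

# Keep only verified frames that live in the i915 module.
i915_frames = [
    sym for sym, mod, ok in parse_frames(sample) if mod == "i915" and ok
]
```

With the sample lines, `i915_frames` holds the two verified i915 symbols, which matches the path in this report: the pread ioctl asks for the object's pages and the shmem backing store fails the allocation.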
Comment 1 CI Bug Log 2019-07-01 10:58:22 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SNB: igt@gem_create@create-clear - dmesg-fail - gem_create: page allocation failure
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5071/shard-snb1/igt@gem_create@create-clear.html
Comment 2 Chris Wilson 2019-07-01 11:30:44 UTC
That's the impact of an object's pages being unavailable to reclaim while the object is pending its RCU callback. It should be fine to rearrange the free path, as in https://patchwork.freedesktop.org/patch/315092/?series=63032&rev=1
Comment 3 Chris Wilson 2019-07-03 19:51:13 UTC
commit c03467ba40f783ebe756114bb68e13a6b404c03a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jul 3 10:17:17 2019 +0100

    drm/i915/gem: Free pages before rcu-freeing the object
    
    As we have dropped the final reference to the object, we do not need to
    wait until after the rcu grace period to drop its pages. We still require
    struct_mutex to completely unbind the object to release the pages, so we
    still need a free-worker to manage that from process context. By
    scheduling the release of pages before waiting for the rcu should mean
    that we are not trapping those pages from beyond the reach of the
    shrinker.
    
    v2: Pass along the request to skip if the vma is busy to the underlying
    unbind routine, to avoid checking the reservation underneath the
    i915->mm.obj_lock which may be used from inside irq context.
    
    v3: Flip the bit for unbinding while active, for later convenience.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111035
    Fixes: a93615f900bd ("drm/i915: Throw away the active object retirement complexity")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Matthew Auld <matthew.auld@intel.com>
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190703091726.11690-6-chris@chris-wilson.co.uk
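
The ordering change described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyAllocator`, `ToyObject`, `free_late`, `free_early` and `churn` are hypothetical names invented here, the "grace period" is an explicit method call, and the free-worker and struct_mutex handling from the real patch are ignored entirely. It shows only why pages attached to an object that is waiting for a grace period are out of reach for new allocations.

```python
from collections import deque


class ToyObject:
    """Stand-in for a GEM object: just a count of backing pages."""

    def __init__(self, pages):
        self.pages = pages


class ToyAllocator:
    """Fixed pool of pages (hypothetical model, not kernel code)."""

    def __init__(self, total_pages):
        self.free_pages = total_pages
        self.rcu_pending = deque()  # objects waiting for a grace period

    def alloc(self, pages):
        if pages > self.free_pages:
            raise MemoryError("page allocation failure")
        self.free_pages -= pages
        return ToyObject(pages)

    def release_pages(self, obj):
        # Return the backing pages to the pool; the object shell remains.
        self.free_pages += obj.pages
        obj.pages = 0

    def free_late(self, obj):
        # Old ordering: pages stay attached until the grace period ends.
        self.rcu_pending.append(obj)

    def free_early(self, obj):
        # New ordering: drop pages first, then defer only the object shell.
        self.release_pages(obj)
        self.rcu_pending.append(obj)

    def grace_period(self):
        # Run deferred callbacks; pages still attached are released here.
        while self.rcu_pending:
            self.release_pages(self.rcu_pending.popleft())


def churn(free_fn_name, rounds=4, pool=8, size=6):
    """Allocate and free repeatedly with no grace period in between.

    Returns True if every allocation succeeded, False otherwise.
    """
    allocator = ToyAllocator(pool)
    free_fn = getattr(allocator, free_fn_name)
    try:
        for _ in range(rounds):
            free_fn(allocator.alloc(size))
    except MemoryError:
        return False
    return True
```

In this model `churn("free_late")` fails on the second allocation because the first object's pages are still held, while `churn("free_early")` completes every round. A create-and-destroy loop such as the create-clear subtest plausibly stresses the same pattern, though the model makes no claim about the test's actual allocation sizes.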

