Bug 112037 - [CI] igt@gem_persistent_relocs@forked[-interruptible]-thrashing - fail - Failed assertion: test == 0xdeadbeef
Summary: [CI] igt@gem_persistent_relocs@forked[-interruptible]-thrashing - fail - Fai...
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-16 15:07 UTC by Matt Roper
Modified: 2019-11-29 19:41 UTC

See Also:
i915 platform: ALL
i915 features: GEM/Other


Description Matt Roper 2019-10-16 15:07:14 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14822/shard-iclb5/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14820/shard-hsw8/igt@gem_persistent_relocs@forked-thrashing.html
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14814/shard-snb4/igt@gem_persistent_relocs@forked-thrashing.html

All have failures of the form:

(gem_persistent_relocs:2554) CRITICAL: Test assertion failure function do_test, file ../tests/i915/gem_persistent_relocs.c:256:
(gem_persistent_relocs:2554) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_persistent_relocs:2554) CRITICAL: mismatch in buffer 11: 0x00000000 instead of 0xdeadbeef at offset 192

with nothing of interest in dmesg.
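
When triaging many such logs, it can help to pull the buffer index, the value found, and the offset out of the "mismatch" line. The following is a minimal sketch built only from the log lines quoted above; the function name and the group names are hypothetical, not part of IGT or the CI tooling.

```python
import re

# Pattern derived from the quoted CRITICAL line; group names are illustrative.
MISMATCH_RE = re.compile(
    r"\((?P<binary>\w+):(?P<pid>\d+)\) CRITICAL: "
    r"mismatch in buffer (?P<buffer>\d+): "
    r"(?P<found>0x[0-9a-fA-F]+) instead of (?P<expected>0x[0-9a-fA-F]+) "
    r"at offset (?P<offset>\d+)"
)


def parse_mismatch(line):
    """Return the fields of a 'mismatch in buffer' line, or None if absent."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return {
        "binary": match["binary"],
        "pid": int(match["pid"]),
        "buffer": int(match["buffer"]),
        "found": int(match["found"], 16),
        "expected": int(match["expected"], 16),
        "offset": int(match["offset"]),
    }


if __name__ == "__main__":
    sample = (
        "(gem_persistent_relocs:2554) CRITICAL: mismatch in buffer 11: "
        "0x00000000 instead of 0xdeadbeef at offset 192"
    )
    print(parse_mismatch(sample))
```

Lines that do not carry a mismatch report (for example the "Failed assertion" line) simply yield None, so the function can be mapped over a whole log.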
Comment 1 Matt Roper 2019-10-16 15:12:30 UTC
I'm not sure if it's related, but CI also flagged another run of the same test here:

https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14821/shard-apl3/igt@gem_persistent_relocs@forked-thrashing.html

In that case the test itself passed successfully, but the kernel reported bad lock ordering; that may be a clue as to what's causing the other failures.
Comment 2 Chris Wilson 2019-10-16 18:46:58 UTC
(In reply to Matt Roper from comment #1)
> I'm not sure if it's related, but CI also flagged another run of the same
> test here:
> 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14821/shard-apl3/igt@gem_persistent_relocs@forked-thrashing.html
> 
> In that case the test itself passed successfully, but the kernel reported
> bad lock ordering; that may be a clue as to what's causing the other
> failures.

Nothing to do with it. That's a leftover from perf.
Comment 3 Chris Wilson 2019-10-17 15:36:07 UTC
commit bb5735423eaf6fdbf6b2f94ef0b8520e74eab993 (HEAD, upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Oct 16 19:54:39 2019 +0100

    i915/gem_persistent_relocs: Manage the domain for the GGTT access
    
    Since the GGTT fault will overlap with the pwrite access, there is no
    implicit moment at which the kernel will automagically flush the backing
    store. Userspace has to be explicit in its domain control, or do itself.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112037
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

should do the trick.
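
The general idea in the commit message (two access paths to one buffer, with no implicit flush between them) can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not the i915 API, it does not claim to reproduce the actual failure mechanism, and the class and method names (ToyObject, cpu_write, flush, ggtt_read) are invented for illustration.

```python
class ToyObject:
    """Toy buffer with two access paths. Purely illustrative, not i915."""

    def __init__(self, size_words):
        self.backing = [0] * size_words  # what the second (GGTT-like) path sees
        self.cpu_dirty = {}              # CPU-side writes that are not yet flushed

    def cpu_write(self, index, value):
        # The write lands on the CPU side only; nothing flushes it implicitly.
        self.cpu_dirty[index] = value

    def flush(self):
        # Stand-in for the explicit domain management the commit message asks for.
        for index, value in self.cpu_dirty.items():
            self.backing[index] = value
        self.cpu_dirty.clear()

    def ggtt_read(self, index):
        # Reads bypass the CPU-side state and see only the backing store.
        return self.backing[index]


if __name__ == "__main__":
    obj = ToyObject(64)
    obj.cpu_write(48, 0xDEADBEEF)
    print(hex(obj.ggtt_read(48)))  # stale value: the write was never flushed
    obj.flush()
    print(hex(obj.ggtt_read(48)))  # now the written value is visible
```

In this model the reader sees zero until flush() is called explicitly, which is the same shape as the "0x00000000 instead of 0xdeadbeef" symptom, though the real kernel behaviour is considerably more involved.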
Comment 4 CI Bug Log 2019-10-18 06:29:17 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SNB HSW KBL ICL: igt@gem_persistent_relocs@forked(-interruptible)?-thrashing* - fail -  Failed assertion: test == 0xdeadbeef,  mismatch in buffer \d+
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7097/shard-iclb8/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7097/shard-snb6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7098/shard-kbl4/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14814/shard-snb4/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14820/shard-hsw6/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14820/shard-hsw8/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14822/shard-iclb5/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14825/shard-snb1/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7103/shard-kbl3/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14827/shard-hsw7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7105/shard-hsw6/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7105/shard-snb1/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14828/shard-hsw7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14828/shard-snb7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3578/shard-hsw7/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14832/shard-snb4/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7107/shard-hsw6/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7110/shard-hsw4/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7111/shard-hsw6/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7111/shard-snb2/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7117/shard-kbl6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7120/shard-hsw8/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5232/shard-hsw1/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14857/shard-snb7/igt@gem_persistent_relocs@forked-thrashing.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14860/shard-iclb3/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
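
To see which machines and subtests dominate a list like the one above, the result URLs can be split into their parts. This sketch assumes the layout that the links themselves appear to follow (.../drm-tip/<run>/<machine>/igt@<binary>@<subtest>.html); that layout is inferred from this bug, not taken from CI documentation, and the function name is hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def split_result_url(url):
    """Split a CI result URL into run, machine, binary and subtest."""
    parts = urlparse(url).path.strip("/").split("/")
    # Assumed layout: tree/drm-tip/<run>/<machine>/igt@<binary>@<subtest>.html
    run, machine, page = parts[-3], parts[-2], parts[-1]
    if page.endswith(".html"):
        page = page[: -len(".html")]
    prefix, binary, subtest = page.split("@")
    if prefix != "igt":
        raise ValueError("unexpected page name: " + page)
    return {"run": run, "machine": machine, "binary": binary, "subtest": subtest}


if __name__ == "__main__":
    urls = [
        "https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7097/shard-iclb8/"
        "igt@gem_persistent_relocs@forked-interruptible-thrashing.html",
        "https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14814/shard-snb4/"
        "igt@gem_persistent_relocs@forked-thrashing.html",
    ]
    per_subtest = Counter(split_result_url(u)["subtest"] for u in urls)
    for subtest, count in per_subtest.items():
        print(subtest, count)
```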
Comment 6 Lakshmi 2019-10-18 07:54:11 UTC
(In reply to CI Bug Log from comment #5)
> The CI Bug Log issue associated to this bug has been updated.
> 
> ### New filters associated
> 
> * SNB HSW ICL: igt@gem_persistent@relocs@forked-* - timeout - GEM_BUG_ON(i915_vma_is_active(vma))
>   - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7087/shard-snb7/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html
>   - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7095/shard-hsw4/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html
>   - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7096/shard-snb2/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
>   - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7110/shard-hsw6/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html
>   - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7115/shard-iclb3/igt@gem_persistent_relocs@forked-thrashing.html
>   - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7116/shard-iclb8/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
>   - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7120/shard-iclb2/igt@gem_persistent_relocs@forked-thrashing.html

@chris, should these failures be filed separately? At the moment they are part of this bug.
Comment 7 CI Bug Log 2019-10-18 17:51:11 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SNB HSW ICL: igt@gem_persistent@relocs@forked-* - timeout - GEM_BUG_ON(i915_vma_is_active(vma)) -}
{+ SNB HSW ICL: igt@gem_persistent_relocs@forked-* - timeout - GEM_BUG_ON(i915_vma_is_active(vma)) +}


  No new failures caught with the new filter
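
The `{- ... -}` / `{+ ... +}` pair in a filter update like the one above can be compared mechanically to see what actually changed. The following is a small sketch that assumes exactly the two-line format shown in this bug; the function names are hypothetical and not part of CI Bug Log.

```python
import re

REMOVED_RE = re.compile(r"^\{-\s*(.*?)\s*-\}$")
ADDED_RE = re.compile(r"^\{\+\s*(.*?)\s*\+\}$")


def parse_filter_update(text):
    """Return (old, new) filter descriptions from a '{- -}' / '{+ +}' pair."""
    old = new = None
    for raw in text.splitlines():
        line = raw.strip()
        removed = REMOVED_RE.match(line)
        if removed:
            old = removed.group(1)
            continue
        added = ADDED_RE.match(line)
        if added:
            new = added.group(1)
    return old, new


def changed_tokens(old, new):
    """Whitespace-separated tokens only in old, and only in new."""
    old_tokens, new_tokens = old.split(), new.split()
    only_old = [t for t in old_tokens if t not in new_tokens]
    only_new = [t for t in new_tokens if t not in old_tokens]
    return only_old, only_new


if __name__ == "__main__":
    update = (
        "{- SNB HSW ICL: igt@gem_persistent@relocs@forked-* - timeout - "
        "GEM_BUG_ON(i915_vma_is_active(vma)) -}\n"
        "{+ SNB HSW ICL: igt@gem_persistent_relocs@forked-* - timeout - "
        "GEM_BUG_ON(i915_vma_is_active(vma)) +}\n"
    )
    old, new = parse_filter_update(update)
    print(changed_tokens(old, new))
```

Applied to this update, the only differing token is the test pattern, where `gem_persistent@relocs` was corrected to `gem_persistent_relocs`. An update whose old and new lines are textually identical yields two empty lists.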
Comment 8 CI Bug Log 2019-10-18 17:55:31 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SNB HSW ICL: igt@gem_persistent_relocs@forked-* - timeout - GEM_BUG_ON(i915_vma_is_active(vma)) -}
{+ SNB HSW KBL ICL: igt@gem_persistent_relocs@forked-* - timeout - GEM_BUG_ON(i915_vma_is_active(vma)) +}


  No new failures caught with the new filter
Comment 9 CI Bug Log 2019-10-18 18:12:34 UTC
The CI Bug Log issue associated to this bug has been updated.

### Removed filters

* SNB HSW KBL ICL: igt@gem_persistent_relocs@forked-* - timeout - GEM_BUG_ON(i915_vma_is_active(vma)) (added 17 minutes ago)
Comment 10 Lakshmi 2019-10-22 06:42:29 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_390/fi-kbl-8809g/igt@gem_persistent_relocs@forked-thrashing.html
Starting subtest: forked-thrashing
(gem_persistent_relocs:2234) CRITICAL: Test assertion failure function do_test, file ../tests/i915/gem_persistent_relocs.c:259:
(gem_persistent_relocs:2234) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_persistent_relocs:2234) CRITICAL: mismatch in buffer 9: 0x00000000 instead of 0xdeadbeef at offset 128
(gem_persistent_relocs:2238) CRITICAL: Test assertion failure function do_test, file ../tests/i915/gem_persistent_relocs.c:259:
(gem_persistent_relocs:2238) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_persistent_relocs:2238) CRITICAL: mismatch in buffer 11: 0x00000000 instead of 0xdeadbeef at offset 0
Subtest forked-thrashing failed.
No log.
Subtest forked-thrashing: FAIL (68.214s)
gem_persistent_relocs: ../lib/igt_core.c:1774: igt_exit: Assertion `waitpid(-1, &tmp, WNOHANG) == -1 && errno == ECHILD' failed.
Received signal SIGABRT.
Stack trace: 
 #0 [fatal_sig_handler+0xd6]
 #1 [killpg+0x40]
 #2 [gsignal+0xc7]
 #3 [abort+0x141]
 #4 [uselocale+0x33a]
 #5 [__assert_fail+0x42]
 #6 [igt_exit+0x19e]
 #7 [main+0x2c]
 #8 [__libc_start_main+0xe7]
 #9 [_start+0x2a]

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_389/fi-icl-dsi/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
Starting subtest: forked-interruptible-thrashing
(gem_persistent_relocs:1114) CRITICAL: Test assertion failure function do_test, file ../tests/i915/gem_persistent_relocs.c:259:
(gem_persistent_relocs:1114) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_persistent_relocs:1114) CRITICAL: mismatch in buffer 4: 0x00000000 instead of 0xdeadbeef at offset 128
Subtest forked-interruptible-thrashing failed.
No log.
Subtest forked-interruptible-thrashing: FAIL (27.878s)
gem_persistent_relocs: ../lib/igt_core.c:1774: igt_exit: Assertion `waitpid(-1, &tmp, WNOHANG) == -1 && errno == ECHILD' failed.
Received signal SIGABRT.
Stack trace: 
 #0 [fatal_sig_handler+0xd6]
 #1 [killpg+0x40]
 #2 [gsignal+0xc7]
 #3 [abort+0x141]
 #4 [uselocale+0x33a]
 #5 [__assert_fail+0x42]
 #6 [igt_exit+0x19e]
 #7 [main+0x2c]
 #8 [__libc_start_main+0xe7]
 #9 [_start+0x2a]

Still happening; these are the latest failures.
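
For readers unfamiliar with the secondary abort in these logs: the condition `waitpid(-1, &tmp, WNOHANG) == -1 && errno == ECHILD` holds only when the process has no child processes left at all. The sketch below shows the same check in Python as an illustration; the helper name is hypothetical and this is not how IGT itself is written.

```python
import os


def all_children_reaped():
    """Python analogue of: waitpid(-1, &tmp, WNOHANG) == -1 && errno == ECHILD."""
    try:
        os.waitpid(-1, os.WNOHANG)
    except ChildProcessError:
        # errno == ECHILD: this process has no children left.
        return True
    # The call succeeded, so a child exists: either still running, or a
    # zombie that this very call has just reaped.
    return False


if __name__ == "__main__":
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # A child exists at this point, so the check does not hold.
    print(all_children_reaped())
    try:
        os.waitpid(pid, 0)  # make sure the child is collected
    except ChildProcessError:
        pass  # it was already reaped by the non-blocking call above
```

In the logs above the check fails inside igt_exit after the subtest has already failed, which suggests that forked test children were still around at exit time.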
Comment 11 CI Bug Log 2019-10-22 06:58:38 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SNB HSW KBL ICL: igt@gem_persistent_relocs@forked(-interruptible)?-thrashing* - fail -  Failed assertion: test == 0xdeadbeef,  mismatch in buffer \d+ -}
{+ All machines: igt@gem_persistent_relocs@forked(-interruptible)?-thrashing* - fail -  Failed assertion: test == 0xdeadbeef,  mismatch in buffer \d+ +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_388/fi-skl-guc/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_388/fi-skl-lmem/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_389/fi-bdw-5557u/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_389/fi-cfl-8109u/igt@gem_persistent_relocs@forked-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_389/fi-snb-2520m/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_390/fi-cml-s/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_390/fi-cml-u2/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_390/fi-skl-6260u/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_390/fi-skl-iommu/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5235/shard-tglb6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
Comment 12 CI Bug Log 2019-10-25 14:51:32 UTC
A CI Bug Log filter associated to this bug has been updated:

{- All machines: igt@gem_persistent_relocs@forked(-interruptible)?-thrashing* - fail -  Failed assertion: test == 0xdeadbeef,  mismatch in buffer \d+ -}
{+ All machines: igt@gem_persistent_relocs@forked(-interruptible)?-thrashing* - fail -  Failed assertion: test == 0xdeadbeef,  mismatch in buffer \d+ +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7168/shard-hsw5/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
Comment 13 Lakshmi 2019-11-04 12:27:45 UTC
Still happening regularly
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_396/fi-kbl-soraka/igt@gem_persistent_relocs@forked-thrash-inactive.html
Starting subtest: forked-thrash-inactive
(gem_persistent_relocs:1047) CRITICAL: Test assertion failure function do_test, file ../tests/i915/gem_persistent_relocs.c:259:
(gem_persistent_relocs:1047) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_persistent_relocs:1047) CRITICAL: mismatch in buffer 3: 0x00000000 instead of 0xdeadbeef at offset 64
Subtest forked-thrash-inactive failed.
No log.
Subtest forked-thrash-inactive: FAIL (23.016s)
gem_persistent_relocs: ../lib/igt_core.c:1774: igt_exit: Assertion `waitpid(-1, &tmp, WNOHANG) == -1 && errno == ECHILD' failed.
Received signal SIGABRT.
Stack trace: 
 #0 [fatal_sig_handler+0xd6]
 #1 [killpg+0x40]
 #2 [gsignal+0xc7]
 #3 [abort+0x141]
 #4 [uselocale+0x33a]
 #5 [__assert_fail+0x42]
 #6 [igt_exit+0x19e]
 #7 [main+0x2c]
 #8 [__libc_start_main+0xe7]
 #9 [_start+0x2a]

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7242/shard-snb5/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_396/fi-icl-dsi/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
Comment 14 CI Bug Log 2019-11-04 12:52:52 UTC
A CI Bug Log filter associated to this bug has been updated:

{- All machines: igt@gem_persistent_relocs@forked(-interruptible)?-thrashing* - fail -  Failed assertion: test == 0xdeadbeef,  mismatch in buffer \d+ -}
{+ All machines: igt@gem_persistent_relocs@forked(-interruptible)?-thrashing* - fail -  Failed assertion: test == 0xdeadbeef,  mismatch in buffer \d+ +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_396/fi-kbl-soraka/igt@gem_persistent_relocs@forked-thrash-inactive.html
Comment 15 Francesco Balestrieri 2019-11-11 10:37:13 UTC
Reproduction rate is quite high and affects all platforms - setting importance accordingly.
Comment 16 Martin Peres 2019-11-29 19:41:36 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/520.

