| Summary: | [CI]gem_mmap_gtt/coherency failing assertion cpu[x] == i | | |
|---|---|---|---|
| Product: | DRI | Reporter: | Elio <elio.martinez.monroy> |
| Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | critical | | |
| Priority: | high | CC: | armando.antoniox.mora.reos, elio.martinez.monroy, hector.franciscox.velazquez.suriano, intel-gfx-bugs, octaviox.hernandez.lopez |
| Version: | DRI git | | |
| Hardware: | x86-64 (AMD64) | | |
| OS: | Linux (All) | | |
| Whiteboard: | ReadyForDev | | |
| i915 platform: | BSW/CHT, BXT, CNL, GLK | i915 features: | GEM/Other |
Description
Elio
2017-04-05 16:44:32 UTC
Expected, it demonstrates that there is a delay in posting writes via the GTT when compared to accessing the physical page directly.

is this the same case for gem_mmap_gtt@swap* as well?

(In reply to Elio from comment #2)
> is this the same case for gem_mmap_gtt@swap* as well?

Is a completely different class of bug, it should fail exactly like #100585.

(In reply to Chris Wilson from comment #3)
> (In reply to Elio from comment #2)
> > is this the same case for gem_mmap_gtt@swap* as well?
>
> Is a completely different class of bug, it should fail exactly like #100585.

this issue happening over GLK. is this expected ?

output
==============================
./gem_mmap_gtt --r coherency --debug
IGT-Version: 1.18-g56741ce (x86_64) (Linux: 4.11.0-rc4-drm-tip-qa-ww13-commit-5c7479a+ x86_64)
(gem_mmap_gtt:1736) drmtest-DEBUG: Test requirement passed: !(fd<0)
(gem_mmap_gtt:1736) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_mmap_gtt:1736) igt-core-DEBUG: Starting subtest: coherency
(gem_mmap_gtt:1736) DEBUG: Test requirement passed: igt_setup_clflush()
(gem_mmap_gtt:1736) CRITICAL: Test assertion failure function test_coherency, file gem_mmap_gtt.c:336:
(gem_mmap_gtt:1736) CRITICAL: Failed assertion: cpu[x] == i
(gem_mmap_gtt:1736) CRITICAL: error: 0 != 64
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [__real_main773+0x13b1]
#2 [<unknown>+0x13b1]
#3 [<unknown>+0x13b1]
#4 [<unknown>+0x13b1]
Subtest coherency failed.
**** DEBUG ****
(gem_mmap_gtt:1736) DEBUG: Test requirement passed: igt_setup_clflush()
(gem_mmap_gtt:1736) CRITICAL: Test assertion failure function test_coherency, file gem_mmap_gtt.c:336:
(gem_mmap_gtt:1736) CRITICAL: Failed assertion: cpu[x] == i
(gem_mmap_gtt:1736) CRITICAL: error: 0 != 64
**** END ****
Subtest coherency: FAIL (0.020s)

(In reply to maria guadalupe from comment #4)
> (In reply to Chris Wilson from comment #3)
> > (In reply to Elio from comment #2)
> > > is this the same case for gem_mmap_gtt@swap* as well?
> >
> > Is a completely different class of bug, it should fail exactly like #100585.
>
> this issue happening over GLK. is this expected ?

It seems to be a feature of the Atom design since Baytrail.

*** Bug 100598 has been marked as a duplicate of this bug. ***

Is there any update in this bug? If so, please could you share it? Thank you.

(In reply to elizabethx.de.la.torre.mena from comment #7)
> Is there any update in this bug? If so, please could you share it? Thank you.

The hw is failing as expected. The raison d'etre of this bug is to demonstrate the issue in hw.

The following test fail on BSW with latest configuration

====================================================
Test list
====================================================
igt@gem_mmap_gtt@coherency

====================================================
Graphic Stack
====================================================
Component: drm
tag: libdrm-2.4.81-24-g3095cc8
commit: 3095cc8eaba1aa87ad38c04ae2b1eabe30f7e16c

Component: cairo
tag: 1.15.6-2-g57b4050
commit: 57b40507dda3f58dfc8635548d606b86dc7bcf51

Component: intel-gpu-tools
tag: intel-gpu-tools-1.19-57-g6fcc8e8
commit: 6fcc8e8b247661c7950b998e0b95141ffbd6b833

Component: piglit
tag: piglit-v1
commit: c8f4fd9eeb298a2ef0855927f22634f794ef3eff

======================================
Hardware
======================================
platform : Braswell
motherboard model : 10G9000NUS
motherboard id : BRASWELL
form factor : Desktop
manufacturer : LENOVO
cpu family : Pentium
cpu family id : 6
cpu information : Intel(R) Pentium(R) CPU N3700 @ 1.60GHz
gpu card : Intel Corporation Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Integrated Graphics Controller (rev 21) (prog-if 00 [VGA controller])
memory ram : 7.68 GB
max memory ram : 8 GB
cpu thread : 4
cpu core : 4
cpu model : 76
cpu stepping : 3
socket : Socket BGA1155
signature : Type 0, Family 6, Model 76, Stepping 3
hard drive : 476GiB (512GB)
current cd clock frequency : 266667 kHz
maximum cd clock frequency : 320000 kHz
displays connected : DP-1 DP-3

Created attachment 132620 [details]
output

(In reply to Chris Wilson from comment #8)
> (In reply to elizabethx.de.la.torre.mena from comment #7)
> > Is there any update in this bug? If so, please could you share it? Thank you.
>
> The hw is failing as expected. The raison d'etre of this bug is to
> demonstrate the issue in hw.

So this should remain open until HW is changed?

This test is still failing on GLK QA Tests List:
igt@gem_mmap_gtt@coherency

====================================================
Output
====================================================
. . .
**** DEBUG ****
(gem_mmap_gtt:5993) DEBUG: Test requirement passed: igt_setup_clflush()
(gem_mmap_gtt:5993) CRITICAL: Test assertion failure function test_coherency, file gem_mmap_gtt.c:335:
(gem_mmap_gtt:5993) CRITICAL: Failed assertion: cpu[x] == i
(gem_mmap_gtt:5993) CRITICAL: error: 0 != 64
(gem_mmap_gtt:5993) igt-core-INFO: Stack trace:
(gem_mmap_gtt:5993) igt-core-INFO: #0 [__igt_fail_assert+0x101]
(gem_mmap_gtt:5993) igt-core-INFO: #1 [__real_main791+0x1410]
(gem_mmap_gtt:5993) igt-core-INFO: #2 [<unknown>+0x1410]
(gem_mmap_gtt:5993) igt-core-INFO: #3 [<unknown>+0x1410]
**** END ****
. . .
This is my configuration:

======================================
Graphic stack
======================================
Component: drm
tag: libdrm-2.4.81-56-g7c71188
commit: 7c71188610b4ceba0339c2bc884320bcb749adee

Component: cairo
tag: 1.15.6-42-gdccbed7
commit: dccbed7d78d32bd3b912e8810379451dd94e6a1f

Component: intel-gpu-tools
tag: intel-gpu-tools-1.19-332-g0a91a5e
commit: 0a91a5e9624d41d23b79e2540eda111cb56d42d9

Component: piglit
tag: piglit-v1
commit: 95e2f51a28b6cf7ff77d84e1234121c98f10ef64

======================================
Software
======================================
kernel version : 4.14.0-rc2-drm-tip-ww39-commit-d76cbbc+
hostname : GLK-2-GLKRVP1DDR405
architecture : x86_64
os version : Ubuntu 16.10
os codename : yakkety
kernel driver : i915
bios revision : 62.30
bios release date : 08/22/2017
ksc : 1.41
hardware acceleration : disabled
swap partition : enabled on (/dev/sda3)

======================================
Graphic drivers
======================================
grep: /opt/X11R7/var/log/Xorg.0.log: No such file or directory
libdrm : 2.4.83
cairo : 1.15.9
intel-gpu-tools (tag) : intel-gpu-tools-1.19-332-g0a91a5e
intel-gpu-tools (commit) : 0a91a5e

======================================
Hardware
======================================
. . .

======================================
Firmware
======================================
dmc fw loaded : yes
dmc version : 1.4
guc fw loaded : SUCCESS
guc version wanted : 10.56
guc version found : 10.56
huc fw loaded : yes

======================================
kernel parameters
======================================
quiet drm.debug=0xe pci=pcie_bus_safe i915.alpha_support=1 i915.enable_guc_loading=2 i915.enable_guc_submission=2 intel_iommu=igfx_off auto panic=1 nmi_watchdog=panic resume=/dev/sda3 fastboot

In BUG 103079 Chris Wilson claim it is the same issue as this. So, from Ci perspective I will use this bug.
At least from CI_DRM_3118 on APL-shards:

(prime_vgem:2592) CRITICAL: Test assertion failure function test_gtt_interleaved, file prime_vgem.c:273:
(prime_vgem:2592) CRITICAL: Failed assertion: gtt[1024*i] == ~i
(prime_vgem:2592) CRITICAL: error: 0 != -1
Subtest coherency-gtt failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3171/shard-apl5/igt@prime_vgem@coherency-gtt.html

CI_DRM_3253 GLK-shards fail:

(prime_vgem:2599) CRITICAL: Test assertion failure function test_gtt_interleaved, file prime_vgem.c:273:
(prime_vgem:2599) CRITICAL: Failed assertion: gtt[1024*i] == ~i
(prime_vgem:2599) CRITICAL: error: 0 != -1
Subtest coherency-gtt failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3253/shard-glkb1/igt@prime_vgem@coherency-gtt.html

Increasing the priority as it affects latest platforms.

*** Bug 103079 has been marked as a duplicate of this bug. ***

Since it's a hw issue we can't work around, marking as wontfix.

Note: We need to make sure we don't add more machines to this one, before the case is reviewed by developers.

*** Bug 104250 has been marked as a duplicate of this bug. ***

*** Bug 104002 has been marked as a duplicate of this bug. ***

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3903/fi-cnl-drrs/igt@gem_mmap_gtt@coherency.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3903/fi-cnl-y3/igt@gem_mmap_gtt@coherency.html

(gem_mmap_gtt:1546) CRITICAL: Test assertion failure function test_coherency, file gem_mmap_gtt.c:335:
(gem_mmap_gtt:1546) CRITICAL: Failed assertion: cpu[x] == i
(gem_mmap_gtt:1546) CRITICAL: error: 0 != 64

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-bxt-dsi/igt@prime_vgem@coherency-gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-glk-1/igt@prime_vgem@coherency-gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-bsw-n3050/igt@gem_mmap_gtt@coherency.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-byt-j1900/igt@gem_mmap_gtt@coherency.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-byt-n2820/igt@gem_mmap_gtt@coherency.html

*** Bug 104372 has been marked as a duplicate of this bug. ***

kernel

commit 900ccf30f9e112b508a61b228bf014e3bea14bc4 (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 20 11:19:10 2018 +0100

    drm/i915: Only force GGTT coherency w/a on required chipsets

    Not all chipsets have an internal buffer delaying the visibility of
    writes via the GGTT being visible by other physical paths, but we use a
    very heavy workaround for all. We only need to apply that workarounds to
    the chipsets we know suffer from the delay and the resulting coherency
    issue.

    Similarly, the same inconsistent coherency fouls up our ABI promise that
    a write into a mmap_gtt is immediately visible to others. Since the HW
    has made that a lie, let userspace know when that contract is broken.
    (Not that userspace would want to use mmap_gtt on those chipsets for
    other performance reasons...)

    Testcase: igt/drv_selftest/live_coherency
    Testcase: igt/gem_mmap_gtt/coherency
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180720101910.11153-1-chris@chris-wilson.co.uk

igt

commit 65cdccdc7bcbb791d791aeeeecb784a382110a3c (HEAD, upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 20 09:02:26 2018 +0100

    igt/gem_mmap_gtt: Check for known incoherency before testing

    We test map_gtt coherency (whether or not a write via the mmap_gtt is
    immediately visible in the backing storage to a read via mmap_cpu) but
    we know that several platforms are inherently incorrect and require some
    form of hammer to workaround internal delays. These platforms break our
    ABI guarantees and so we report the change in ABI via a driver getparam.
    If we know the platform doesn't meet the ABI guarantee, skip the test.
    If it is meant to work, test!

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

commit 21eb1850fa0bd0a9b729bf3708da78888433027f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Aug 1 11:47:21 2018 +0100

    drm/i95: Mark GGTT as incoherent for gen10+

    The evidence suggests that we need to start treating writes via GGTT as
    incoherent for gen10+, that is that they are internally buffered and not
    immediately visible via a read along a different physical path.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107398
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107400
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107435
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180801104721.4030-1-chris@chris-wilson.co.uk

The following platforms are also incoherent:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_101/fi-blb-e6850/igt@prime_vgem@coherency-gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_101/fi-pnv-d510/igt@prime_vgem@coherency-gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_101/fi-elk-e7500/igt@prime_vgem@coherency-gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_101/fi-bwr-2160/igt@prime_vgem@coherency-gtt.html

(prime_vgem:1363) CRITICAL: Test assertion failure function test_gtt_interleaved, file ../tests/prime_vgem.c:314:
(prime_vgem:1363) CRITICAL: Failed assertion: gtt[1024*i] == ~i
(prime_vgem:1363) CRITICAL: error: 0 != -1
Subtest coherency-gtt failed.

The other platforms have indeed been fixed/silenced.

Different test; hosts that don't show the coherency issue in the specific test for it => different bug.

(In reply to Chris Wilson from comment #29)
> Different test; hosts that don't show the coherency issue in the specific
> test for it => different bug.

For example, our previous issue was that the indirect write via the GGTT was being buffered, but in this case it's the WC writes that aren't immediately visible. The first half of the loop (writing into GTT, reading via WC) works without any sync.

Argh! To be more precise the problem on my i915gm is that I get a WB vgem mmap; so obviously it is not being flushed to system memory immediately.

(In reply to Chris Wilson from comment #29)
> Different test; hosts that don't show the coherency issue in the specific
> test for it => different bug.

Moved to https://bugs.freedesktop.org/show_bug.cgi?id=107862. Thanks for your explanation!
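
For readers following the thread, here is a toy model of the failure discussed above: a write posted through one path (standing in for the GTT mmap) is held in an internal buffer, so a read through a second path (standing in for the CPU mmap of the backing page) still sees the old value. This is only a sketch under stated assumptions. It is not driver or IGT code, and `PostedWritePath`, `first_incoherent_index`, `run_coherency_check`, and the `delay` and `reports_coherent` parameters are hypothetical names invented for illustration. The skip branch mirrors the approach described in the igt commit message (skip when the platform is known to be incoherent), not its actual implementation.

```python
from collections import deque


class PostedWritePath:
    """Toy model (hypothetical): indirect writes wait in a FIFO before reaching memory."""

    def __init__(self, size, delay):
        self.memory = [0] * size   # backing storage, as seen by the direct path
        self.pending = deque()     # writes posted via the indirect path
        self.delay = delay         # number of writes the buffer may hold back

    def write_indirect(self, index, value):
        """Post a write; it only lands once the buffer holds more than `delay` entries."""
        self.pending.append((index, value))
        while len(self.pending) > self.delay:
            i, v = self.pending.popleft()
            self.memory[i] = v

    def flush(self):
        """Drain every pending write (stand-in for a heavy workaround)."""
        while self.pending:
            i, v = self.pending.popleft()
            self.memory[i] = v

    def read_direct(self, index):
        """Read the backing storage, bypassing the write buffer."""
        return self.memory[index]


def first_incoherent_index(path, count, flush_each_write=False):
    """Write i at slot i via the indirect path, then read it back directly.

    Returns the first i whose read-back differs, or None if all match.
    """
    for i in range(1, count + 1):
        path.write_indirect(i, i)
        if flush_each_write:
            path.flush()
        if path.read_direct(i) != i:
            return i
    return None


def run_coherency_check(path, count, reports_coherent):
    """Skip when the platform is flagged incoherent; otherwise run the check."""
    if not reports_coherent:
        return "SKIP"
    return "FAIL" if first_incoherent_index(path, count) is not None else "PASS"
```

In this model a `delay` of zero behaves like a coherent platform, a non-zero `delay` reproduces the stale read unless every write is followed by a flush, and a platform flagged as incoherent is skipped rather than reported as a failure.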