Summary: | [BYT clflush] gem_exec_flush stress tests fail | |
---|---|---|---
Product: | DRI | Reporter: | Rami <ramix.ben.hassine>
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal | |
Priority: | medium | CC: | christophe.prigent, daniela.doras-prodan, elio.martinez.monroy, intel-gfx-bugs
Version: | unspecified | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | | |
i915 platform: | BYT | i915 features: | GEM/Other
Attachments: |
|
setup:
=============
Hardware:
Platform: Braswell M
CPU : Intel(R) Celeron N3060 1.60GHz @ 1.6 GHz (family: 6, model: 76 stepping: 4)
SoC : BSW C0
QDF : K6XC
CRB : BRASWELL RVP Fab2
Mandatory Reworks : All
Feature Reworks: F28, F32, F33, F35, F37
Optional reworks : O-01a; O-02, O-03

Software :
Linux distribution: Ubuntu 14.04 LTS 64 bits
BIOS : BRAS.X64.B084.R00.1508310642
TXE FW : 2.0.0.2073
Ksc : 1.08
kernel 4.3.0-rc7-nightly+
commit 86ba603f327626055fe1436112b3786eaaaf7fb1
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sat Oct 31 09:27:21 2015 +0100
drm-intel-nightly: 2015y-10m-31d-08h-26m-39s UTC integration manifest

cairo: (HEAD, tag: 1.14.2) 93422b3cb5e0ef8104b8194c8873124ce2f5ea2d from git://git.freedesktop.org/git/cairo
drm: (HEAD, tag: libdrm-2.4.65, tag: 2.4.65) c3496167637e35cf8a52d5e7e53a412e79d80db0 from git://git.freedesktop.org/git/mesa/drm
intel-driver: (HEAD, tag: 1.6.1, origin/v1.6-branch) 35858c69166b845c59ca32e19a3dbb0b758df209 from git://git.freedesktop.org/git/vaapi/intel-driver
libva: (HEAD, tag: libva-1.6.1, origin/v1.6-branch) 613eb962b45fbbd1526d751e88e0d8897af6c0e0 from git://git.freedesktop.org/git/vaapi/libva
mesa: (HEAD, tag: mesa-11.0.4) 31bf24703193cc23961923e01548b1acb2760a93 from git://git.freedesktop.org/git/mesa/mesa
xf86-video-intel: (HEAD, tag: 2.99.917) baec802b21387d04aebb10ac29e719a1800c5aa0 from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
xserver: (HEAD, tag: xorg-server-1.17.2) 2123f7682d522619f101b05fb75efa75dabbe371 from git://git.freedesktop.org/git/xorg/xserver

* Tools *
intel-gpu-tools: (HEAD, origin/master, origin/HEAD, master) bfea74a9f64a900bcb90f946b38746781017449f from git://git.freedesktop.org/git/xorg/app/intel-gpu-tools

Please try with noclflush on the kernel command line.
It fixes it: all subtests pass.

For posterity:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3adb163..3187e97 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -517,9 +517,15 @@ shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length,
 		return -EINVAL;

 	vaddr = kmap_atomic(page);
-	if (needs_clflush)
-		drm_clflush_virt_range(vaddr + shmem_page_offset,
-				       page_length);
+	if (needs_clflush) {
+		unsigned long start = (unsigned long)vaddr + shmem_page_offset;
+		unsigned long end = start + page_length;
+
+		start = round_down(start, 256);
+		end = round_up(end, 256);
+
+		drm_clflush_virt_range((void *)start, end - start);
+	}

Pineview is not affected, but Baytrail and Braswell are. Leaning towards the code is correct...

Another data point, I have a second testcase for which the +-4 cacheline flush is ineffective (but wbinvd remains so), gem_concurrent_blit/*partial*

I've trimmed it down to

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 6000ad7..54683f9 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -141,6 +141,8 @@ void clflush_cache_range(void *vaddr, unsigned int size)
 	for (; p < vend; p += clflush_size)
 		clflushopt(p);

+	clflushopt(vaddr);
+
 	mb();
 }
 EXPORT_SYMBOL_GPL(clflush_cache_range);

(with the proviso of converting us over to clflush_cache_range).

Created attachment 122297 [details]
bsw-dmesg-gem_partial_pwrite_pread
Still reproduced. See bsw-dmesg-gem_partial_pwrite_pread for more details.
Assigned to Rami to check code from Chris.
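For readers following the i915_gem.c experiment quoted above, the range widening it performs can be sketched in user space. This is a minimal sketch under stated assumptions, not kernel code: `struct flush_range`, `round_down_pow2`, `round_up_pow2`, and `widen_flush_range` are hypothetical names standing in for the kernel's round_down()/round_up() macros and the arguments passed to drm_clflush_virt_range(). Only the 256-byte granularity is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: the byte range that would be handed to the flush call. */
struct flush_range {
	uintptr_t start;
	uintptr_t length;
};

/* Round x down to a multiple of align; align must be a power of two. */
uintptr_t round_down_pow2(uintptr_t x, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Round x up to a multiple of align; align must be a power of two. */
uintptr_t round_up_pow2(uintptr_t x, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Widen [start, start + length) so that both ends sit on an align-byte
 * boundary, mirroring the round_down/round_up pair in the quoted patch.
 */
struct flush_range widen_flush_range(uintptr_t start, uintptr_t length,
				     uintptr_t align)
{
	struct flush_range r;
	uintptr_t end = start + length;

	r.start = round_down_pow2(start, align);
	r.length = round_up_pow2(end, align) - r.start;
	return r;
}
```

With a 256-byte granularity, a 100-byte read starting at offset 1000 would be widened to the 512 bytes starting at offset 768, so the lines on either side of the requested range are flushed as well.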
*** Bug 94547 has been marked as a duplicate of this bug. ***

*** Bug 92501 has been marked as a duplicate of this bug. ***

fwiw, I have written a test that seems to disprove the clflush theory.

(gem_concurrent_blit:714) CRITICAL: Test assertion failure function prw_cmp_bo, file gem_concurrent_all.c:112:
(gem_concurrent_blit:714) CRITICAL: Failed assertion: vaddr[i] == val
(gem_concurrent_blit:714) CRITICAL: error: 0xdeadbeef != 0xdeadbeec
Stack trace:
#0 [__igt_fail_assert+0xf1]
#1 [prw_cmp_bo+0x8b]
#2 [<unknown>+0x8b]
child 1 failed with exit status 99
Subtest 16MiB-tiny-prw-render-write-read-bcs-forked failed.
**** DEBUG ****
(gem_concurrent_blit:610) ioctl-wrappers-DEBUG: Test requirement passed: __gem_set_caching(fd, handle, caching) == 0
**** END ****
Subtest 16MiB-tiny-prw-render-write-read-bcs-forked: FAIL (0.873s)

et al.

All tests pass on BSW only if noclflush is added to the kernel command line.

This test works for me with the following configuration:

Software Configuration
================================================
Linux distribution: Ubuntu 15.10 64 bits
Kernel: drm-intel-nightly 4.6.0-rc3_d9131d6 from http://cgit.freedesktop.org/drm-intel/
commit d9131d62d18ba94fb3ca019f1156c22b5f4ce23c
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date: Fri Apr 15 14:54:26 2016 +0100
drm-intel-nightly: 2016y-04m-15d-13h-53m-44s UTC integration manifest
drm: tag libdrm-2.4.66-33-gf884af9
libdrm 2.4.67-25 cc9a53f from git://git.freedesktop.org/git/mesa/drm
mesa 11.1.2 7bcd827 from git://git.freedesktop.org/git/mesa/mesa
cairo 1.15.2 db8a7f1 from git://git.freedesktop.org/git/cairo
xorg/xserver 1.18.0-274 8437955 from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel 2.99.917-634 81029be from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
vaapi/libva 1.7.0-1 2339d10 from git://git.freedesktop.org/git/vaapi/libva
vaapi/intel-driver 1.7.0-8 2c1bec0 from git://git.freedesktop.org/git/vaapi/intel-driver
intel-gpu-tool 1.14 7bd2ac6 from http://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools.git
rendercheck master 44032a7 from http://anongit.freedesktop.org/git/xorg/app/rendercheck.git

You can also find the subtests which fail when "noclflush" is not added to the kernel command line; see the attachment "pwrite_WO_noclflush.log".

Created attachment 123157 [details]
results without "noclflush" command line
This is the result when "noclflush" is not added to the kernel command line.
I have written a new test case: igt/gem_exec_flush. Could you please run it on byt/bsw/bxt and attach the output?

The duct-tape has been merged:

commit 396f5d62d1a5fd99421855a08ffdef8edb43c76e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jul 7 09:41:12 2016 +0100

    drm: Restore double clflush on the last partial cacheline

    This effectively reverts

    commit afcd950cafea6e27b739fe7772cbbeed37d05b8b
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date: Wed Jun 10 15:58:01 2015 +0100

        drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range()

(It's not totally foolproof, but hopefully good enough.)

*** Bug 95382 has been marked as a duplicate of this bug. ***

All 12 subtests pass, so closing.

Hardware:
Acer Desktop
Motherboard: Aspire XC-704
CPU: Intel(R) Pentium(R) CPU N3700 @ 1.60GHz (Family 6, Model 76, Stepping 3)
GPU: Intel® HD Graphics - Intel Corporation Device 22b1 (rev 21)
Memory card: 1 card 4GB Hynix HMT451S6BFR8APB
HDD: Western Digital WDC WD10EZEX-21M (1TB)

Software:
Bios: R01-A2
Linux distribution: Ubuntu 16.04 64 bits
Kernel: 4.8.0-rc2 d93dacb from http://cgit.freedesktop.org/drm-intel/
commit d93dacb87b32a866cfe552081ee084698d3a4fa6
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Aug 16 18:53:05 2016 +0200
drm-intel-nightly: 2016y-08m-16d-16h-52m-49s UTC integration manifest
libdrm-2.4.70-2 b214b05 from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-11.2.2 3a9f628 from git://anongit.freedesktop.org/mesa/mesa
cairo 1.15.2 db8a7f1 from git://anongit.freedesktop.org/cairo
xorg-server-1.18.0-532 6e5bec2 from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel 2.99.697 12c14de from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
libva-1.7.0-45 b27feb9 from git://git.freedesktop.org/git/vaapi/libva
vaapi-intel-driver: 1.7.0-89 b53fad9 from git://git.freedesktop.org/git/vaapi/intel-driver
Intel-Gpu-Tools 1.15 a147ef2 from http://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools.git

The clflush issue is still there as far as the stress tests are concerned.

Removing BSW from the i915 platform field since all tests pass.

drm-intel-qa 4.11.0-rc1 e060007 BSW (BSW101)
igt@gem_partial_pwrite_pread@reads Pass
igt@gem_partial_pwrite_pread@reads-display Pass
igt@gem_partial_pwrite_pread@reads-snoop Pass
igt@gem_partial_pwrite_pread@reads-uncached Pass
igt@gem_partial_pwrite_pread@write Pass
igt@gem_partial_pwrite_pread@write-display Pass
igt@gem_partial_pwrite_pread@write-snoop Pass
igt@gem_partial_pwrite_pread@write-uncached Pass
igt@gem_partial_pwrite_pread@writes-after-reads Pass
igt@gem_partial_pwrite_pread@writes-after-reads-display Pass
igt@gem_partial_pwrite_pread@writes-after-reads-snoop Pass
igt@gem_partial_pwrite_pread@writes-after-reads-uncached Pass

Chris, if there is another set of tests (stress tests for the clflush issue) that are not passing on BSW, let's create another bug for those.

For BYT these tests are still FAIL.
4.11.0-rc3 a4d4230 BYT (BYT3)
igt@gem_partial_pwrite_pread@reads Fail
igt@gem_partial_pwrite_pread@reads-display Fail
igt@gem_partial_pwrite_pread@reads-snoop Fail
igt@gem_partial_pwrite_pread@reads-uncached Fail
igt@gem_partial_pwrite_pread@write Fail
igt@gem_partial_pwrite_pread@write-display Fail
igt@gem_partial_pwrite_pread@write-snoop Fail
igt@gem_partial_pwrite_pread@write-uncached Fail
igt@gem_partial_pwrite_pread@writes-after-reads Fail
igt@gem_partial_pwrite_pread@writes-after-reads-display Fail
igt@gem_partial_pwrite_pread@writes-after-reads-snoop Fail
igt@gem_partial_pwrite_pread@writes-after-reads-uncached Fail

It's this bug, just remove the workarounds to uncover it again.

(In reply to Jari Tahvanainen from comment #19)
> For BYT these tests are still FAIL.
> 4.11.0-rc3 a4d4230 BYT (BYT3)
> igt@gem_partial_pwrite_pread@reads Fail
> igt@gem_partial_pwrite_pread@reads-display Fail
> igt@gem_partial_pwrite_pread@reads-snoop Fail
> igt@gem_partial_pwrite_pread@reads-uncached Fail
> igt@gem_partial_pwrite_pread@write Fail
> igt@gem_partial_pwrite_pread@write-display Fail
> igt@gem_partial_pwrite_pread@write-snoop Fail
> igt@gem_partial_pwrite_pread@write-uncached Fail
> igt@gem_partial_pwrite_pread@writes-after-reads Fail
> igt@gem_partial_pwrite_pread@writes-after-reads-display Fail
> igt@gem_partial_pwrite_pread@writes-after-reads-snoop Fail
> igt@gem_partial_pwrite_pread@writes-after-reads-uncached Fail

That, I suspect, is a bug on your end.

(In reply to Chris Wilson from comment #20)
> It's this bug, just remove the workarounds to uncover it again.

Workarounds are not used on BSW. I will re-check BYT ... and provide logs.

Does this help to figure out what is the problem in BYT? For some reason dmesg is not available for this particular run ...

Results for igt@gem_partial_pwrite_pread@reads
IGT-Version: 1.18-g0aef486 (x86_64) (Linux: 4.11.0-rc3-tip-201703201600+ x86_64)
checking partial reads
Stack trace:
#0 [__igt_fail_assert+0xf1]
#1 [intel_batchbuffer_flush_on_ring+0xc5]
#2 [do_tests+0x1ac]
#3 [__real_main250+0x120]
#4 [main+0x35]
#5 [__libc_start_main+0xf0]
#6 [_start+0x29]
#7 [<unknown>+0x29]
Subtest reads: FAIL (0.186s)
(gem_partial_pwrite_pread:17828) intel-batchbuffer-CRITICAL: Test assertion failure function intel_batchbuffer_flush_on_ring, file intel_batchbuffer.c:184:
(gem_partial_pwrite_pread:17828) intel-batchbuffer-CRITICAL: Failed assertion: (drm_intel_gem_bo_context_exec(batch->bo, ctx, used, ring)) == 0
(gem_partial_pwrite_pread:17828) intel-batchbuffer-CRITICAL: Last errno: 5, Input/output error
Subtest reads failed.
**** DEBUG ****
(gem_partial_pwrite_pread:17828) INFO: checking partial reads
(gem_partial_pwrite_pread:17828) intel-batchbuffer-CRITICAL: Test assertion failure function intel_batchbuffer_flush_on_ring, file intel_batchbuffer.c:184:
(gem_partial_pwrite_pread:17828) intel-batchbuffer-CRITICAL: Failed assertion: (drm_intel_gem_bo_context_exec(batch->bo, ctx, used, ring)) == 0
(gem_partial_pwrite_pread:17828) intel-batchbuffer-CRITICAL: Last errno: 5, Input/output error
**** END ****

(In reply to Jari Tahvanainen from comment #23)
> Does this help to figure out what is the problem in BYT? For some reason
> dmesg is not available for this particular run ...
>

Good afternoon Jari,
Is there any update on this case?
Thank you.

The following tests pass on BYT with the latest configuration:
======================================
igt@gem_partial_pwrite_pread@reads
igt@gem_partial_pwrite_pread@reads-display
igt@gem_partial_pwrite_pread@reads-snoop
igt@gem_partial_pwrite_pread@reads-uncached
igt@gem_partial_pwrite_pread@write
igt@gem_partial_pwrite_pread@write-display
igt@gem_partial_pwrite_pread@write-snoop
igt@gem_partial_pwrite_pread@write-uncached
igt@gem_partial_pwrite_pread@writes-after-reads
igt@gem_partial_pwrite_pread@writes-after-reads-display
igt@gem_partial_pwrite_pread@writes-after-reads-snoop
igt@gem_partial_pwrite_pread@writes-after-reads-uncached
=====================================

Using the following configuration:

======================================
Software
======================================
kernel version : 4.13.0-rc4-drm-tip-ww32-commit-3d87f89+
hostname : BYT-1
architecture : x86_64
os version : Ubuntu 16.10
os codename : yakkety
kernel driver : i915
bios revision : 5.6
bios release date : 03/03/2017
hardware acceleration : disabled
swap partition : enabled on (/dev/sda3)

======================================
Graphic drivers
======================================
grep: /opt/X11R7/var/log/Xorg.0.log: No such file or directory
libdrm : 2.4.70
libva : 1.7.1-2
vaapi (intel-driver) : 1.7.1
cairo : 1.14.6-1build1
intel-gpu-tools : 1.16-1

======================================
Hardware
======================================
motherboard model : .................................
motherboard id : DN2820FYK
form factor : Desktop
manufacturer : .................................
cpu family : Celeron
cpu family id : 6
cpu information : Intel(R) Celeron(R) CPU N2830 @ 2.16GHz
gpu card : Intel Corporation Atom Processor Z36xxx/Z37xxx Series Graphics & Display (rev 0e) (prog-if 00 [VGA controller])
memory ram : 7.66 GB
max memory ram : 8 GB
cpu thread : 2
cpu core : 2
cpu model : 55
cpu stepping : 8
socket : <OUT OF SPEC>
signature : Type 0, Family 6, Model 55, Stepping 8
hard drive : 111GiB (120GB)
current cd clock frequency : 266667 kHz
maximum cd clock frequency : 400000 kHz
displays connected : HDMI-A-1

Adding log of results and kern.log.

Created attachment 133412 [details]
Log.log
Created attachment 133413 [details]
kern.log
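The merged commit discussed in this thread is titled "drm: Restore double clflush on the last partial cacheline". The idea behind that title can be sketched as below. This is a user-space sketch under assumptions, not the kernel's drm_clflush_virt_range(): it executes no clflush instruction and only records which cacheline base addresses would be flushed. `plan_flushes` and `CACHELINE` are hypothetical names, the 64-byte line size is an assumption, and flushing the line that holds the final byte a second time is my reading of the commit title rather than a copy of the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64u /* assumed cacheline size in bytes */

/*
 * Record, in order, the cacheline base addresses that a flush of
 * [addr, addr + length) would touch, followed by one repeated entry for
 * the line containing the last byte. Returns the number of entries
 * written to out; out_cap is the capacity of out.
 */
size_t plan_flushes(uintptr_t addr, size_t length,
		    uintptr_t *out, size_t out_cap)
{
	uintptr_t end = addr + length;
	uintptr_t p = addr & ~(uintptr_t)(CACHELINE - 1);
	size_t n = 0;

	assert(length > 0);

	/* One flush per cacheline overlapping the range. */
	for (; p < end; p += CACHELINE) {
		assert(n < out_cap);
		out[n++] = p;
	}

	/* The "double clflush": flush the line holding the last byte again. */
	assert(n < out_cap);
	out[n++] = (end - 1) & ~(uintptr_t)(CACHELINE - 1);

	return n;
}
```

For a 100-byte range starting at address 100, this records the lines at 64, 128, and 192, then 192 once more, so the last, partially covered line appears twice.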
|
Created attachment 119440 [details]
dmesg

Reproduce steps:
================
./gem_partial_pwrite_pread

Actual Results:
IGT-Version: 1.12-gbfea74a (x86_64) (Linux: 4.3.0-rc7-nightly+ x86_64)
checking partial reads
partial reads test: 86%Test assertion failure function test_partial_reads, file gem_partial_pwrite_pread.c:123:
Failed assertion: tmp[j] == val
mismatch at 4542, got: 89, expected: 94
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [do_tests+0x62c]
#2 [_start+0x0]
Subtest reads failed.
**** DEBUG ****
checking partial reads
Test assertion failure function test_partial_reads, file gem_partial_pwrite_pread.c:123:
Failed assertion: tmp[j] == val
mismatch at 4542, got: 89, expected: 94
**** END ****
Subtest reads: FAIL (0.356s)
checking partial writes
partial writes test: 100%
Subtest write: SUCCESS (8.016s)
checking partial writes after partial reads
partial read/writes test: 4%Test assertion failure function test_partial_read_writes, file gem_partial_pwrite_pread.c:196:
Failed assertion: tmp[j] == val
Last errno: 22, Invalid argument
mismatch in read at 9006, got: 44, expected: 45
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [do_tests+0x5fd]
#2 [_start+0x0]
Subtest writes-after-reads failed.
**** DEBUG ****
checking partial writes after partial reads
Test assertion failure function test_partial_read_writes, file gem_partial_pwrite_pread.c:196:
Failed assertion: tmp[j] == val
Last errno: 22, Invalid argument
mismatch in read at 9006, got: 44, expected: 45
**** END ****
Subtest writes-after-reads: FAIL (0.364s)
checking partial reads
partial reads test: 0%Test assertion failure function test_partial_reads, file gem_partial_pwrite_pread.c:123:
Failed assertion: tmp[j] == val
mismatch at 2324, got: 8, expected: 9
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [do_tests+0x62c]
#2 [_start+0x0]
Subtest reads-uncached failed.
**** DEBUG ****
checking partial reads
Test assertion failure function test_partial_reads, file gem_partial_pwrite_pread.c:123:
Failed assertion: tmp[j] == val
mismatch at 2324, got: 8, expected: 9
**** END ****
Subtest reads-uncached: FAIL (0.004s)
checking partial writes
partial writes test: 100%
Subtest write-uncached: SUCCESS (8.016s)
checking partial writes after partial reads
partial read/writes test: 9%Test assertion failure function test_partial_read_writes, file gem_partial_pwrite_pread.c:196:
Failed assertion: tmp[j] == val
mismatch in read at 1635, got: 91, expected: 93
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [do_tests+0x5fd]
#2 [_start+0x0]
Subtest writes-after-reads-uncached failed.
**** DEBUG ****
checking partial writes after partial reads
Test assertion failure function test_partial_read_writes, file gem_partial_pwrite_pread.c:196:
Failed assertion: tmp[j] == val
mismatch in read at 1635, got: 91, expected: 93
**** END ****
Subtest writes-after-reads-uncached: FAIL (0.752s)
checking partial reads
partial reads test: 100%
Subtest reads-snoop: SUCCESS (0.100s)
checking partial writes
partial writes test: 100%
Subtest write-snoop: SUCCESS (8.008s)
checking partial writes after partial reads
partial read/writes test: 100%
Subtest writes-after-reads-snoop: SUCCESS (8.040s)
checking partial reads
partial reads test: 13%Test assertion failure function test_partial_reads, file gem_partial_pwrite_pread.c:123:
Failed assertion: tmp[j] == val
mismatch at 4170, got: 134, expected: 137
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [do_tests+0x62c]
#2 [_start+0x0]
Subtest reads-display failed.
**** DEBUG ****
checking partial reads
Test assertion failure function test_partial_reads, file gem_partial_pwrite_pread.c:123:
Failed assertion: tmp[j] == val
mismatch at 4170, got: 134, expected: 137
**** END ****
Subtest reads-display: FAIL (0.016s)
checking partial writes
partial writes test: 100%
Subtest write-display: SUCCESS (8.016s)
checking partial writes after partial reads
partial read/writes test: 3%Test assertion failure function test_partial_read_writes, file gem_partial_pwrite_pread.c:196:
Failed assertion: tmp[j] == val
mismatch in read at 3526, got: 29, expected: 36
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [do_tests+0x5fd]
#2 [_start+0x0]
Subtest writes-after-reads-display failed.
**** DEBUG ****
checking partial writes after partial reads
Test assertion failure function test_partial_read_writes, file gem_partial_pwrite_pread.c:196:
Failed assertion: tmp[j] == val
mismatch in read at 3526, got: 29, expected: 36
**** END ****
Subtest writes-after-reads-display: FAIL (0.288s)