Summary: | [PNV Bisected]igt/gem_exec_big fails | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | DRI | Reporter: | lu hua <huax.lu> | ||||||||
Component: | DRM/Intel | Assignee: | Chris Wilson <chris> | ||||||||
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> | ||||||||
Severity: | normal | ||||||||||
Priority: | high | CC: | intel-gfx-bugs | ||||||||
Version: | unspecified | ||||||||||
Hardware: | All | ||||||||||
OS: | Linux (All) | ||||||||||
Whiteboard: | |||||||||||
i915 platform: | i915 features: | ||||||||||
Attachments: |
|
Description
lu hua
2015-01-08 05:20:55 UTC
So far it looks like it has uncovered a nasty coherency issue between pwrite + set-to-domain + relocate + pread. Forcing pwrite to use GTT, makes it work, as does removing the user set-to-domain. It also seems to be timing dependent (but it fails quite reproducibly), adding printk also makes it work. Throwing i915_gem_chipset_flush() around is not sufficient. But... diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index c7f4048..831cef2 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -332,6 +332,8 @@ relocate_entry_gtt(struct drm_i915_gem_object *obj, if (ret) return ret; + mb(); + /* Map the page containing the relocation we're going to perform. */ offset = i915_gem_obj_ggtt_offset(obj); offset += reloc->offset; @@ -355,6 +357,7 @@ relocate_entry_gtt(struct drm_i915_gem_object *obj, io_mapping_unmap_atomic(reloc_page); + mb(); return 0; } crazily enough does. And notably diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index c7f4048..3da285a 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -880,7 +880,7 @@ vmas_move_to_rq(struct eb_vmas *eb, i915_gem_chipset_flush(rq->i915); if (flush_domains & I915_GEM_DOMAIN_GTT) - wmb(); + mb(); /* 2: invalidate the caches from this ring after emitting semaphores */ ret = i915_request_emit_flush(rq, I915_INVALIDATE_CACHES); does not. Hmm. The magic so far: diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index c7f4048..e5a6d52 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -332,6 +332,8 @@ relocate_entry_gtt(struct drm_i915_gem_object *obj, if (ret) return ret; + mb(); + /* Map the page containing the relocation we're going to perform. */ offset = i915_gem_obj_ggtt_offset(obj); offset += reloc->offset; Making the mb() unconditional at the end of set_to_gtt_domain() is ineffective. So, something must be pertubing the system between the user set-to-domain and the no-op set-to-domain during the reloc??? My hypothesis is that we need a mb() to serialise updating the GTT and accessing the PTE via the CPU. diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem index bdfd44c..c8932b3 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1444,6 +1444,7 @@ static int i915_ggtt_bind_vma(struct i915_vma *vma, BUG_ON(!i915_is_ggtt(vma->vm)); intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags); + wmb(); vma->bound = GLOBAL_BIND; vma->vm->dirty = true; is sufficient, but intel_gtt_insert_sg_entries() already ends with a readl()! Replacing the readl() with wmb() in intel_gtt_insert_sg_entries() works. I just need to explain why the readl() itself doesn't (either the compiler optimises it away or the CPU does). Created attachment 111983 [details] [review] s/readl/wmb/ Oh, and while I remember. The bug is clearly independent of 43566dedd, but because of the mb() inside set-to-gtt-domain we conveniently masked this bug i.e. the impact of the bug is non-existent, though it is probably the real reason why I had to add the mb() there in the first place. That's for a latter patch. Oh, indeedy.... commit 63256ec5347fb2344a42adbae732b90603c92f35 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Jan 4 18:42:07 2011 +0000 drm/i915: Enforce write ordering through the GTT We need to ensure that writes through the GTT land before any modification to the MMIO registers and so must impose a mandatory write barrier when flushing the GTT domain. This was revealed by relaxing the write ordering by experimentally mapping the registers and the GATT as write-combining. commit d0a57789d5ec807fc218151b2fb2de4da30fbef5 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Oct 9 19:24:37 2012 +0100 drm/i915: Only insert the mb() before updating the fence parameter With a fence, we only need to insert a memory barrier around the actual fence alteration for CPU accesses through the GTT. Performing the barrier in flush-fence was inserting unnecessary and expensive barriers for never fenced objects. Note removing the barriers from flush-fence, which was effectively a barrier before every direct access through the GTT, revealed that we where missing a barrier before the first access through the GTT. Lack of that barrier was sufficient to cause GPU hangs. Bingo. Disassembling intel_gtt_insert_sg_entries() shows that the readl() is not optimised away, so the issue appears to be that it is simply not an adequate barrier. Seems strange. Created attachment 112001 [details] [review] Change readl(addr) Even more fun. (In reply to Chris Wilson from comment #9) > Created attachment 112001 [details] [review] [review] > Change readl(addr) > > Even more fun. Test this patch, it still fail. output: IGT-Version: 1.9-g1dddf32 (i686) (Linux: 3.19.0-rc4_kcloud_d574c4_20150112+ i686) Test assertion failure function gem_execbuf, file ioctl_wrappers.c:437: Failed assertion: ret == 0 Last errno: 28, No space left on device #df -h /dev/sda2 145G 88G 50G 64% / (In reply to lu hua from comment #10) > (In reply to Chris Wilson from comment #9) > > Created attachment 112001 [details] [review] [review] [review] > > Change readl(addr) > > > > Even more fun. > > Test this patch, it still fail. > output: > IGT-Version: 1.9-g1dddf32 (i686) (Linux: 3.19.0-rc4_kcloud_d574c4_20150112+ > i686) > Test assertion failure function gem_execbuf, file ioctl_wrappers.c:437: > Failed assertion: ret == 0 > Last errno: 28, No space left on device > > #df -h > /dev/sda2 145G 88G 50G 64% / Technically that's a success. The original bug is fixed and now you hit another bug exposed by the new test. If you retest with IGT-Version: 1.9-g25cf055 to confirm the fix then file a new bug for the failure. Test igt g25cf055 and the latest igt g67eb20c. [root@x-pnv2 tests]# ./gem_exec_big IGT-Version: 1.9-g25cf055 (i686) (Linux: 3.19.0-rc4_drm-intel-nightly_823e71_20150113+ i686) Test assertion failure function exec, file gem_exec_big.c:100: Failed assertion: tmp == gem_reloc[0].presumed_offset error: 0 != 4173824 IGT-Version: 1.9-g67eb20c (i686) (Linux: 3.19.0-rc4_drm-intel-nightly_823e71_20150113+ i686) Test assertion failure function exec, file gem_exec_big.c:99: Failed assertion: tmp == gem_reloc[0].presumed_offset error: 0 != 385024 But that was with a different kernel. Did it include the patch? (In reply to Chris Wilson from comment #13) > But that was with a different kernel. Did it include the patch? Sorry, didn't apply the patch. Test this patch and igt commit 25cf055, it works well.I will report a new bug to track comment 10's fail. output: IGT-Version: 1.9-g25cf055 (i686) (Linux: 3.19.0-rc4_kcloud_d574c4_20150113+ i686) > Test this patch and igt commit 25cf055, it works well.I will report a new > bug to track comment 10's fail. bug 88392 commit 983d308cb8f602d1920a8c40196eb2ab6cc07bd2 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 26 10:47:10 2015 +0000 agp/intel: Serialise after GTT updates (In reply to Chris Wilson from comment #16) > commit 983d308cb8f602d1920a8c40196eb2ab6cc07bd2 Author: Chris Wilson > <chris@chris-wilson.co.uk> Date: Mon Jan 26 10:47:10 2015 +0000 > agp/intel: Serialise after GTT updates Case fail. ./gem_exec_big IGT-Version: 1.9-g9846e7f (i686) (Linux: 3.19.0-rc5_kcloud_983d30_20150211+ i686) (gem_exec_big:3569) ioctl-wrappers-CRITICAL: Test assertion failure function gem_execbuf, file ioctl_wrappers.c:437: (gem_exec_big:3569) ioctl-wrappers-CRITICAL: Failed assertion: ret == 0 (gem_exec_big:3569) ioctl-wrappers-CRITICAL: Last errno: 28, No space left on device FAIL (2.549s) [root@x-pnv2 tests]# uname -a Linux x-pnv2 3.19.0-rc5_kcloud_983d30_20150211+ #84 SMP Wed Feb 11 10:09:16 EST 2015 i686 i686 i386 GNU/Linux (In reply to Ding Heng from comment #17) > (In reply to Chris Wilson from comment #16) > > commit 983d308cb8f602d1920a8c40196eb2ab6cc07bd2 > Author: Chris Wilson > > <chris@chris-wilson.co.uk> > Date: Mon Jan 26 10:47:10 2015 +0000 > > > > agp/intel: Serialise after GTT updates > > Case fail. > ./gem_exec_big > IGT-Version: 1.9-g9846e7f (i686) (Linux: 3.19.0-rc5_kcloud_983d30_20150211+ > i686) > (gem_exec_big:3569) ioctl-wrappers-CRITICAL: Test assertion failure function > gem_execbuf, file ioctl_wrappers.c:437: > (gem_exec_big:3569) ioctl-wrappers-CRITICAL: Failed assertion: ret == 0 > (gem_exec_big:3569) ioctl-wrappers-CRITICAL: Last errno: 28, No space left > on device > FAIL (2.549s) > [root@x-pnv2 tests]# uname -a > Linux x-pnv2 3.19.0-rc5_kcloud_983d30_20150211+ #84 SMP Wed Feb 11 10:09:16 > EST 2015 i686 i686 i386 GNU/Linux That is the other gem_exec_big bug. Closing old verified. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.