Created attachment 105948
dmesg

==System Environment==
--------------------------
Regression: Yes. Bisected
Non-working platforms: PNV/BYT

==kernel==
--------------------------
origin/drm-intel-nightly: 4a3d32734bdcef6813b31f06a58430436e98711e(fails)
    drm-intel-nightly: 2014y-09m-08d-18h-33m-01s integration manifest
origin/drm-intel-next-queued: a2ca46441decdcdf4010f1db8a7041c8851327b3(fails)
    drm/i915: split intel_primary_plane_setplane() into check() and commit()
origin/drm-intel-fixes: 7a98948f3b536ca9a077e84966ddc0e9f53726df(fails)
    drm/i915: Wait for vblank before enabling the TV encoder

==Bug detailed description==
-----------------------------
Some subcases of igt/gem_persistent_relocs and igt/gem_reloc_vs_gpu time out.

Case list:
igt/gem_persistent_relocs/forked-faulting-reloc-thrash-inactive
igt/gem_persistent_relocs/forked-interruptible-faulting-reloc-thrash-inactive
igt/gem_persistent_relocs/forked-interruptible-thrash-inactive
igt/gem_persistent_relocs/forked-thrash-inactive
igt/gem_reloc_vs_gpu/forked-faulting-reloc-thrash-inactive
igt/gem_reloc_vs_gpu/forked-interruptible-faulting-reloc-thrash-inactive
igt/gem_reloc_vs_gpu/forked-interruptible-thrash-inactive
igt/gem_reloc_vs_gpu/forked-thrash-inactive

Output:
root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# time ./gem_persistent_relocs --run-subtest forked-interruptible-faulting-reloc-thrash-inactive
IGT-Version: 1.7-gac3d060 (x86_64) (Linux: 3.17.0-rc4_drm-intel-nightly_4a3d32_20140909+ x86_64)
^C^C^C ^C^C ^C ^Z
[1]+ Stopped ./gem_persistent_relocs --run-subtest forked-interruptible-faulting-reloc-thrash-inactive

real 10m53.109s
user 0m0.000s
sys 0m0.000s

==Reproduce steps==
----------------------------
1. time ./gem_persistent_relocs --run-subtest forked-interruptible-faulting-reloc-thrash-inactive

==Bisect results==
----------------------------
Bisect shows: 4ad72b7fadd285f849439cdbc408f8b847cef704 is the first bad commit

commit 4ad72b7fadd285f849439cdbc408f8b847cef704
Author:     Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Wed Sep 3 19:23:37 2014 +0100
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Thu Sep 4 09:56:07 2014 +0200

    drm/i915: Fix unsafe vma iteration in i915_drop_caches

    When unbinding, there is a possibility that we drop the active
    reference on the object, thereby freeing it. If that happens, we may
    destroy the vm link as well as the object and vma. So iterate
    carefully.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It impacts all platforms.
Object disappears during unbind. The only question is how that only just affected you.
I still have no idea how this is the first time it showed up, but

commit ab4a7b96c7c9980f306730eee7667639d6221ef2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Sep 9 11:16:08 2014 +0100

    drm/i915: Objects on the unbound list may still have an active reference

    Due to the lazy retirement semantics, even though we have unbound an
    object, it may still hold onto an active reference. So in the debug
    code, play safe.

    v2: Export i915_gem_shrink() rather than opencoding it.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

should be the last fix we ever need for drop_caches (tm)! And I think

commit ace110dfad1b9ac2c724e1c1251c0faa8a408fa1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Sep 9 07:02:43 2014 +0100

    drm/i915: Drop any active reference before unbinding

    Before we process the final unbind on an object and move it to the
    unbound list, it is semantically cleaner if there are no more active
    references to the object. (An active reference would imply that it was
    still being accessed by the GPU after it became inaccessible.) The
    caveat is that all callsites must be prepared for the object to
    disappeared during the unbind - i.e. they must hold their own
    reference.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

is the root cause.
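For readers following along, the caveat in the second commit message ("callsites must hold their own reference") can be illustrated with a small userspace sketch. This is not the i915 code: `toy_obj`, `toy_get`, `toy_put`, `toy_unbind`, `toy_new` and `toy_drop_all` are hypothetical names invented for this example, and the plain integer refcount stands in for the kernel's real reference counting. It only shows the general pattern of taking a temporary reference around an unbind that may drop the last active reference, and of not reading a stale link after the object may have been freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy refcounted object. Hypothetical; NOT the i915 structures. */
struct toy_obj {
    int refcount;          /* total references held on the object */
    int active;            /* 1 while the "GPU" holds an active reference */
    int bound;             /* 1 while the object is bound */
    struct toy_obj *next;  /* list link; the list itself holds no reference */
};

static int toy_freed;      /* number of objects freed so far */

static void toy_get(struct toy_obj *obj)
{
    obj->refcount++;
}

static void toy_put(struct toy_obj *obj)
{
    if (--obj->refcount == 0) {
        free(obj);
        toy_freed++;
    }
}

/* Unbind also drops the active reference, so it may free the object
 * unless the caller holds a reference of its own. */
static void toy_unbind(struct toy_obj *obj)
{
    obj->bound = 0;
    if (obj->active) {
        obj->active = 0;
        toy_put(obj);
    }
}

/* Allocate an object and push it on the front of the list.
 * Assumes user_ref + active >= 1. */
static struct toy_obj *toy_new(struct toy_obj **head, int user_ref, int active)
{
    struct toy_obj *obj = calloc(1, sizeof(*obj));

    assert(obj != NULL);
    obj->refcount = user_ref + active;
    obj->active = active;
    obj->bound = 1;
    obj->next = *head;
    *head = obj;
    return obj;
}

/* Unbind every object on the list. A temporary reference keeps each
 * object alive across toy_unbind(); if ours turns out to be the last
 * reference, the object is unlinked before the final put frees it.
 * Returns the number of objects unbound. */
static int toy_drop_all(struct toy_obj **head)
{
    struct toy_obj **link = head;
    int unbound = 0;

    while (*link) {
        struct toy_obj *obj = *link;

        toy_get(obj);
        toy_unbind(obj);
        unbound++;

        if (obj->refcount == 1)
            *link = obj->next;   /* about to be freed: unlink first */
        else
            link = &obj->next;   /* survives: step past it */

        toy_put(obj);
    }
    return unbound;
}
```

With a list holding one object kept alive only by its active reference and two objects that also have a user reference, `toy_drop_all()` unbinds all three, frees only the first kind, and leaves the survivors linked. Without the `toy_get()`/`toy_put()` pair, `toy_unbind()` would free that object while the loop still needed its `next` pointer.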
Verified.

root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_persistent_relocs --run-subtest forked-interruptible-faulting-reloc-thrash-inactive
IGT-Version: 1.8-g107151c (x86_64) (Linux: 3.17.0-rc4_drm-intel-nightly_99f444_20140910+ x86_64)
Subtest forked-interruptible-faulting-reloc-thrash-inactive: SUCCESS (4.306s)

root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-thrash-inactive
IGT-Version: 1.8-g107151c (x86_64) (Linux: 3.17.0-rc4_drm-intel-nightly_99f444_20140910+ x86_64)
Subtest forked-faulting-reloc-thrash-inactive: SUCCESS (4.387s)
Closing old verified+fixed.