Created attachment 113031
dmesg after run forked-thrashing-hang

==System Environment==
--------------------------
Regression: Yes.
Non-working platforms: BYT

==kernel==
--------------------------
drm-intel-nightly/8b4216f91c7bf8d3459cadf9480116220bd6545e (2015-02-02)

==Bug detailed description==
-----------------------------
The system hangs a while after running any of these cases. A lot of abnormal output appears in dmesg before it hangs.

igt/gem_reloc_vs_gpu/forked-faulting-reloc-thrashing-hang
igt/gem_reloc_vs_gpu/forked-faulting-reloc-thrash-inactive-hang
igt/gem_reloc_vs_gpu/forked-thrash-inactive-hang
igt/gem_reloc_vs_gpu/forked-thrashing-hang

[root@x-bdw01 tests]# time ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-thrash-inactive-hang
IGT-Version: 1.9-g3214a27 (x86_64) (Linux: 3.19.0-rc4_drm-intel-nightly_95cce4_20150115+ x86_64)
^C^

==Reproduce steps==
----------------------------
1. ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-thrash-inactive-hang
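For anyone re-running this, a minimal sketch of a wrapper that runs the four subtests listed above under a time limit, so a stuck run is reported rather than waited on. `IGT_BIN`, `LIMIT`, and `run_hang_subtests` are illustrative names, not part of IGT, and the 600-second limit is an assumption. It relies on coreutils `timeout`, which exits with status 124 when the limit is reached.

```shell
#!/bin/sh
# Sketch only: IGT_BIN, LIMIT and run_hang_subtests are illustrative names,
# not part of IGT.
IGT_BIN="${IGT_BIN:-./gem_reloc_vs_gpu}"   # test binary, run from the IGT tests directory
LIMIT="${LIMIT:-600}"                      # assumed limit in seconds before a run counts as stuck

# The four subtests named in the report.
SUBTESTS="forked-faulting-reloc-thrashing-hang
forked-faulting-reloc-thrash-inactive-hang
forked-thrash-inactive-hang
forked-thrashing-hang"

run_hang_subtests() {
    for t in $SUBTESTS; do
        # coreutils timeout exits with 124 when the time limit is reached
        timeout "$LIMIT" "$IGT_BIN" --run-subtest "$t"
        rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "$t: TIMEOUT after ${LIMIT}s"
        else
            echo "$t: exit status $rc"
        fi
    done
}
```

Source the file and call `run_hang_subtests` as root from the tests directory. This only helps when the test process is stuck but still killable. If the whole system hangs, or the process is blocked inside the kernel and ignores signals, `timeout` cannot recover it.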
b8d24a06568368076ebd5a858a011699a97bfa42 is the first bad commit.

commit b8d24a06568368076ebd5a858a011699a97bfa42
Author:     Mika Kuoppala <mika.kuoppala@linux.intel.com>
AuthorDate: Wed Jan 28 17:03:14 2015 +0200
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Thu Jan 29 18:03:07 2015 +0100

    drm/i915: Remove nested work in gpu error handling

    Now when we declare gpu errors only through our own dedicated
    hangcheck workqueue there is no need to have a separate workqueue
    for handling the resetting and waking up the clients as the deadlock
    concerns are no more.

    The only exception is i915_debugfs::i915_set_wedged, which triggers
    error handling through process context. However as this is only used
    through test harness it is responsibility for test harness not to
    introduce hangs through both debug interface and through hangcheck
    mechanism at the same time.

    Remove gpu_error.work and let the hangcheck work do the tasks it
    used to.

    v2: Add a big warning sign into i915_debugfs::i915_set_wedged (Chris)

    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ran the gem_reloc_vs_gpu*hang cases on SNB (HNR). The test runs for more than 10 minutes without exiting, and it bisects to the same commit.

Command:
time ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-thrash-inactive-hang

Output (commit b8d24a):
IGT-Version: 1.9-g51d87b8 (x86_64) (Linux: 3.19.0-rc5_kcloud_b8d24a_20150202+ x86_64)
^C^C
real    11m17.503s
user    0m0.006s
sys     0m0.162s

Output (commit 397f6f):
IGT-Version: 1.9-g51d87b8 (x86_64) (Linux: 3.19.0-rc5_kcloud_397f6f_20150202+ x86_64)
(gem_reloc_vs_gpu:4121) CRITICAL: Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
(gem_reloc_vs_gpu:4121) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_reloc_vs_gpu:4121) CRITICAL: mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
(gem_reloc_vs_gpu:4109) CRITICAL: Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
(gem_reloc_vs_gpu:4109) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_reloc_vs_gpu:4109) CRITICAL: mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
child 12 failed with exit status 99
Subtest forked-faulting-reloc-thrash-inactive-hang: FAIL (223.664s)
real    3m44.081s
user    0m0.030s
sys     0m1.857s
*** This bug has been marked as a duplicate of bug 88928 ***
*** This bug has been marked as a duplicate of bug 88933 ***
Closing old verified.