Created attachment 100445 [details]
dmesg

==System Environment==
--------------------------
Regression: Yes.
Good commit on -next-queued: 192155025197cc4765702a180904c3b62c152b7a

Non-working platforms: BYT

==kernel==
--------------------------
origin/drm-intel-nightly: 0a37b5d366831590ebc976018d1bd812ef526a98 (fails)
    drm-intel-nightly: 2014y-06m-03d-19h-31m-28s integration manifest
origin/drm-intel-next-queued: 92d7377929140bc120f7742ee3afffcb2a827fe4 (fails)
    drm/i915: Simplify intel_gpu_reset
origin/drm-intel-fixes: d23db88c3ab233daed18709e3a24d6c95344117f (fails)
    drm/i915: Prevent negative relocation deltas from wrapping

==Bug detailed description==
-----------------------------
igt/gem_exec_big fails

Output:
./gem_exec_big
IGT-Version: 1.6-g1451df1 (x86_64) (Linux: 3.15.0-rc3_drm-intel-next-queued_06946f_20140605+ x86_64)
Test assertion failure function exec, file gem_exec_big.c:95:
Last errno: 0, Success
Failed assertion: tmp == gem_reloc[0].presumed_offset

==Reproduce steps==
----------------------------
1. ./gem_exec_big
Updated result on -fixes:

origin/drm-intel-fixes: d23db88c3ab233daed18709e3a24d6c95344117f (works)
    drm/i915: Prevent negative relocation deltas from wrapping
Want to bet this is ppgtt enabling? Please try i915.enable_ppgtt=0, otherwise please bisect.
(In reply to comment #2)
> Want to bet this is ppgtt enabling? Please try i915.enable_ppgtt=0,
> otherwise please bisect.

With ppgtt disabled (i915.enable_ppgtt=0), the test passes.
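When retesting with `i915.enable_ppgtt=0`, it can help to confirm that the override actually reached the booted kernel. The following is only a sketch: it assumes the parameter is exposed at `/sys/module/i915/parameters/enable_ppgtt` and is readable by the current user (it may be root-only). The function name `ppgtt_param` and its optional path argument are made up for illustration.

```shell
#!/bin/sh
# Print the value of the i915 enable_ppgtt module parameter, or "unknown"
# if the parameter file is missing or unreadable.
# Assumption: the parameter lives at the sysfs path below. The optional
# first argument (hypothetical, useful for testing) overrides that path.
ppgtt_param() {
    param_file="${1:-/sys/module/i915/parameters/enable_ppgtt}"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unknown"
    fi
}

echo "i915.enable_ppgtt = $(ppgtt_param)"
```

Running this before `./gem_exec_big` makes it clear in the log which configuration a given pass or fail belongs to.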
Thanks to Chris for the comment. So I assume we don't need a bisect.
Created attachment 100505 [details] [review]
clflush ptes

Please test this - I don't have a byt myself so can't check myself ...
Also I'd like to see 'lspci -n' for the affected machine. The theory is that the stepping is a factor here since it worked for me and Jesse using production stepping machines.
Created attachment 100691 [details]
dmesg with patch

(In reply to comment #5)
> Created attachment 100505 [details] [review]
> clflush ptes
>
> Please test this - I don't have a byt myself so can't check myself ...

With this patch, the test still fails.

Output:
./gem_exec_big
IGT-Version: 1.6-g18d2130 (x86_64) (Linux: 3.15.0-rc3_kcloud_10dca6_20140609+ x86_64)
Test assertion failure function exec, file gem_exec_big.c:95:
Last errno: 0, Success
Failed assertion: tmp == gem_reloc[0].presumed_offset
(In reply to comment #6)
> Also I'd like to see 'lspci -n' for the affected machine. The theory is that
> the stepping is a factor here since it worked for me and Jesse using
> production stepping machines.

lspci -n
00:00.0 0600: 8086:0f00 (rev 0a)
00:02.0 0300: 8086:0f31 (rev 0a)
00:13.0 0106: 8086:0f23 (rev 0a)
00:14.0 0c03: 8086:0f35 (rev 0a)
00:1a.0 1080: 8086:0f18 (rev 0a)
00:1b.0 0403: 8086:0f04 (rev 0a)
00:1c.0 0604: 8086:0f48 (rev 0a)
00:1c.1 0604: 8086:0f4a (rev 0a)
00:1c.2 0604: 8086:0f4c (rev 0a)
00:1c.3 0604: 8086:0f4e (rev 0a)
00:1f.0 0601: 8086:0f1c (rev 0a)
00:1f.3 0c05: 8086:0f12 (rev 0a)
01:00.0 0200: 8086:107d (rev 06)
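Since the question is about the stepping of the graphics device, the relevant line in the listing is the one with class 0300 (VGA-compatible controller). A small sketch for pulling out just that device and its revision is shown here; the function name `gpu_revision` is made up for illustration, and the parsing assumes the `lspci -n` column layout seen in this comment.

```shell
#!/bin/sh
# Print "<vendor:device> <revision>" for each VGA-class (0300) device
# found in `lspci -n` style input on stdin.
# Assumes lines of the form: "00:02.0 0300: 8086:0f31 (rev 0a)".
gpu_revision() {
    awk '$2 == "0300:" {
        rev = $5
        sub(/\)$/, "", rev)   # strip the trailing ")" from "0a)"
        print $3, rev
    }'
}

# Example using two lines taken from the listing in this comment:
gpu_revision <<'EOF'
00:00.0 0600: 8086:0f00 (rev 0a)
00:02.0 0300: 8086:0f31 (rev 0a)
EOF
```

On a live machine the same function can be fed directly with `lspci -n | gpu_revision`.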
Bisect result would still be good here ...
Meh, already checked for ppgtt, no need for bisect. Sorry about the noise
Ok, looks like an early stepping. We can disable PPGTT on pre-rev C and hope for better luck...
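The idea of gating PPGTT on the revision can be illustrated from the shell by comparing a PCI revision against a cutoff. This is a sketch only, not the kernel patch: the thread says "pre-rev C" but does not state which PCI revision number that corresponds to, so the cutoff value used here is a placeholder, and the function name `rev_is_before` is made up for illustration.

```shell
#!/bin/sh
# Succeed (exit status 0) if hex revision $1 is lower than hex cutoff $2.
# Both arguments are bare hex strings such as "0a".
rev_is_before() {
    [ "$((0x$1))" -lt "$((0x$2))" ]
}

# Placeholder cutoff: NOT taken from the thread or from the kernel patch.
cutoff=0c
if rev_is_before 0a "$cutoff"; then
    echo "rev 0a is below cutoff $cutoff: would disable PPGTT"
else
    echo "rev 0a is at or above cutoff $cutoff: would keep PPGTT"
fi
```

The revision `0a` is the one reported by `lspci -n` for the affected machine; the real cutoff should be taken from the patch itself.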
Please try this patch http://lists.freedesktop.org/archives/intel-gfx/2014-June/047137.html
commit 62942ed7279d3e06dc15ae3d47665eff3b373327
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Fri Jun 13 09:28:33 2014 -0700

    drm/i915/vlv: disable PPGTT on early revs v3

Do you have any production silicon byt machines around? Otherwise testing will lack coverage ...
Tested on latest -next-queued 868d665b43473e230d560d5186535270a3d57a19 (which includes patch 047137); the test passes.

Output:
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_exec_big
IGT-Version: 1.7-g8c1566e (x86_64) (Linux: 3.15.0-rc8_kcloud_868d66_20140616+ x86_64)
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# echo $?
0
I guess we can mark this fixed then.
Created attachment 101280 [details]
dmesg

The test case still fails on latest -next-queued. I am not sure whether it is the same failure.

root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_exec_big
IGT-Version: 1.7-g1b1f4b1 (x86_64) (Linux: 3.15.0-rc8_drm-intel-next-queued_27b6c1_20140618+ x86_64)
Test assertion failure function exec, file gem_exec_big.c:90:
Last errno: 0, Success
Failed assertion: tmp == gem_reloc[0].presumed_offset
error: 17043456 == -1
Please retest on latest -nightly, that should have a fix.
Created attachment 101329 [details]
dmesg

(In reply to comment #17)
> Please retest on latest -nightly, that should have a fix.

The test still fails on latest -nightly.

Output:
[root@x-hsw27 tests]# ./gem_exec_big
IGT-Version: 1.7-g1b1f4b1 (x86_64) (Linux: 3.15.0-rc8_drm-intel-nightly_fff6c5_20140618+ x86_64)
Test assertion failure function exec, file gem_exec_big.c:90:
Last errno: 0, Success
Failed assertion: tmp == gem_reloc[0].presumed_offset
error: 7974912 == -1
(In reply to comment #16)
> Created attachment 101280 [details]
> dmesg
>
> The case still fail on latest -next-queued. I am not sure if they are the
> same failure.
> root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_exec_big
> IGT-Version: 1.7-g1b1f4b1 (x86_64) (Linux:
> 3.15.0-rc8_drm-intel-next-queued_27b6c1_20140618+ x86_64)
> Test assertion failure function exec, file gem_exec_big.c:90:
> Last errno: 0, Success
> Failed assertion: tmp == gem_reloc[0].presumed_offset
> error: 17043456 == -1

Auto-bisect shows that the commit below is the first bad commit for the failure above.

commit eb36fc993d7ae1988c80ba5b767989059c91d0ec
Author:     Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Mon Jun 16 10:49:16 2014 +0100
Commit:     Chris Wilson <chris@chris-wilson.co.uk>
CommitDate: Mon Jun 16 10:51:02 2014 +0100

    igt/gem_exec_big: Update to new igt_assert_eq

    Use igt_assert_eq for better test output on failures.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
commit 236d6bd2d36114fe402fe0e85d97b14cdf102963
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Jun 19 10:13:12 2014 +0200

    tests/gem_exec_big: Re-add gem_sync

    We need this to avoid hitting the slowpath and ending up with a
    presumed_offset == -1.

    Regression reported by PRTS, bisected to

    commit eb36fc993d7ae1988c80ba5b767989059c91d0ec
    Author:     Chris Wilson <chris@chris-wilson.co.uk>
    AuthorDate: Mon Jun 16 10:49:16 2014 +0100
    Commit:     Chris Wilson <chris@chris-wilson.co.uk>
    CommitDate: Mon Jun 16 10:51:02 2014 +0100

        igt/gem_exec_big: Update to new igt_assert_eq

        Use igt_assert_eq for better test output on failures.

        Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

    v2: igt_warn_on unexpected reloc offsets.

    Cc: shuang.he@intel.com
    Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (on irc)
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.c
Verified on latest -nightly (8734408c113bb38234ed03ec51c723b3deff579b).

[root@x-bdw01 tests]# ./gem_exec_big
IGT-Version: 1.7-g4d4f4b2 (x86_64) (Linux: 3.16.0-rc5_drm-intel-nightly_873440_20140721+ x86_64)
[root@x-bdw01 tests]# echo $?
0
Created attachment 103384 [details]
dmesg

The test still fails on latest -nightly (af1aaba219fdd90ca1b30f9b8d8d19352224f170) on BYT.

root@x-bytm02:~# cd /GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests/
root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_exec_big
IGT-Version: 1.7-g70e6ed9 (x86_64) (Linux: 3.16.0-rc6_drm-intel-nightly_af1aab_20140724+ x86_64)
Test assertion failure function exec, file gem_exec_big.c:97:
Failed assertion: tmp == gem_reloc[0].presumed_offset
error: 0 == 8908800
That's not going to be the same bug.
(In reply to comment #22)
> Created attachment 103384 [details]
> dmesg
>
> Test still failed on latest
> -nightly(af1aaba219fdd90ca1b30f9b8d8d19352224f170) on BYT
> root@x-bytm02:~# cd /GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests/
> root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_exec_big
> IGT-Version: 1.7-g70e6ed9 (x86_64) (Linux:
> 3.16.0-rc6_drm-intel-nightly_af1aab_20140724+ x86_64)
> Test assertion failure function exec, file gem_exec_big.c:97:
> Failed assertion: tmp == gem_reloc[0].presumed_offset
> error: 0 == 8908800

Reported a new bug for this error (Bug 81728); closing this one.
Closing verified+fixed.