Created attachment 99379 [details]
dmesg

==System Environment==
--------------------------
Regression: Yes.
Good commit on -next-queued(b7c0d9df97c10ec5693a838df2fd53058f8e9e96)

Non-working platforms: BDW

==kernel==
--------------------------
-nightly: f79ba79cf037eea9ee757ad37730b00f43d5ef80 (fails)
-queued: d3b448d9917a3d6531e499d88bfb13ea5e31e4ad (fails)
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Fri May 16 18:59:00 2014 +0100

    drm/i915: Only unpin the default ctx object if it exists

    Since commit 691e6415c891b8b2b082a120b896b443531c4d45
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Wed Apr 9 09:07:36 2014 +0100

        drm/i915: Always use kref tracking for all contexts.

    we have contexts everywhere, and so we must be careful to distinguish
    fake contexts, which do not have an associated bo, and real ones, which
    do. In particular, we now need to be careful not to dereference NULL
    pointers. This is one such example, as the commit highlighted above
    failed to move the unpinning of the default ctx object into the
    real-context-only branch.

    Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78792
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Ben Widawsky <benjamin.widawsky@intel.com>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Cc: Jani Nikula <jani.nikula@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

-fixes: e95a2f7509f5219177d6821a0a8754f93892ca56 (works)
    Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Date:   Thu May 8 15:09:19 2014 +0300

    drm/i915: Increase WM memory latency values on SNB

    On SNB the BIOS provided WM memory latency values seem insufficient to
    handle high resolution displays.

    In this particular case the display mode was a 2560x1440@60Hz, which
    makes the pixel clock 241.5 MHz. It was empirically found that a memory
    latency value if 1.2 usec is enough to avoid underruns, whereas the BIOS
    provided value of 0.7 usec was clearly too low. Incidentally 1.2 usec
    is what the typical BIOS provided values are on IVB systems.

    Increase the WM memory latency values to at least 1.2 usec on SNB.
    Hopefully this won't have a significant effect on power consumption.

    v2: Increase the latency values regardless of the pixel clock

    Cc: Robert N <crshman@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70254
    Tested-by: Robert Navarro <crshman@gmail.com>
    Tested-by: Vitaly Minko <vitaly.minko@gmail.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>

==Bug detailed description==
-----------------------------
./gem_render_linear_blits fails

Output:
./gem_render_linear_blits
IGT-Version: 1.6-gd71add5 (x86_64) (Linux: 3.15.0-rc3_drm-intel-nightly_f79ba7_20140519+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Test assertion failure function check_bo, file gem_render_linear_blits.c:79:
Last errno: 0, Success
Failed assertion: linear[i] == val
Expected 0x00000001, found 0x00040001 at offset 0x00000004

==Reproduce steps==
----------------------------
1. ./gem_render_linear_blits
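
For readers decoding the assertion message, here is a minimal Python sketch of the kind of check it implies. It is not the IGT source: it assumes, based only on the output above, that check_bo compares consecutive 32-bit words against an incrementing value. The name `first_mismatch` and the sample buffer are hypothetical and exist only for illustration.

```python
def first_mismatch(words, start_val):
    """Return (byte_offset, expected, found) for the first bad dword, or None.

    Assumption (not taken from the IGT source): word i is expected to hold
    start_val + i, which is consistent with the report's
    'Expected 0x00000001 ... at offset 0x00000004'.
    """
    for i, found in enumerate(words):
        expected = (start_val + i) & 0xFFFFFFFF
        if found != expected:
            return i * 4, expected, found
    return None


# Hypothetical buffer that reproduces the values from the report.
buf = [0x00000000, 0x00040001, 0x00000002]
offset, expected, found = first_mismatch(buf, 0)
diff = expected ^ found  # bits that differ between expected and found
print(f"offset 0x{offset:08x}: expected 0x{expected:08x}, found 0x{found:08x}")
print(f"differing bits: 0x{diff:08x}")
```

Under that assumption the low bits of the word are correct and only bit 18 (0x00040000) differs.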
Please bisect.
Bisecting was blocked by Bug 78274.
With rendercopy in here as well, this smells a lot like: https://bugs.freedesktop.org/show_bug.cgi?id=78891
Please test: http://patchwork.freedesktop.org/patch/26784/
Created attachment 100070 [details]
dmesg

(In reply to comment #4)
> Please test:
> http://patchwork.freedesktop.org/patch/26784/

The bug is still reproducible with this patch.

Output:
./gem_render_linear_blits
IGT-Version: 1.6-ge4ba3b7 (x86_64) (Linux: 3.15.0-rc3_prts_92f645_20140529 x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Test assertion failure function check_bo, file gem_render_linear_blits.c:79:
Last errno: 0, Success
Failed assertion: linear[i] == val
Expected 0x00000001, found 0x00040001 at offset 0x00000004
Ok, then please do the bisect as Chris requested.
I can't reproduce this on an E2 with the latest BIOS. Guo, can you confirm what platform you're using, while doing the bisect? Can you test the same stepping?
I just confirmed the same hard drive reproduces the bug on E0, but not on E2.
(In reply to comment #7)
> I can't reproduce this on an E2 with the latest BIOS. Guo, can you confirm
> what platform you're using, while doing the bisect? Can you test the same
> stepping?

I am using E0; the stepping is 4.
(In reply to comment #8)
> I just confirmed the same harddrive reproduces the bug on E0, but not on E2.

Yes, I tested it on E0. We do not have an E2 device.
Created attachment 100138 [details]
dmesg

(In reply to comment #6)
> Ok, then please do the bisect as Chris requested.

While bisecting, I found that the test is unable to exit on some commits. The first bad commit for this failure to exit is 78325f2d270897c9ee0887125b7abb963eb8efea:

commit 78325f2d270897c9ee0887125b7abb963eb8efea
Author:     Ben Widawsky <benjamin.widawsky@intel.com>
AuthorDate: Tue Apr 29 14:52:29 2014 -0700
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Mon May 5 10:56:53 2014 +0200

    drm/i915: Virtualize the ringbuffer signal func

    This abstraction again is in preparation for gen8. Gen8 will bring new
    semantics for doing this operation.

    While here, make the writes of MI_NOOPs explicit for non-existent rings.
    This should have been implicit before.

    NOTE: This is going to be removed in a few patches.

    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Output:
time ./gem_render_linear_blits
IGT-Version: 1.6-g532b7e6 (x86_64) (Linux: 3.15.0-rc3_drm-intel-next-queued_78325f_20140513+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
This commit cannot be reverted.
You should test with i915.enable_rc6=0.
Small update, I received another E2 platform, with the same BIOS, and can hit the bug. So I believe the platform I was running on is just impervious to the bug. Guo, try Chris' suggestion, and please double check the bisect is correct - it looks suspicious.
I finally found a platform that can reliably reproduce, and my bisect led to the more likely:

commit 229b0489aa75a8c51d2f2e124329d3ac326f326d
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed May 14 17:02:17 2014 +0300

    drm/i915: add null render states for gen6, gen7 and gen8

I am currently working on reviewing the render state in IGT. Chris, extra eyes on that state setup would be nice if you can find the time.
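
As background on why the two bisects in this thread could land on different commits, here is a minimal Python sketch of the search that a bisect performs. It is an illustration under stated assumptions, not git's implementation: `first_bad`, `history`, and `broken_since` are hypothetical names, and the commit labels are placeholders rather than real commit ids.

```python
def first_bad(commits, is_bad):
    """Binary-search an ordered commit list for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad for the *same* symptom.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid  # first bad commit is after mid
    return commits[hi]


# Hypothetical history with placeholder names, oldest first.
history = ["c0", "c1", "c2", "c3", "c4", "c5"]
broken_since = {"c3", "c4", "c5"}
print(first_bad(history, lambda c: c in broken_since))
```

The search is only meaningful if "bad" always means the same symptom. If a different failure, such as the test not exiting, is also counted as bad, the ordering assumption no longer holds and the search can converge on an unrelated commit.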
Oh, and rc6 doesn't affect this bug (for me).
(In reply to comment #13)
> You should test with i915.enable_rc6=0.

With rc6 disabled on the latest -nightly(455a8fc4304af51a913e33763b72dd2849c11d0c), the bug is still reproducible.

Output:
./gem_render_linear_blits
IGT-Version: 1.6-g532b7e6 (x86_64) (Linux: 3.15.0-rc7_drm-intel-nightly_455a8f_20140603+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Test assertion failure function check_bo, file gem_render_linear_blits.c:79:
Last errno: 0, Success
Failed assertion: linear[i] == val
Expected 0x00000001, found 0x00040001 at offset 0x00000004
Created attachment 100343 [details]
dmesg

(In reply to comment #15)
> I finally found a platform that can reliably reproduce, and my bisect lead
> to the more likely:
>
> commit 229b0489aa75a8c51d2f2e124329d3ac326f326d
> Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Date: Wed May 14 17:02:17 2014 +0300
>
> drm/i915: add null render states for gen6, gen7 and gen8
>
>
> I am currently working on reviewing the render state in IGT. Chris, extra
> eyes on that state setup would be nice if you can find the time.

I reverted this commit and retested on my device. The test passes.
Please test: IGT patch http://patchwork.freedesktop.org/patch/27088/
(In reply to comment #19)
> Please test: IGT patch http://patchwork.freedesktop.org/patch/27088/

Tested on the latest -nightly(455a8fc4304af51a913e33763b72dd2849c11d0c) using igt with this patch. The test passes.

Output:
./gem_render_linear_blits
IGT-Version: 1.6-g3c70e6a (x86_64) (Linux: 3.15.0-rc7_drm-intel-nightly_455a8f_20140603+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Cyclic blits, backward...
Random blits...
The test passes on both E0 and E2 on the latest -next-queued(e4964a6e664b4c338b5ab1f1820b0477bec68396).
Closing verified+fixed.