Bug 78937 - [BDW Regression]igt/gem_render_linear_blits fails
Summary: [BDW Regression]igt/gem_render_linear_blits fails
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: highest normal
Assignee: Ben Widawsky
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 78938
Reported: 2014-05-20 05:08 UTC by Guo Jinxian
Modified: 2016-10-12 11:00 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (89.91 KB, text/plain)
2014-05-20 05:08 UTC, Guo Jinxian
dmesg (79.53 KB, text/plain)
2014-05-29 02:49 UTC, Guo Jinxian
dmesg (100.92 KB, text/plain)
2014-05-30 04:53 UTC, Guo Jinxian
dmesg (77.04 KB, text/plain)
2014-06-03 08:29 UTC, Guo Jinxian

Description Guo Jinxian 2014-05-20 05:08:47 UTC
Created attachment 99379 [details]
dmesg

==System Environment==
--------------------------
Regression: Yes. 
Good commit on -next-queued(b7c0d9df97c10ec5693a838df2fd53058f8e9e96)

Non-working platforms: BDW

==kernel==
--------------------------
-nightly: f79ba79cf037eea9ee757ad37730b00f43d5ef80 (fails)
-queued: d3b448d9917a3d6531e499d88bfb13ea5e31e4ad (fails)
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Fri May 16 18:59:00 2014 +0100

    drm/i915: Only unpin the default ctx object if it exists

    Since commit 691e6415c891b8b2b082a120b896b443531c4d45
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Wed Apr 9 09:07:36 2014 +0100

        drm/i915: Always use kref tracking for all contexts.

    we have contexts everywhere, and so we must be careful to distinguish
    fake contexts, which do not have an associated bo, and real ones, which
    do. In particular, we now need to be careful not to dereference NULL
    pointers.

    This is one such example, as the commit highlighted above failed to move
    the unpinning of the default ctx object into the real-context-only
    branch.

    Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78792
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Ben Widawsky <benjamin.widawsky@intel.com>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Cc: Jani Nikula <jani.nikula@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

-fixes: e95a2f7509f5219177d6821a0a8754f93892ca56 (works)
    Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Date:   Thu May 8 15:09:19 2014 +0300

    drm/i915: Increase WM memory latency values on SNB

    On SNB the BIOS provided WM memory latency values seem insufficient to
    handle high resolution displays.

    In this particular case the display mode was a 2560x1440@60Hz, which
    makes the pixel clock 241.5 MHz. It was empirically found that a memory
    latency value if 1.2 usec is enough to avoid underruns, whereas the BIOS
    provided value of 0.7 usec was clearly too low. Incidentally 1.2 usec
    is what the typical BIOS provided values are on IVB systems.

    Increase the WM memory latency values to at least 1.2 usec on SNB.
    Hopefully this won't have a significant effect on power consumption.

    v2: Increase the latency values regardless of the pixel clock

    Cc: Robert N <crshman@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70254
    Tested-by: Robert Navarro <crshman@gmail.com>
    Tested-by: Vitaly Minko <vitaly.minko@gmail.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>

==Bug detailed description==
-----------------------------
./gem_render_linear_blits fails

Output:
./gem_render_linear_blits
IGT-Version: 1.6-gd71add5 (x86_64) (Linux: 3.15.0-rc3_drm-intel-nightly_f79ba7_20140519+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Test assertion failure function check_bo, file gem_render_linear_blits.c:79:
Last errno: 0, Success
Failed assertion: linear[i] == val
Expected 0x00000001, found 0x00040001 at offset 0x00000004
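
The mismatch line above can be decoded mechanically. What follows is a minimal Python sketch, not part of IGT; the helper name decode_mismatch is hypothetical. It assumes the reported offset is in bytes and that the buffer holds 32-bit values, and it XORs the expected value against the found value to show which bits differ.

```python
import re

# Hypothetical helper, not part of IGT: parse one "Expected ..., found ..." line.
_PATTERN = re.compile(
    r"Expected 0x([0-9a-fA-F]+), found 0x([0-9a-fA-F]+) at offset 0x([0-9a-fA-F]+)"
)


def decode_mismatch(line):
    """Return (element_index, differing_bit_positions) for one failure line."""
    match = _PATTERN.search(line)
    if match is None:
        raise ValueError("not a mismatch line: %r" % line)
    expected, found, offset = (int(group, 16) for group in match.groups())
    diff = expected ^ found  # bits set here differ between the two values
    bits = [bit for bit in range(32) if diff & (1 << bit)]
    # Assumption: the offset is in bytes and each element is 4 bytes wide.
    return offset // 4, bits


line = "Expected 0x00000001, found 0x00040001 at offset 0x00000004"
print(decode_mismatch(line))  # (1, [18])
```

For the line reported here, the only difference is bit 18 (0x00040000) in the second 32-bit word; the low bits match the expected value.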


==Reproduce steps==
---------------------------- 
1. ./gem_render_linear_blits
Comment 1 Chris Wilson 2014-05-20 06:25:54 UTC
Please bisect.
Comment 2 Guo Jinxian 2014-05-20 08:41:42 UTC
Bisecting is blocked by Bug 78274.
Comment 3 Mika Kuoppala 2014-05-20 15:50:41 UTC
With rendercopy involved here as well, this smells a lot like:

https://bugs.freedesktop.org/show_bug.cgi?id=78891
Comment 4 Ben Widawsky 2014-05-28 23:09:44 UTC
Please test:
http://patchwork.freedesktop.org/patch/26784/
Comment 5 Guo Jinxian 2014-05-29 02:49:46 UTC
Created attachment 100070 [details]
dmesg

(In reply to comment #4)
> Please test:
> http://patchwork.freedesktop.org/patch/26784/

The bug is still reproducible with this patch.

Output:
./gem_render_linear_blits
IGT-Version: 1.6-ge4ba3b7 (x86_64) (Linux: 3.15.0-rc3_prts_92f645_20140529 x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Test assertion failure function check_bo, file gem_render_linear_blits.c:79:
Last errno: 0, Success
Failed assertion: linear[i] == val
Expected 0x00000001, found 0x00040001 at offset 0x00000004
Comment 6 Ben Widawsky 2014-05-29 05:48:52 UTC
Ok, then please do the bisect as Chris requested.
Comment 7 Ben Widawsky 2014-05-29 21:21:27 UTC
I can't reproduce this on an E2 with the latest BIOS. Guo, can you confirm what platform you're using, while doing the bisect? Can you test the same stepping?
Comment 8 Ben Widawsky 2014-05-29 23:33:12 UTC
I just confirmed the same harddrive reproduces the bug on E0, but not on E2.
Comment 9 Guo Jinxian 2014-05-30 04:48:13 UTC
(In reply to comment #7)
> I can't reproduce this on an E2 with the latest BIOS. Guo, can you confirm
> what platform you're using, while doing the bisect? Can you test the same
> stepping?

I am using E0; the stepping is 4.
Comment 10 Guo Jinxian 2014-05-30 04:49:18 UTC
(In reply to comment #8)
> I just confirmed the same harddrive reproduces the bug on E0, but not on E2.

Yes, I tested it on E0. We do not have an E2 device.
Comment 11 Guo Jinxian 2014-05-30 04:53:37 UTC
Created attachment 100138 [details]
dmesg

(In reply to comment #6)
> Ok, then please do the bisect as Chris requested.
While bisecting, I found that the test is unable to exit on some commits. The first bad commit for this failure to exit is 78325f2d270897c9ee0887125b7abb963eb8efea.

commit 78325f2d270897c9ee0887125b7abb963eb8efea
Author:     Ben Widawsky <benjamin.widawsky@intel.com>
AuthorDate: Tue Apr 29 14:52:29 2014 -0700
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Mon May 5 10:56:53 2014 +0200

    drm/i915: Virtualize the ringbuffer signal func

    This abstraction again is in preparation for gen8. Gen8 will bring new
    semantics for doing this operation.

    While here, make the writes of MI_NOOPs explicit for non-existent rings.
    This should have been implicit before.

    NOTE: This is going to be removed in a few patches.

    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


Output:
time ./gem_render_linear_blits                     
IGT-Version: 1.6-g532b7e6 (x86_64) (Linux: 3.15.0-rc3_drm-intel-next-queued_78325f_20140513+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Comment 12 Guo Jinxian 2014-05-30 04:55:37 UTC
This commit cannot be reverted.
Comment 13 Chris Wilson 2014-05-30 07:07:38 UTC
You should test with i915.enable_rc6=0.
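
Before retesting, it can help to confirm that the parameter actually took effect. The following is a minimal Python sketch under the assumption that the running kernel exposes i915 parameters under /sys/module/i915/parameters/ (the usual sysfs location for module parameters); the helper name read_module_param is hypothetical, and reading the file may require root.

```python
import os


def read_module_param(module, name, sysfs_root="/sys/module"):
    """Return the parameter's current value as a string, or None if absent.

    Assumption: parameters live at <sysfs_root>/<module>/parameters/<name>.
    A PermissionError is not caught, so a non-root caller sees it directly.
    """
    path = os.path.join(sysfs_root, module, "parameters", name)
    try:
        with open(path) as handle:
            return handle.read().strip()
    except FileNotFoundError:
        # Module not loaded, or this kernel does not expose the parameter.
        return None


# Example: print whatever the running kernel reports for i915.enable_rc6.
# print(read_module_param("i915", "enable_rc6"))
```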
Comment 14 Ben Widawsky 2014-05-30 20:46:16 UTC
Small update, I received another E2 platform, with the same BIOS, and can hit the bug. So I believe the platform I was running on is just impervious to the bug.

Guo, try Chris' suggestion, and please double check the bisect is correct - it looks suspicious.
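
One way a bisect like this can go wrong: on some intermediate commits the test never exits (comment 11), which is a different symptom from the assertion failure being tracked. The following is a minimal Python sketch, assuming the bisect is driven by `git bisect run`, which treats exit status 0 as good, 125 as "skip this commit", and other statuses from 1 to 127 as bad. The function names and the 300-second limit are hypothetical choices, not something used in this bug.

```python
import subprocess

GOOD, BAD, SKIP = 0, 1, 125  # exit statuses understood by `git bisect run`


def classify(returncode, timed_out):
    """Map the outcome of one test run to a `git bisect run` exit status."""
    if timed_out:
        # A run that never exits is not the assertion failure being
        # bisected, so skip the commit instead of marking it bad.
        return SKIP
    return GOOD if returncode == 0 else BAD


def run_test(command, timeout_seconds=300):
    """Run the test once; the 300 s limit is an arbitrary assumption."""
    try:
        result = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return classify(None, timed_out=True)
    return classify(result.returncode, timed_out=False)
```

A wrapper script passed to `git bisect run` would exit with the value of run_test(["./gem_render_linear_blits"]), so that commits where the test does not finish are skipped and only the data mismatch counts as bad.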
Comment 15 Ben Widawsky 2014-05-31 01:11:41 UTC
I finally found a platform that can reliably reproduce, and my bisect led to the more likely:

commit 229b0489aa75a8c51d2f2e124329d3ac326f326d
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed May 14 17:02:17 2014 +0300

    drm/i915: add null render states for gen6, gen7 and gen8
    

I am currently working on reviewing the render state in IGT. Chris, extra eyes on that state setup would be nice if you can find the time.
Comment 16 Ben Widawsky 2014-05-31 01:14:11 UTC
Oh, and rc6 doesn't affect this bug (for me).
Comment 17 Guo Jinxian 2014-06-03 07:47:05 UTC
(In reply to comment #13)
> You should test with i915.enable_rc6=0.

With rc6 disabled on the latest -nightly (455a8fc4304af51a913e33763b72dd2849c11d0c), this bug is still reproducible.

Output:
./gem_render_linear_blits
IGT-Version: 1.6-g532b7e6 (x86_64) (Linux: 3.15.0-rc7_drm-intel-nightly_455a8f_20140603+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Test assertion failure function check_bo, file gem_render_linear_blits.c:79:
Last errno: 0, Success
Failed assertion: linear[i] == val
Expected 0x00000001, found 0x00040001 at offset 0x00000004
Comment 18 Guo Jinxian 2014-06-03 08:29:53 UTC
Created attachment 100343 [details]
dmesg

(In reply to comment #15)
> I finally found a platform that can reliably reproduce, and my bisect lead
> to the more likely:
> 
> commit 229b0489aa75a8c51d2f2e124329d3ac326f326d
> Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Date:   Wed May 14 17:02:17 2014 +0300
> 
>     drm/i915: add null render states for gen6, gen7 and gen8
>     
> 
> I am currently working on reviewing the render state in IGT. Chris, extra
> eyes on that state setup would be nice if you can find the time.

I reverted this commit and retested on my device. The test passed.
Comment 19 Ben Widawsky 2014-06-03 21:55:43 UTC
Please test: IGT patch http://patchwork.freedesktop.org/patch/27088/
Comment 20 Guo Jinxian 2014-06-04 01:35:14 UTC
(In reply to comment #19)
> Please test: IGT patch http://patchwork.freedesktop.org/patch/27088/

Tested on the latest -nightly (455a8fc4304af51a913e33763b72dd2849c11d0c) using igt with this patch applied. The test passed.

Output:
./gem_render_linear_blits
IGT-Version: 1.6-g3c70e6a (x86_64) (Linux: 3.15.0-rc7_drm-intel-nightly_455a8f_20140603+ x86_64)
not enough RAM to run test, reducing buffer count
Verifying initialisation...
Cyclic blits, forward...
Cyclic blits, backward...
Random blits...
Comment 21 Guo Jinxian 2014-06-10 03:49:18 UTC
The test passes on both E0 and E2 on the latest -next-queued (e4964a6e664b4c338b5ab1f1820b0477bec68396).
Comment 22 Jari Tahvanainen 2016-10-12 11:00:05 UTC
Closing verified+fixed.

