Bug 89903 - [IVB SNA] sna_dri2_reuse_buffer:370 assertion 'kgem_bo_flink(&to_sna_from_drawable(draw)->kgem, get_private(buffer)->bo) == buffer->name' failed
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-04-04 13:20 UTC by Chris Bainbridge
Modified: 2015-04-05 16:37 UTC
0 users

See Also:
i915 platform:
i915 features:


Attachments
xorg.log.gz (1.47 MB, application/octet-stream)
2015-04-04 13:22 UTC, Chris Bainbridge
xorg.log.226a58b.gz (1.23 MB, application/octet-stream)
2015-04-04 23:29 UTC, Chris Bainbridge

Description Chris Bainbridge 2015-04-04 13:20:57 UTC
intel driver git ea545e0
IvyBridge Macbook 13

Boot the Macbook with the internal screen on eDP1 and an external monitor on DP2.
Start Gnome.
Run `xrandr --output DP2 --rotate left`.
The X server crashes, either immediately or shortly afterwards when clicking on the Gnome menubar.

Xorg log shows:

Fatal server error:
[   732.912] (EE) sna_dri2_reuse_buffer:370 assertion 'kgem_bo_flink(&to_sna_from_drawable(draw)->kgem, get_private(buffer)->bo) == buffer->name' failed

This does not seem to happen with multiple external monitors, only with the internal screen plus an external monitor.
Comment 1 Chris Bainbridge 2015-04-04 13:22:46 UTC
Created attachment 114862
xorg.log.gz
Comment 2 Chris Wilson 2015-04-04 19:12:56 UTC
Okay, I can see where the confusion sets in. You have a sw cursor, which causes rendering while the buffer names are being swapped over, so the names do not match up afterwards.

Hmm. I've added some more debug output to get to the bottom of the sw cursor issue, and whilst you capture that, I'll try to fix the confusion with the sw cursor getting in the way of the buffer exchange.

commit 108a09e3db6c9cf86bf0b4eb8574ccc22555edb2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Apr 4 20:11:24 2015 +0100

    sna: Add DBG for why we fallback to sw cursor
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=89903
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 3 Chris Wilson 2015-04-04 20:02:59 UTC
This should fix the assertion:

commit 226a58bc592d4ed305b7ad0e460f1ee2548e0ddf
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Apr 4 20:58:24 2015 +0100

    sna/dri2: Prevent the sw cursor from copyig to a buffer as we discard it
    
    During swapbuffers, the sw cursor tries to write to the old buffer.
    Ordinary this is not an issue as we are discarding it, but under
    TearFree that write causes us to instantiate the shadow buffer with a
    possible recursion into set_bo and mayhem.
    
    Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=89903
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

and maybe the other odd issue. However, I still want to dig into why you are hitting the sw cursor fallback.
Comment 4 Chris Bainbridge 2015-04-04 23:29:05 UTC
Created attachment 114872
xorg.log.226a58b.gz

Still crashes on rotate. Log attached.
Comment 5 Chris Wilson 2015-04-05 08:58:08 UTC
Found the culprit for the swcursor fallback:

commit 209d120dbf9b32d3b96a0f857e4f658f6a554c02
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Apr 5 09:56:08 2015 +0100

    sna: Initialise hwcursor to true
    
    So we don't disable the hwcursor when we have a rotation set but not all
    pipes activated.
    
    Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=89903
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Now onto the more scary assertion.
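
For readers following along, here is a minimal sketch of one way the initial value of such a flag can matter. It is an assumption about the general shape of the problem, not the driver's actual code: `struct pipe`, `cursor_capable`, and `can_use_hwcursor` are hypothetical names. If the decision is an AND over the active pipes only, the flag has to start as true, otherwise skipping inactive pipes leaves it false and the hw cursor is disabled for no reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pipe state; not the driver's real structures. */
struct pipe {
    bool active;         /* pipe currently drives an output */
    bool cursor_capable; /* hw cursor usable with this pipe's rotation */
};

/* Decide whether the hw cursor can be used across all active pipes.
 * Inactive pipes are skipped, so the flag must start as true (the
 * identity for AND); starting from false would force the sw cursor
 * whenever any pipe is skipped or none vetoes explicitly. */
static bool can_use_hwcursor(const struct pipe *pipes, size_t n)
{
    bool hwcursor = true;

    assert(pipes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!pipes[i].active)
            continue; /* inactive pipes have no say */
        hwcursor = hwcursor && pipes[i].cursor_capable;
    }
    return hwcursor;
}
```

With this shape, one inactive pipe alongside one active, cursor-capable pipe still yields the hw cursor, which matches the "rotation set but not all pipes activated" case described in the commit message.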
Comment 6 Chris Wilson 2015-04-05 09:16:33 UTC
(In reply to Chris Wilson from comment #3)
> This should fix the assertion:
> 
> commit 226a58bc592d4ed305b7ad0e460f1ee2548e0ddf
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Sat Apr 4 20:58:24 2015 +0100

Hmm, it did not work.
Comment 7 Chris Wilson 2015-04-05 10:01:56 UTC
Ok, I think I got it right this time:

commit 7cf670228eec058a97f6450df46a1a47cb080583
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Apr 4 20:58:24 2015 +0100

    sna/dri2: Prevent the sw cursor from copyig to a buffer as we discard it
    
    During swapbuffers, the sw cursor tries to write to the old buffer.
    Ordinary this is not an issue as we are discarding it, but under
    TearFree that write causes us to instantiate the shadow buffer with a
    possible recursion into set_bo and mayhem.
    
    v2:
    
    commit 226a58bc592d4ed305b7ad0e460f1ee2548e0ddf
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Sat Apr 4 20:58:24 2015 +0100
    
        sna/dri2: Prevent the sw cursor from copyig to a buffer as we discard it
    
    Tried to fix it by disabling SourceValidate. However, it a direct hook
    into the Damage code by miSprite that triggers the copy. Since there
    appears to be no way to intervene, we just mark that copy as internal
    and ignore it.
    
    Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=89903
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 8 Chris Bainbridge 2015-04-05 10:54:33 UTC
Ok that seems to fix the original test case but there may still be an issue. With 3 monitors connected I did:

while true; do xrandr --output HDMI1 --rotate left; sleep 1; xrandr --output HDMI1 --rotate normal; sleep 1; done

Which quickly resulted in:

Program received signal SIGSEGV, Segmentation fault.
0x00007fa21d83ffd8 in sna_crtc_disable_cursor (crtc=crtc@entry=0x7fa2247108f0, sna=<optimized out>) at sna_display.c:5220
5220            crtc->cursor->ref--;
(gdb) bt
#0  0x00007fa21d83ffd8 in sna_crtc_disable_cursor (crtc=crtc@entry=0x7fa2247108f0, sna=<optimized out>) at sna_display.c:5220
#1  0x00007fa21d845508 in sna_crtc_disable_cursor (sna=<optimized out>, crtc=0x7fa2247108f0) at sna_display.c:2496
#2  __sna_crtc_set_mode (crtc=crtc@entry=0x7fa224710a10) at sna_display.c:2498
#3  0x00007fa21d846140 in sna_crtc_set_mode_major (crtc=0x7fa224710a10, mode=0x7fff025cf6a0, rotation=<optimized out>, x=<optimized out>, y=<optimized out>) at sna_display.c:2557
#4  0x00007fa2237ef3cb in xf86CrtcSetModeTransform (crtc=crtc@entry=0x7fa224710a10, mode=mode@entry=0x7fff025cf6a0, rotation=rotation@entry=2, transform=transform@entry=0x0, x=x@entry=0, y=y@entry=0)
    at ../../../../hw/xfree86/modes/xf86Crtc.c:297
#5  0x00007fa2237f8bee in xf86RandR12CrtcSet (pScreen=0x7fa224715440, randr_crtc=0x7fa224718530, randr_mode=0x7fa22471ca70, x=0, y=0, rotation=<optimized out>, num_randr_outputs=1, randr_outputs=0x7fa224ded520)
    at ../../../../hw/xfree86/modes/xf86RandR12.c:1207
#6  0x00007fa223839ecd in RRCrtcSet (crtc=0x7fa224718530, mode=0xc01c64a3, x=39645312, y=558836711, rotation=49152, numOutputs=65536, outputs=0x7fa224ded520) at ../../randr/rrcrtc.c:574
#7  0x00007fa22383b354 in ProcRRSetCrtcConfig (client=0x7fa224e283f0) at ../../randr/rrcrtc.c:1173
#8  0x00007fa2237773f7 in Dispatch () at ../../dix/dispatch.c:432
#9  0x00007fa22377b596 in dix_main (argc=10, argv=0x7fff025cfbc8, envp=<optimized out>) at ../../dix/main.c:296
#10 0x00007fa221435b45 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x00007fa22376590e in _start ()

This could be a completely different issue, but the backtrace mentions disabling the cursor.
Comment 9 Chris Wilson 2015-04-05 14:51:02 UTC
I couldn't spot any obvious reason for a stale pointer (all the unrefs cleared the pointer). So I had to look for subtle reasons like being interrupted by a signal at just the wrong moment:

commit be0fc3ce20fda064d68f38e24717552222d1fd74
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Apr 5 15:46:27 2015 +0100

    sna: Block signals while releasing cursor under modeset
    
    Otherwise we may process a SIGIO moving the cursor away from the CRTC
    causing us to remove the cursor and then process a stale pointer inside
    the modeset after the signal is complete.
    
    Reported-by: Chris Bainbrigde <chris.bainbridge@gmail.com>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=89903
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
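
As an illustration of the technique named in the commit subject, here is a minimal sketch of blocking SIGIO around a cursor release using POSIX `sigprocmask`. It is not the driver's code: `struct cursor`, `struct crtc`, and `release_cursor_blocked` are hypothetical stand-ins, and the real fix may block signals through a different mechanism.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's cursor and CRTC state. */
struct cursor { int ref; };
struct crtc { struct cursor *cursor; };

/* Drop the CRTC's cursor reference with SIGIO blocked, so a SIGIO
 * handler cannot change crtc->cursor between the NULL check and the
 * unref. Returns 1 if a cursor was released, 0 if there was none,
 * and -1 if the signal mask could not be changed. */
static int release_cursor_blocked(struct crtc *crtc)
{
    sigset_t block, saved;
    int released = 0;

    assert(crtc != NULL);

    sigemptyset(&block);
    sigaddset(&block, SIGIO);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    /* Critical section: no SIGIO delivery until the mask is restored. */
    if (crtc->cursor != NULL) {
        crtc->cursor->ref--;
        crtc->cursor = NULL;
        released = 1;
    }

    /* Restore the previous mask; a pending SIGIO is delivered now. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return released;
}
```

Saving and restoring the previous mask, rather than unconditionally unblocking SIGIO, keeps the helper safe to call from a context where the signal was already blocked.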
Comment 10 Chris Bainbridge 2015-04-05 16:37:57 UTC
Confirming crash is fixed.

