Bug 46844 - [uxa] assertion failure: drm_intel_gem_bo_purge_vma_cache: bo_gem->map_count == 0
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2012-03-01 12:09 UTC by Modestas Vainius
Modified: 2012-11-20 18:26 UTC

Attachments
Valgrind traces (211.61 KB, text/plain)
2012-03-17 07:46 UTC, Modestas Vainius

Description Modestas Vainius 2012-03-01 12:09:21 UTC
Hello,

Since upgrading the intel driver to 2.18.0 (from 2.17.0), I have been experiencing random X crashes. I can't tell what exactly triggers the X server crash, but it always happens soon after KDE startup. I'm using libdrm 2.4.30, though the crash is reproducible with libdrm 2.4.31 as well.

Below you will find a backtrace and lspci output. Nothing interesting gets written to Xorg.0.log. If you need more information, feel free to ask.

P.S. I'm aware that the crash is in libdrm. However, 2.17.0 used to work fine with the same libdrm, so I'm not sure where the actual bug is.

(gdb) bt
#0  0x00007f8b75312475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f8b753156f0 in *__GI_abort () at abort.c:92
#2  0x00007f8b7530b621 in *__GI___assert_fail (assertion=0x7f8b730d2f61 "bo_gem->map_count == 0", 
    file=<optimized out>, line=960, function=0x7f8b730d3160 "drm_intel_gem_bo_purge_vma_cache") at assert.c:81
#3  0x00007f8b730ceb6f in drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f8b785c3310)
    at intel_bufmgr_gem.c:960
#4  0x00007f8b730d10d3 in drm_intel_gem_bo_map_gtt (bo=0x7f8b7c93b4e0) at intel_bufmgr_gem.c:1160
#5  0x00007f8b732e79a4 in intel_uxa_pixmap_put_image (pixmap=<optimized out>, 
    src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124, x=0, y=<optimized out>, w=122, h=19)
    at ../../src/intel_uxa.c:772
#6  0x00007f8b732e96c7 in intel_uxa_put_image (pixmap=0x7f8b7c93d3c0, x=0, y=0, w=<optimized out>, h=19, 
    src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124) at ../../src/intel_uxa.c:837
#7  0x00007f8b733108a2 in uxa_picture_from_pixman_image (format=PIXMAN_a8, image=0x7f8b7c938680, 
    screen=0x7f8b785ca580) at ../../uxa/uxa-render.c:534
#8  uxa_trapezoids (op=3 '\003', src=0x7f8b7a4038d0, dst=0x7f8b7c6d99d0, maskFormat=<optimized out>, 
    xSrc=1017, ySrc=160, ntrap=<optimized out>, traps=<optimized out>) at ../../uxa/uxa-render.c:2001
#9  0x00007f8b773bbba1 in ProcRenderTrapezoids (client=0x7f8b7c616210) at ../../render/render.c:783
#10 0x00007f8b772fff81 in Dispatch () at ../../dix/dispatch.c:437
#11 0x00007f8b772ef1aa in main (argc=8, argv=<optimized out>, envp=<optimized out>) at ../../dix/main.c:287
(gdb) bt full
#0  0x00007f8b75312475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
        pid = <optimized out>
        selftid = <optimized out>
#1  0x00007f8b753156f0 in *__GI_abort () at abort.c:92
        act = {__sigaction_handler = {sa_handler = 0x7f8b730d2f61, sa_sigaction = 0x7f8b730d2f61}, sa_mask = {
            __val = {140236944484584, 140734302203024, 960, 140734302203264, 140236943548550, 206158430232, 
              140734302203280, 140734302203056, 140236943447112, 206158430256, 140734302203304, 
              140237067225312, 263376, 4480357486596943714, 7959390389040865645, 140734302210458}}, 
          sa_flags = 1967294783, sa_restorer = 0x7f8b730d2f4e}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007f8b7530b621 in *__GI___assert_fail (assertion=0x7f8b730d2f61 "bo_gem->map_count == 0", 
    file=<optimized out>, line=960, function=0x7f8b730d3160 "drm_intel_gem_bo_purge_vma_cache") at assert.c:81
        buf = 0x7f8b7c9388e0 "X: intel_bufmgr_gem.c:960: drm_intel_gem_bo_purge_vma_cache: Assertion `bo_gem->map_count == 0' failed.\n"
#3  0x00007f8b730ceb6f in drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f8b785c3310)
    at intel_bufmgr_gem.c:960
        bo_gem = 0x7f8b785c38f8
        limit = 508
        __FUNCTION__ = "drm_intel_gem_bo_purge_vma_cache"
        __PRETTY_FUNCTION__ = "drm_intel_gem_bo_purge_vma_cache"
#4  0x00007f8b730d10d3 in drm_intel_gem_bo_map_gtt (bo=0x7f8b7c93b4e0) at intel_bufmgr_gem.c:1160
        bufmgr_gem = 0x7f8b785c3310
        bo_gem = 0x7f8b7c93b4e0
        set_domain = {handle = 2019125584, read_domains = 32651, write_domain = 1108815584}
        ret = <optimized out>
        __PRETTY_FUNCTION__ = "drm_intel_gem_bo_map_gtt"
#5  0x00007f8b732e79a4 in intel_uxa_pixmap_put_image (pixmap=<optimized out>, 
    src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124, x=0, y=<optimized out>, w=122, h=19)
    at ../../src/intel_uxa.c:772
        priv = 0x7f8b7c93a560
        stride = 128
        cpp = <optimized out>
        ret = 0
#6  0x00007f8b732e96c7 in intel_uxa_put_image (pixmap=0x7f8b7c93d3c0, x=0, y=0, w=<optimized out>, h=19, 
    src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124) at ../../src/intel_uxa.c:837
        intel = 0x3206
        tiling = 1983298042
        size = <optimized out>
        stride = 32651
        bo = <optimized out>
        screen = <optimized out>
        priv = 0x7f8b7c93a560
#7  0x00007f8b733108a2 in uxa_picture_from_pixman_image (format=PIXMAN_a8, image=0x7f8b7c938680, 
    screen=0x7f8b785ca580) at ../../uxa/uxa-render.c:534
        uxa_screen = 0x7f8b732e9420
        picture = 0x7f8b7c938100
        pixmap = <optimized out>
        width = 122
        height = 19
#8  uxa_trapezoids (op=3 '\003', src=0x7f8b7a4038d0, dst=0x7f8b7c6d99d0, maskFormat=<optimized out>, 
    xSrc=1017, ySrc=160, ntrap=<optimized out>, traps=<optimized out>) at ../../uxa/uxa-render.c:2001
        scratch = 0x0
        yDst = 160
        xRel = <optimized out>
        width = 122
        height = 19
        mask = <optimized out>
        yRel = <optimized out>
        xDst = 1017
        image = 0x7f8b7c938680
        format = PIXMAN_a8
        screen = 0x7f8b785ca580
        uxa_screen = 0x0
        bounds = {x1 = 1017, y1 = 160, x2 = 1139, y2 = 179}
        direct = <optimized out>
#9  0x00007f8b773bbba1 in ProcRenderTrapezoids (client=0x7f8b7c616210) at ../../render/render.c:783
        rc = <optimized out>
        ntraps = <optimized out>
        pSrc = 0x7f8b7a4038d0
        pDst = 0x7f8b7c6d99d0
        pFormat = 0x7f8b7a120f28
        stuff = 0x7f8b7a375e04
#10 0x00007f8b772fff81 in Dispatch () at ../../dix/dispatch.c:437
        clientReady = 0x7f8b7a34b7c0
        result = <optimized out>
        client = 0x7f8b7c616210
        nready = 0
        icheck = 0x7f8b776b4ad0
        start_tick = 400
#11 0x00007f8b772ef1aa in main (argc=8, argv=<optimized out>, envp=<optimized out>) at ../../dix/main.c:287
        i = <optimized out>
        alwaysCheckForInput = {0, 1}
(gdb)

00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 12) (prog-if 00 [VGA controller])
        Subsystem: Giga-byte Technology Device d000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 48
        Region 0: Memory at fb400000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at ff00 [size=8]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee0f00c  Data: 41a1
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a4] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: i915
Comment 1 Modestas Vainius 2012-03-17 01:54:33 UTC
Hello,

How can I help solve this issue? Would bisecting to find the commit that introduced the crash help?
Comment 2 Chris Wilson 2012-03-17 02:34:32 UTC
No, I've searched through the code for a logical inconsistency and the assert looks valid. Other than finding the path that leads to this error, valgrind might be our best hope.
Comment 3 Modestas Vainius 2012-03-17 07:45:25 UTC
That's what I came up with. Weirdly enough, valgrind errors are not always reproducible even if X crashes.

linux 3.2 / libdrm 2.4.31 / intel driver 2.18.0

Hope it helps because 2.18.0 is basically unusable for me...
Comment 4 Modestas Vainius 2012-03-17 07:46:07 UTC
Created attachment 58606
Valgrind traces
Comment 5 Chris Wilson 2012-03-17 12:03:55 UTC
If it's that bad, you can always switch to SNA. Stable and faster, not a bad combination :-p

Thanks for the traces, looks like we have a double free.
Comment 6 Chris Wilson 2012-03-17 12:11:58 UTC
It also looks like you can prevent the crash using:

Section "Device"
  Driver "intel"
  Option "BufferCache" "False"
EndSection
Comment 7 Chris Wilson 2012-03-17 12:59:17 UTC
Also can you try

diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
index 3c91090..c6ab51e 100644
--- a/intel/intel_bufmgr_gem.c
+++ b/intel/intel_bufmgr_gem.c
@@ -907,6 +907,8 @@ drm_intel_gem_bo_free(drm_intel_bo *bo)
        struct drm_gem_close close;
        int ret;
 
+       assert(atomic_read(&bo_gem->refcount) == 0);
+
        DRMLISTDEL(&bo_gem->vma_list);
        if (bo_gem->mem_virtual) {
                VG(VALGRIND_FREELIKE_BLOCK(bo_gem->mem_virtual, 0));

And see if we hit that assertion.
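
As a rough illustration of what the added assertion is checking, here is a minimal sketch using a toy reference-counted object. The names `toy_bo`, `toy_bo_free` and `toy_bo_unreference` are hypothetical and are not libdrm's structures or functions; the sketch returns an error code where the patch above would abort via `assert()`, and it only assumes that the free path should never run while a holder remains.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted buffer object.
 * This is NOT libdrm's drm_intel_bo_gem. */
struct toy_bo {
    int refcount;   /* number of live holders */
    int freed;      /* set once the free path has run */
};

/* Free path with the sanity check made explicit: returns 0 on success,
 * -1 if called while holders remain or after the object was already freed.
 * The patch in this comment uses assert() instead, which aborts at the
 * first bad call and so points at the offending caller in a backtrace. */
static int toy_bo_free(struct toy_bo *bo)
{
    if (bo->refcount != 0 || bo->freed)
        return -1;
    bo->freed = 1;
    return 0;
}

/* Drop one reference; run the free path when the last holder lets go. */
static int toy_bo_unreference(struct toy_bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0)
        return toy_bo_free(bo);
    return 0;
}
```

With two holders, calling `toy_bo_free()` directly is rejected, two calls to `toy_bo_unreference()` free the object exactly once, and a later `toy_bo_free()` is rejected again as a double free.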
Comment 8 Modestas Vainius 2012-03-18 01:50:35 UTC
I have no idea what changed, but X no longer crashes on startup with the same configuration. However, after running this non-crashing X for some 14 hours, I started seeing display anomalies such as flickering and the screen not fully redrawing, but no crashes. So clearly there is still something wrong; the bug just manifests itself in a different way...

Anyway, right before X stopped crashing (without an xorg.conf file), I tried the xorg.conf you suggested. Every time I tried, X ended up in an infinite loop (see below)...

I will keep trying to reproduce the crash though.

(gdb) bt
#0  0x00007f287cdc0a9a in drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f28830d3860)
    at intel_bufmgr_gem.c:960
#1  0x00007f287cdc30d3 in drm_intel_gem_bo_map_gtt (bo=0x7f28854edbe0) at intel_bufmgr_gem.c:1160
#2  0x00007f287cfd99a4 in intel_uxa_pixmap_put_image (pixmap=<optimized out>, src=0x7f288546b240 "", 
    src_pitch=8, x=0, y=<optimized out>, w=8, h=8) at ../../src/intel_uxa.c:772
#3  0x00007f287cfdb6c7 in intel_uxa_put_image (pixmap=0x7f2885366c30, x=0, y=0, w=<optimized out>, h=8, 
    src=0x7f288546b240 "", src_pitch=8) at ../../src/intel_uxa.c:837
#4  0x00007f287d0028a2 in uxa_picture_from_pixman_image (format=PIXMAN_a8, image=0x7f28854edac0, 
    screen=0x7f28830dab80) at ../../uxa/uxa-render.c:534
#5  uxa_trapezoids (op=8 '\b', src=0x7f288539f0d0, dst=0x7f28854e7c10, maskFormat=<optimized out>, xSrc=7, 
    ySrc=3, ntrap=<optimized out>, traps=<optimized out>) at ../../uxa/uxa-render.c:2001
#6  0x00007f28810aebf1 in ProcRenderTrapezoids (client=0x7f288521a2c0) at ../../render/render.c:783
#7  0x00007f2880ff2f81 in Dispatch () at ../../dix/dispatch.c:439
#8  0x00007f2880fe21aa in main (argc=8, argv=<optimized out>, envp=<optimized out>) at ../../dix/main.c:287
(gdb) display bufmgr_gem->vma_count
1: bufmgr_gem->vma_count = 509
(gdb) display limit
2: limit = 508
(gdb) n
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
968                             bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
954             if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
957             while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
960                     bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
968                             bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
954             if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
957             while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
960                     bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
968                             bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
954             if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
957             while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
960                     bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
968                             bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
954             if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
957             while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
960                     bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
961                                           bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) 
963                     assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) finish
Run till exit from #0  drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f28830d3860) at intel_bufmgr_gem.c:961
Comment 9 Chris Wilson 2012-03-18 08:56:57 UTC
Another impossible condition. If the infinite loop is easy to reproduce, can you try it under valgrind? As we reduce the number of pieces of code holding onto references, we might be able to see where the error occurs.
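
The gdb trace in comment 8 shows the purge loop re-testing `bufmgr_gem->vma_count > limit` with the count stuck at 509 against a limit of 508. Below is a minimal sketch of how a loop of this shape can fail to terminate, assuming a toy cache (the names `toy_entry`, `toy_cache` and `toy_purge` are hypothetical and not libdrm's): if the counter says the cache is over the limit but the entries being visited have no mapping left to release, the counter never drops. The sketch adds a guard that returns -1 in that case instead of spinning; it is not a proposed fix for libdrm.

```c
#include <assert.h>

/* Hypothetical toy model of a mapping cache; not libdrm's data structures. */
struct toy_entry {
    int map_count;   /* active users of the mapping, must be 0 when cached */
    int has_mapping; /* 1 if a mapping is still held for this entry */
};

struct toy_cache {
    struct toy_entry *entries; /* idle entries, oldest first */
    int n_entries;             /* entries actually on the list */
    int vma_count;             /* bookkeeping counter of cached mappings */
};

/* Evict idle mappings until vma_count <= limit.
 * Returns the number of mappings released, or -1 if the counter and the
 * list disagree (counter over the limit, nothing left to evict). Without
 * that guard, a bare while (vma_count > limit) loop would never exit. */
static int toy_purge(struct toy_cache *c, int limit)
{
    int released = 0;
    int next = 0;

    if (limit < 0)
        limit = 0;

    while (c->vma_count > limit) {
        if (next >= c->n_entries)
            return -1;               /* inconsistent bookkeeping */
        struct toy_entry *e = &c->entries[next++];
        assert(e->map_count == 0);   /* cached entries must be idle */
        if (e->has_mapping) {
            e->has_mapping = 0;      /* stands in for munmap() */
            c->vma_count--;
            released++;
        }
    }
    return released;
}
```

With three cached mappings and a limit of 1, the sketch releases two and stops. With a counter of 509, a limit of 508 and a single entry that holds no mapping, it reports -1 and leaves the counter at 509, the same numbers that never change in the trace.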
Comment 10 Modestas Vainius 2012-03-18 13:52:29 UTC
Nope, unfortunately, I can no longer reproduce the infinite loop either... sigh, I really don't understand what happened.

What I found out is that I get display distortions once an OpenGL screensaver starts while kwin (KDE window manager) compositing is on. Stopping and resuming compositing cures the problem. I ran this screensaver scenario under valgrind and nothing interesting popped up :/ The OpenGL screensaver causes no trouble with 2.17.0, though.

