Hello,

since upgrading to 2.18.0 (from 2.17.0), I have been experiencing random X crashes. I can't tell what exactly triggers the X server crash, but it always happens pretty soon after KDE startup. I'm using libdrm 2.4.30, though the crash is reproducible with libdrm 2.4.31 as well.

Below you will find a backtrace and lspci output. Nothing interesting gets written to Xorg.0.log. If you need more information, feel free to ask.

P.S. I'm aware that the crash is in libdrm. However, 2.17.0 used to work fine with the same libdrm, so I'm not sure where the actual bug is.

(gdb) bt
#0  0x00007f8b75312475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f8b753156f0 in *__GI_abort () at abort.c:92
#2  0x00007f8b7530b621 in *__GI___assert_fail (assertion=0x7f8b730d2f61 "bo_gem->map_count == 0", file=<optimized out>, line=960, function=0x7f8b730d3160 "drm_intel_gem_bo_purge_vma_cache") at assert.c:81
#3  0x00007f8b730ceb6f in drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f8b785c3310) at intel_bufmgr_gem.c:960
#4  0x00007f8b730d10d3 in drm_intel_gem_bo_map_gtt (bo=0x7f8b7c93b4e0) at intel_bufmgr_gem.c:1160
#5  0x00007f8b732e79a4 in intel_uxa_pixmap_put_image (pixmap=<optimized out>, src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124, x=0, y=<optimized out>, w=122, h=19) at ../../src/intel_uxa.c:772
#6  0x00007f8b732e96c7 in intel_uxa_put_image (pixmap=0x7f8b7c93d3c0, x=0, y=0, w=<optimized out>, h=19, src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124) at ../../src/intel_uxa.c:837
#7  0x00007f8b733108a2 in uxa_picture_from_pixman_image (format=PIXMAN_a8, image=0x7f8b7c938680, screen=0x7f8b785ca580) at ../../uxa/uxa-render.c:534
#8  uxa_trapezoids (op=3 '\003', src=0x7f8b7a4038d0, dst=0x7f8b7c6d99d0, maskFormat=<optimized out>, xSrc=1017, ySrc=160, ntrap=<optimized out>, traps=<optimized out>) at ../../uxa/uxa-render.c:2001
#9  0x00007f8b773bbba1 in ProcRenderTrapezoids (client=0x7f8b7c616210) at ../../render/render.c:783
#10 0x00007f8b772fff81 in Dispatch () at ../../dix/dispatch.c:437
#11 0x00007f8b772ef1aa in main (argc=8, argv=<optimized out>, envp=<optimized out>) at ../../dix/main.c:287

(gdb) bt full
#0  0x00007f8b75312475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
        pid = <optimized out>
        selftid = <optimized out>
#1  0x00007f8b753156f0 in *__GI_abort () at abort.c:92
        act = {__sigaction_handler = {sa_handler = 0x7f8b730d2f61, sa_sigaction = 0x7f8b730d2f61}, sa_mask = {
            __val = {140236944484584, 140734302203024, 960, 140734302203264, 140236943548550, 206158430232,
              140734302203280, 140734302203056, 140236943447112, 206158430256, 140734302203304, 140237067225312,
              263376, 4480357486596943714, 7959390389040865645, 140734302210458}}, sa_flags = 1967294783,
          sa_restorer = 0x7f8b730d2f4e}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007f8b7530b621 in *__GI___assert_fail (assertion=0x7f8b730d2f61 "bo_gem->map_count == 0", file=<optimized out>, line=960, function=0x7f8b730d3160 "drm_intel_gem_bo_purge_vma_cache") at assert.c:81
        buf = 0x7f8b7c9388e0 "X: intel_bufmgr_gem.c:960: drm_intel_gem_bo_purge_vma_cache: Assertion `bo_gem->map_count == 0' failed.\n"
#3  0x00007f8b730ceb6f in drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f8b785c3310) at intel_bufmgr_gem.c:960
        bo_gem = 0x7f8b785c38f8
        limit = 508
        __FUNCTION__ = "drm_intel_gem_bo_purge_vma_cache"
        __PRETTY_FUNCTION__ = "drm_intel_gem_bo_purge_vma_cache"
---Type <return> to continue, or q <return> to quit---
#4  0x00007f8b730d10d3 in drm_intel_gem_bo_map_gtt (bo=0x7f8b7c93b4e0) at intel_bufmgr_gem.c:1160
        bufmgr_gem = 0x7f8b785c3310
        bo_gem = 0x7f8b7c93b4e0
        set_domain = {handle = 2019125584, read_domains = 32651, write_domain = 1108815584}
        ret = <optimized out>
        __PRETTY_FUNCTION__ = "drm_intel_gem_bo_map_gtt"
#5  0x00007f8b732e79a4 in intel_uxa_pixmap_put_image (pixmap=<optimized out>, src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124, x=0, y=<optimized out>, w=122, h=19) at ../../src/intel_uxa.c:772
        priv = 0x7f8b7c93a560
        stride = 128
        cpp = <optimized out>
        ret = 0
#6  0x00007f8b732e96c7 in intel_uxa_put_image (pixmap=0x7f8b7c93d3c0, x=0, y=0, w=<optimized out>, h=19, src=0x7f8b7c93d450 '"' <repeats 122 times>, src_pitch=124) at ../../src/intel_uxa.c:837
        intel = 0x3206
        tiling = 1983298042
        size = <optimized out>
        stride = 32651
        bo = <optimized out>
        screen = <optimized out>
        priv = 0x7f8b7c93a560
---Type <return> to continue, or q <return> to quit---
#7  0x00007f8b733108a2 in uxa_picture_from_pixman_image (format=PIXMAN_a8, image=0x7f8b7c938680, screen=0x7f8b785ca580) at ../../uxa/uxa-render.c:534
        uxa_screen = 0x7f8b732e9420
        picture = 0x7f8b7c938100
        pixmap = <optimized out>
        width = 122
        height = 19
#8  uxa_trapezoids (op=3 '\003', src=0x7f8b7a4038d0, dst=0x7f8b7c6d99d0, maskFormat=<optimized out>, xSrc=1017, ySrc=160, ntrap=<optimized out>, traps=<optimized out>) at ../../uxa/uxa-render.c:2001
        scratch = 0x0
        yDst = 160
        xRel = <optimized out>
        width = 122
        height = 19
        mask = <optimized out>
        yRel = <optimized out>
        xDst = 1017
        image = 0x7f8b7c938680
        format = PIXMAN_a8
        screen = 0x7f8b785ca580
        uxa_screen = 0x0
        bounds = {x1 = 1017, y1 = 160, x2 = 1139, y2 = 179}
---Type <return> to continue, or q <return> to quit---
        direct = <optimized out>
#9  0x00007f8b773bbba1 in ProcRenderTrapezoids (client=0x7f8b7c616210) at ../../render/render.c:783
        rc = <optimized out>
        ntraps = <optimized out>
        pSrc = 0x7f8b7a4038d0
        pDst = 0x7f8b7c6d99d0
        pFormat = 0x7f8b7a120f28
        stuff = 0x7f8b7a375e04
#10 0x00007f8b772fff81 in Dispatch () at ../../dix/dispatch.c:437
        clientReady = 0x7f8b7a34b7c0
        result = <optimized out>
        client = 0x7f8b7c616210
        nready = 0
        icheck = 0x7f8b776b4ad0
        start_tick = 400
#11 0x00007f8b772ef1aa in main (argc=8, argv=<optimized out>, envp=<optimized out>) at ../../dix/main.c:287
        i = <optimized out>
        alwaysCheckForInput = {0, 1}
(gdb)

00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 12) (prog-if 00 [VGA controller])
        Subsystem: Giga-byte Technology Device d000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 48
        Region 0: Memory at fb400000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at ff00 [size=8]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee0f00c  Data: 41a1
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a4] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: i915
Hello, how can I help in solving this issue? Would it help if I bisected to find the commit that introduced the crash?
No, I've searched through the code for a logical inconsistency and the assert looks valid. Other than finding the path that leads to this error, valgrind might be our best hope.
That's what I came up with. Weirdly enough, valgrind errors are not always reproducible even if X crashes. linux 3.2 / libdrm 2.4.31 / intel driver 2.18.0 Hope it helps because 2.18.0 is basically unusable for me...
Created attachment 58606: Valgrind traces
If it's that bad, you can always switch to SNA. Stable and faster, not a bad combination :-p Thanks for the traces, looks like we have a double free.
Also, it looks like you can prevent the crash using

Section "Device"
    Driver "intel"
    Option "BufferCache" "False"
EndSection
Also can you try

diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
index 3c91090..c6ab51e 100644
--- a/intel/intel_bufmgr_gem.c
+++ b/intel/intel_bufmgr_gem.c
@@ -907,6 +907,8 @@ drm_intel_gem_bo_free(drm_intel_bo *bo)
 	struct drm_gem_close close;
 	int ret;
 
+	assert(atomic_read(&bo_gem->refcount) == 0);
+
 	DRMLISTDEL(&bo_gem->vma_list);
 	if (bo_gem->mem_virtual) {
 		VG(VALGRIND_FREELIKE_BLOCK(bo_gem->mem_virtual, 0));

And see if we hit that assertion.
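For readers following along, the idea behind that debugging patch can be sketched outside libdrm. The snippet below is only an illustration under assumptions: `toy_bo`, `toy_bo_reference`, `toy_bo_unreference`, `toy_bo_free` and `toy_violations` are hypothetical names, not libdrm API, and a violation is counted where the proposed patch would `assert()`, so the sketch can be exercised without aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not the libdrm structure. */
struct toy_bo {
    int refcount;   /* number of live references */
    bool released;  /* set once the object has been torn down */
};

/* Counts invariant violations instead of aborting (assumption for the sketch). */
static int toy_violations = 0;

/* Tear the object down. By the time we get here nobody may still hold a
 * reference, and we must not have been here before. */
static void toy_bo_free(struct toy_bo *bo)
{
    assert(bo != NULL);
    if (bo->refcount != 0 || bo->released) {
        toy_violations++;   /* the proposed patch would assert() at this point */
        return;
    }
    bo->released = true;
}

static void toy_bo_reference(struct toy_bo *bo)
{
    bo->refcount++;
}

/* Drop one reference; free the object when the count reaches zero or below. */
static void toy_bo_unreference(struct toy_bo *bo)
{
    if (--bo->refcount <= 0)
        toy_bo_free(bo);
}
```

With a starting refcount of 1, one extra `toy_bo_unreference()` call drives the count to -1 and reaches `toy_bo_free()` a second time, which is the kind of double free the check is meant to surface.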
I have no idea what changed, but X no longer crashes on startup with the same configuration. However, after running this non-crashing X for some 14 hours, I have started seeing display anomalies such as flickering and the screen not fully redrawing, but no crashes. So clearly there is still something wrong, but the bug just manifests itself in a different way...

Anyway, right before X stopped crashing (without an xorg.conf file), I tried the xorg.conf config file you suggested. Every time I tried, X ended up in an infinite loop (see below)... I will keep trying to reproduce the crash though.

(gdb) bt
#0  0x00007f287cdc0a9a in drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f28830d3860) at intel_bufmgr_gem.c:960
#1  0x00007f287cdc30d3 in drm_intel_gem_bo_map_gtt (bo=0x7f28854edbe0) at intel_bufmgr_gem.c:1160
#2  0x00007f287cfd99a4 in intel_uxa_pixmap_put_image (pixmap=<optimized out>, src=0x7f288546b240 "", src_pitch=8, x=0, y=<optimized out>, w=8, h=8) at ../../src/intel_uxa.c:772
#3  0x00007f287cfdb6c7 in intel_uxa_put_image (pixmap=0x7f2885366c30, x=0, y=0, w=<optimized out>, h=8, src=0x7f288546b240 "", src_pitch=8) at ../../src/intel_uxa.c:837
#4  0x00007f287d0028a2 in uxa_picture_from_pixman_image (format=PIXMAN_a8, image=0x7f28854edac0, screen=0x7f28830dab80) at ../../uxa/uxa-render.c:534
#5  uxa_trapezoids (op=8 '\b', src=0x7f288539f0d0, dst=0x7f28854e7c10, maskFormat=<optimized out>, xSrc=7, ySrc=3, ntrap=<optimized out>, traps=<optimized out>) at ../../uxa/uxa-render.c:2001
#6  0x00007f28810aebf1 in ProcRenderTrapezoids (client=0x7f288521a2c0) at ../../render/render.c:783
#7  0x00007f2880ff2f81 in Dispatch () at ../../dix/dispatch.c:439
#8  0x00007f2880fe21aa in main (argc=8, argv=<optimized out>, envp=<optimized out>) at ../../dix/main.c:287
(gdb) display bufmgr_gem->vma_count
1: bufmgr_gem->vma_count = 509
(gdb) display limit
2: limit = 508
(gdb) n
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
968                     bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
954     if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
957     while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
960             bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
968                     bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
954     if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
957     while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
960             bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
968                     bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
954     if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
957     while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
960             bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
968                     bo_gem->mem_virtual = NULL;
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
954     if (limit < 0)
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
957     while (bufmgr_gem->vma_count > limit) {
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
960             bo_gem = DRMLISTENTRY(drm_intel_bo_gem,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
961                             bufmgr_gem->vma_cache.next,
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb)
963             assert(bo_gem->map_count == 0);
1: bufmgr_gem->vma_count = 509
2: limit = 508
(gdb) finish
Run till exit from #0  drm_intel_gem_bo_purge_vma_cache (bufmgr_gem=0x7f28830d3860) at intel_bufmgr_gem.c:961
Another impossible condition. If the infinite loop is easy to reproduce, can you try it under valgrind? As we reduce the number of pieces of code holding onto references, we might be able to see where the error occurs.
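As a side note on why this looks impossible: a purge loop that runs while a counter exceeds a limit only behaves if the counter and the cached list agree. The sketch below is a hypothetical illustration of that invariant, not the libdrm implementation and not a diagnosis of this bug. `toy_cache`, `toy_node`, `toy_cache_init`, `toy_cache_add` and `toy_cache_purge` are made-up names, and the early return is a defensive check added for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive list node; not the libdrm list type. */
struct toy_node {
    struct toy_node *prev, *next;
};

struct toy_cache {
    struct toy_node head;  /* sentinel; an empty list points at itself */
    int count;             /* bookkeeping counter, meant to equal the list length */
};

static void toy_cache_init(struct toy_cache *c)
{
    c->head.prev = c->head.next = &c->head;
    c->count = 0;
}

/* Append a node at the tail and bump the counter. */
static void toy_cache_add(struct toy_cache *c, struct toy_node *n)
{
    n->prev = c->head.prev;
    n->next = &c->head;
    c->head.prev->next = n;
    c->head.prev = n;
    c->count++;
}

/* Evict from the front until count <= limit.
 * Returns the number of evicted nodes, or -1 if the list ran out while the
 * counter still exceeded the limit, i.e. counter and list disagree. */
static int toy_cache_purge(struct toy_cache *c, int limit)
{
    int evicted = 0;

    assert(c != NULL);
    if (limit < 0)
        limit = 0;

    while (c->count > limit) {
        struct toy_node *n = c->head.next;

        if (n == &c->head)
            return -1;  /* inconsistent bookkeeping: report it and stop */

        /* Unlink the node and leave it pointing at itself. */
        n->prev->next = n->next;
        n->next->prev = n->prev;
        n->prev = n->next = n;

        c->count--;
        evicted++;
    }
    return evicted;
}
```

If the counter is ever higher than the number of entries actually on the list, `toy_cache_purge()` returns -1 once the list is exhausted, which makes the mismatch visible to the caller.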
Nope, unfortunately, I can't reproduce the infinite loop anymore either... sigh, I really don't understand what happened. What I did find out is that I get display distortions once an OpenGL screensaver starts while kwin (KDE window manager) compositing is on. Stopping and resuming compositing cures the problem. I ran this screensaver scenario under valgrind and nothing interesting popped up :/ The OpenGL screensaver causes no trouble with 2.17.0 though.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-intel/issues/11.