Bug 105237

Summary: xcb_wait_for_reply makes valgrind report memory leaks on Mesa driver
Product: XCB
Reporter: Jiancong <94389147>
Component: Library
Assignee: xcb mailing list dummy <xcb>
Status: RESOLVED INVALID
QA Contact: xcb mailing list dummy <xcb>
Severity: critical    
Priority: medium    
Version: 1.11   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform:
i915 features:

Description Jiancong 2018-02-25 08:28:56 UTC
Hi all,
   I ran my app under valgrind on Ubuntu 16.04/amd64 to check for memory leaks. The report contains the following call stack for a leak.

==3307== 324 bytes in 9 blocks are definitely lost in loss record 704 of 901
==3307==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3307==    by 0xC23314B: ??? (in /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0)
==3307==    by 0xC230ED0: ??? (in /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0)
==3307==    by 0xC232616: ??? (in /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0)
==3307==    by 0xC232720: xcb_wait_for_reply (in /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0)
==3307==    by 0xFFE21F1: dri3_update_drawable.isra.9 (loader_dri3_helper.c:1205)
==3307==    by 0xFFE361C: loader_dri3_get_buffers (loader_dri3_helper.c:1556)
==3307==    by 0x11CAC63C: intel_update_image_buffers (brw_context.c:1707)
==3307==    by 0x11CAC63C: intel_update_renderbuffers (brw_context.c:1383)
==3307==    by 0x11CACB10: intel_prepare_render (brw_context.c:1404)
==3307==    by 0x11CACBD2: intelMakeCurrent (brw_context.c:1242)
==3307==    by 0x11C6C9B5: driBindContext (dri_util.c:585)
==3307==    by 0xFFDDC95: dri3_bind_context (dri3_glx.c:199)

It seems that the call to xcb_wait_for_reply causes the memory leak on my machine. I looked at its direct caller at loader_dri3_helper.c:1205.
It reads as follows:

.....
1203       geom_cookie = xcb_get_geometry(draw->conn, draw->drawable);
1204 
1205       geom_reply = xcb_get_geometry_reply(draw->conn, geom_cookie, NULL);
1206 
1207       if (!geom_reply) {
1208          mtx_unlock(&draw->mtx);
1209          return false;
1210       }
1211 
1212       draw->width = geom_reply->width;
1213       draw->height = geom_reply->height;
1214       draw->depth = geom_reply->depth;
1215       draw->vtable->set_drawable_size(draw, draw->width, draw->height);
1216 
1217       free(geom_reply);
....

There is a free(geom_reply) on line 1217 that releases the reply, so I don't know why valgrind doesn't account for that free. Could it be that xcb_get_geometry_reply/xcb_wait_for_reply never really returns a reply and hangs there?

Jiancong
Comment 1 Michel Dänzer 2018-02-26 10:35:15 UTC
This means it's not about geom_reply but some other memory allocated by XCB internally. Please try again with debugging symbols available for /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0, so we can see what it's about.
Comment 2 Maxim Filippenko 2018-02-28 16:12:47 UTC
I have a similar problem in my project. I built the latest xcb (version 1.12) and mesa (version 17.2.0-devel) with -ggdb and -fsanitize=address (GCC's AddressSanitizer).

Indirect leak of 689436 byte(s) in 19151 object(s) allocated from:
    #0 0x7f9af0728602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x7f9aec122424 in read_packet /home/filippenko/Soft/libxcb-1.12/src/xcb_in.c:259
    #2 0x7f9aec12613a in _xcb_in_read /home/filippenko/Soft/libxcb-1.12/src/xcb_in.c:1012
    #3 0x7f9aec11e560 in _xcb_conn_wait /home/filippenko/Soft/libxcb-1.12/src/xcb_conn.c:515
    #4 0x7f9aec1238bd in wait_for_reply /home/filippenko/Soft/libxcb-1.12/src/xcb_in.c:516
    #5 0x7f9aec123b0a in xcb_wait_for_reply /home/filippenko/Soft/libxcb-1.12/src/xcb_in.c:546
    #6 0x7f9ae9faf64e in xcb_dri2_swap_buffers_reply /home/filippenko/Soft/libxcb-1.12/src/dri2.c:893
    #7 0x7f9aeec7cbcb in dri2_x11_swap_buffers_msc (lib/libEGL.so.1+0x1bbcb)
    #8 0x7f9aeec7ccce in dri2_x11_swap_buffers (lib/libEGL.so.1+0x1bcce)
    #9 0x7f9aeec77e89 in dri2_swap_buffers (lib/libEGL.so.1+0x16e89)
    #10 0x7f9aeec6b202 in eglSwapBuffers (lib/libEGL.so.1+0xa202)

Inside xcb_wait_for_reply, events are added to a linked list, but nobody reads this list.

I tried to add a cleanup of this list:
=============================================
diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index b01f739..1d1271d 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -869,6 +869,12 @@ dri2_x11_swap_buffers_msc(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw,

    reply = xcb_dri2_swap_buffers_reply(dri2_dpy->conn, cookie, NULL);

+   xcb_generic_event_t * event = xcb_poll_for_event(dri2_dpy->conn);
+   while (event) {
+      free(event);
+      event = xcb_poll_for_event(dri2_dpy->conn);
+   }
+
    if (reply) {
       swap_count = (((int64_t)reply->swap_hi) << 32) | reply->swap_lo;
       free(reply);
===============================================

With this change, the memory leak has not been observed so far.
Comment 3 Uli Schlachter 2018-03-03 08:55:56 UTC
I'm closing this as invalid since I do not see what the bug is supposed to be. The information provided so far indicates that an event was received from the X11 server, but apparently nothing handles events. That's not a bug in XCB (and I fail to see how an X11 app can work that does not handle any events). Feel free to provide more information (for example what "my app" is exactly), but so far I have to conclude that everything works as intended. Sorry.
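
The behavior discussed in this thread can be illustrated with a small toy model. This is only a sketch, not libxcb's code: every name in it (toy_conn, toy_server_send, toy_wait_for_reply, toy_poll_for_event, toy_drain_events) is hypothetical, and the "wire" is just an array standing in for the X11 socket. It assumes what the traces and comments above suggest: while waiting for a reply, the library allocates every packet it reads, hands the reply back to the caller, and parks events on an internal list until someone polls for them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not libxcb's internals. */
#define TOY_WIRE_MAX 16

enum packet_kind { PACKET_REPLY, PACKET_EVENT };

struct packet {
    enum packet_kind kind;
    struct packet *next;
};

struct toy_conn {
    enum packet_kind wire[TOY_WIRE_MAX]; /* stand-in for the socket */
    size_t wire_len;                     /* packets the "server" has sent */
    size_t wire_pos;                     /* packets already read */
    struct packet *events_head;          /* events read but not yet polled */
    struct packet *events_tail;
    size_t queued_events;
};

/* The "server" sends one packet; returns 0 if the toy wire is full. */
int toy_server_send(struct toy_conn *c, enum packet_kind kind)
{
    if (c->wire_len == TOY_WIRE_MAX)
        return 0;
    c->wire[c->wire_len++] = kind;
    return 1;
}

/* Read packets until a reply shows up. Each packet is allocated here, so a
 * leak checker attributes queued events to this call, not to the reply. */
struct packet *toy_wait_for_reply(struct toy_conn *c)
{
    while (c->wire_pos < c->wire_len) {
        struct packet *p = malloc(sizeof *p);
        assert(p != NULL);
        p->kind = c->wire[c->wire_pos++];
        p->next = NULL;
        if (p->kind == PACKET_REPLY)
            return p; /* the caller owns and frees the reply */
        /* An event arrived first: park it on the internal list. */
        if (c->events_tail)
            c->events_tail->next = p;
        else
            c->events_head = p;
        c->events_tail = p;
        c->queued_events++;
    }
    return NULL;
}

/* Hand the oldest queued event to the caller, who must free it. */
struct packet *toy_poll_for_event(struct toy_conn *c)
{
    struct packet *p = c->events_head;
    if (p == NULL)
        return NULL;
    c->events_head = p->next;
    if (c->events_head == NULL)
        c->events_tail = NULL;
    p->next = NULL;
    c->queued_events--;
    return p;
}

/* Poll until the queue is empty, freeing each event; returns the count. */
size_t toy_drain_events(struct toy_conn *c)
{
    size_t n = 0;
    struct packet *ev;
    while ((ev = toy_poll_for_event(c)) != NULL) {
        free(ev);
        n++;
    }
    return n;
}
```

In this model, a caller that frees every reply but never polls for events still accumulates one allocation per event. The loop in toy_drain_events has the same shape as the xcb_poll_for_event loop in the patch from Comment 2; an application that follows Comment 3 would handle each event before freeing it rather than simply discarding it.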
