Bug 97065

Summary: memory leak under Xwayland with old sdl1 applications
Product: Wayland Reporter: Hussam Al-Tayeb <ht990332>
Component: XWaylandAssignee: Wayland bug list <wayland-bugs>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: daniel, fourdan
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: first memory map.
second memory map after memory increase
third memory map
forth memory map of xwayland process
valgrind log
gnome-shell log.
[PATCH xserver] xwayland: Plug memleak in frame callbacks
Valgrind log with xwayland patch
valgrind log with debug version of mesa, wayland, and patched xorg-server
[PATCH xserver] present: Free the fake_present OsTimerPtr
valgrind log with both patches

Description Hussam Al-Tayeb 2016-07-24 13:00:25 UTC
Created attachment 125286 [details]
first memory map.

When running old applications under wayland/nouveau, there is a memory leak in Xwayland.
Gtk2 applications show but this it is minimal. memory is not freed when the application exits.
Using sdl1 applications makes the leaks very high, around 300 to 400Kbytes per minute.
I asked on #wayland on freenode and it was suggested I post memory maps.
Comment 1 Hussam Al-Tayeb 2016-07-24 13:00:58 UTC
Created attachment 125287 [details]
second memory map after memory increase
Comment 2 Hussam Al-Tayeb 2016-07-24 13:01:28 UTC
Created attachment 125288 [details]
third memory map
Comment 3 Hussam Al-Tayeb 2016-07-24 13:02:08 UTC
Created attachment 125289 [details]
forth memory map of xwayland process

Note that after exiting eveything, memory is not recovered.
Comment 4 Hussam Al-Tayeb 2016-07-24 13:03:40 UTC
Firefox nightly 50a1 also causes this but only around 200Kbytes per minute. Still memory is not freed on closing firefox.
Comment 5 Link Mauve 2016-07-24 14:55:55 UTC
Hi again Hussam, please also run xrestop and report whether something actually increases there.
Comment 6 Hussam Al-Tayeb 2016-07-24 21:16:34 UTC
(In reply to Link Mauve from comment #5)
> Hi again Hussam, please also run xrestop and report whether something
> actually increases there.

I tried that. Xwayland's memory usage increases without an increase in pixmaps memory count according to xrestop.
Simply using a sdl1, or gtk2 application or firefox slowly makes Xwayland memory usage increase.

When I close firefox, the total pixmap memory usage according to xrestop decreases but xwayland never frees memory. It seems to just use more as I use my computer.
Switching windows sometimes causes an increase. It even suddenly used 200k more while I switched between gnome-terminal and firefox.

In either case, the reading according to xrestop seems to be stable. No increases there unless I open more applications.
Comment 7 Michel Dänzer 2016-07-25 03:37:50 UTC
The most helpful information would be from running Xwayland in valgrind --leak-check=full.
Comment 8 Hussam Al-Tayeb 2016-07-25 05:53:07 UTC
(In reply to Michel Dänzer from comment #7)
> The most helpful information would be from running Xwayland in valgrind
> --leak-check=full.

How? if I kill the current Xwayland, gnome exits.
Comment 9 Michel Dänzer 2016-07-25 07:32:21 UTC
There are a few possibilities:

* You can start a second Xwayland instance, e.g. something like

   valgrind --leak-check=full Xwayland :99 -noreset

  and then make clients connect to it by setting the environment variable
  DISPLAY=:99

* Find out how to make gnome-shell launch Xwayland in valgrind

* Do it in weston instead of in GNOME
Comment 10 Hussam Al-Tayeb 2016-07-25 07:38:29 UTC
(In reply to Michel Dänzer from comment #9)
> There are a few possibilities:
> 
> * You can start a second Xwayland instance, e.g. something like
> 
>    valgrind --leak-check=full Xwayland :99 -noreset
> 
>   and then make clients connect to it by setting the environment variable
>   DISPLAY=:99
> 
> * Find out how to make gnome-shell launch Xwayland in valgrind
> 
> * Do it in weston instead of in GNOME

I asked on irc yesterday if it is possible to start a second Xwayland instance and I was told no.
Now I am glad it is possible to do so.
Thank you. I will try that.
Comment 11 Hussam Al-Tayeb 2016-07-31 13:59:09 UTC
Created attachment 125446 [details]
valgrind log

Not very useful as I still need to recompile arch linux's packages with debug symbols but this is the valgrind log.
Comment 12 Michel Dänzer 2016-08-01 00:51:59 UTC
Looks like there's a leak between Xwayland and libwayland-client:

==6371== 803,664 bytes in 11,162 blocks are definitely lost in loss record 3,443 of 3,443
==6371==    at 0x4C2CA4A: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6371==    by 0x5568B72: ??? (in /usr/lib/libwayland-client.so.0.3.0)
==6371==    by 0x5569094: wl_proxy_marshal_array_constructor_versioned (in /usr/lib/libwayland-client.so.0.3.0)
==6371==    by 0x55693B9: wl_proxy_marshal_constructor (in /usr/lib/libwayland-client.so.0.3.0)
==6371==    by 0x423BDC: ??? (in /usr/bin/Xwayland)
==6371==    by 0x5576A0: BlockHandler (in /usr/bin/Xwayland)
==6371==    by 0x586DBA: WaitForSomething (in /usr/bin/Xwayland)
==6371==    by 0x552A3D: ??? (in /usr/bin/Xwayland)
==6371==    by 0x556C62: ??? (in /usr/bin/Xwayland)
==6371==    by 0x6F0C740: (below main) (in /usr/lib/libc-2.23.so)
Comment 13 Daniel Stone 2016-08-01 10:48:37 UTC
(In reply to Michel Dänzer from comment #12)
> Looks like there's a leak between Xwayland and libwayland-client:
> 
> ==6371== 803,664 bytes in 11,162 blocks are definitely lost in loss record
> 3,443 of 3,443
> ==6371==    at 0x4C2CA4A: calloc (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==6371==    by 0x5568B72: ??? (in /usr/lib/libwayland-client.so.0.3.0)
> ==6371==    by 0x5569094: wl_proxy_marshal_array_constructor_versioned (in
> /usr/lib/libwayland-client.so.0.3.0)
> ==6371==    by 0x55693B9: wl_proxy_marshal_constructor (in
> /usr/lib/libwayland-client.so.0.3.0)

IOW, Xwayland is not destroying all the objects it creates. I can't reproduce here; can you please run gnome-shell with WAYLAND_DEBUG=server, and attach the full log in this bug?
Comment 14 Hussam Al-Tayeb 2016-08-01 11:16:35 UTC
Created attachment 125461 [details]
gnome-shell log.
Comment 15 Olivier Fourdan 2016-08-01 14:35:08 UTC
Sorry in advance if this has been asked before but I cannot find the info, which version of Xwayland/Xserver is it?
Comment 16 Hussam Al-Tayeb 2016-08-01 15:44:04 UTC
(In reply to Olivier Fourdan from comment #15)
> Sorry in advance if this has been asked before but I cannot find the info,
> which version of Xwayland/Xserver is it?

Version 1.18.4
Comment 17 Hussam Al-Tayeb 2016-08-01 16:08:36 UTC
Wayland package version is 1.11.0
Comment 18 Olivier Fourdan 2016-08-02 08:57:21 UTC
I think I can reproduce (although the biggest leaks I see seem to come from glamor/mesa, but that's another story...).

There seem to be a similar pattern here:

  60,912 bytes in 846 blocks are definitely lost in loss record 5,820 of 5,833
     at 0x4C2DA60: calloc (vg_replace_malloc.c:711)
     by 0x54926C2: UnknownInlinedFun (wayland-private.h:229)
     by 0x54926C2: proxy_create.isra.0 (wayland-client.c:336)
     by 0x5492B64: create_outgoing_proxy (wayland-client.c:546)
     by 0x5492B64: wl_proxy_marshal_array_constructor_versioned (wayland-client.c:631)
     by 0x5492E89: wl_proxy_marshal_constructor (wayland-client.c:720)
     by 0x4236FC: wl_surface_frame (wayland-client-protocol.h:1873)
     by 0x4236FC: xwl_screen_post_damage (xwayland.c:401)
     by 0x4236FC: block_handler (xwayland.c:489)
     by 0x43E414: BlockHandler (dixutils.c:394)
     by 0x4787BA: WaitForSomething (WaitFor.c:216)
     by 0x4398FD: Dispatch (dispatch.c:359)
     by 0x43DA62: dix_main (main.c:300)
     by 0x7068730: (below main) (libc-start.c:289)


  1,368 bytes in 19 blocks are definitely lost in loss record 5,023 of 5,833
     at 0x4C2DA60: calloc (vg_replace_malloc.c:711)
     by 0x54926C2: UnknownInlinedFun (wayland-private.h:229)
     by 0x54926C2: proxy_create.isra.0 (wayland-client.c:336)
     by 0x5492B64: create_outgoing_proxy (wayland-client.c:546)
     by 0x5492B64: wl_proxy_marshal_array_constructor_versioned (wayland-client.c:631)
     by 0x5492E89: wl_proxy_marshal_constructor (wayland-client.c:720)
     by 0x425C34: wl_surface_frame (wayland-client-protocol.h:1873)
     by 0x425C34: xwl_seat_set_cursor (xwayland-cursor.c:158)
     by 0x740FC57: ffi_call_unix64 (unix64.S:76)
     by 0x740F6B9: ffi_call (ffi64.c:525)
     by 0x5495A9D: wl_closure_invoke (connection.c:949)
     by 0x549283F: dispatch_event.isra.4 (wayland-client.c:1274)
     by 0x5493A13: dispatch_queue (wayland-client.c:1420)
     by 0x5493A13: wl_display_dispatch_queue_pending (wayland-client.c:1662)
     by 0x423562: wakeup_handler (xwayland.c:478)
     by 0x43E54C: WakeupHandler (dixutils.c:423)


  1,008 bytes in 14 blocks are definitely lost in loss record 4,784 of 5,833
     at 0x4C2DA60: calloc (vg_replace_malloc.c:711)
     by 0x54926C2: UnknownInlinedFun (wayland-private.h:229)
     by 0x54926C2: proxy_create.isra.0 (wayland-client.c:336)
     by 0x5492B64: create_outgoing_proxy (wayland-client.c:546)
     by 0x5492B64: wl_proxy_marshal_array_constructor_versioned (wayland-client.c:631)
     by 0x5492E89: wl_proxy_marshal_constructor (wayland-client.c:720)
     by 0x425C34: wl_surface_frame (wayland-client-protocol.h:1873)
     by 0x425C34: xwl_seat_set_cursor (xwayland-cursor.c:158)
     by 0x4C941A: miPointerUpdateSprite (mipointer.c:456)
     by 0x4C9669: miPointerDisplayCursor (mipointer.c:194)
     by 0x4B6508: CursorDisplayCursor (cursor.c:150)
     by 0x53988F: AnimCurDisplayCursor (animcur.c:225)
     by 0x442787: ChangeToCursor (events.c:931)
     by 0x446529: CheckMotion (events.c:3075)
     by 0x54A9D7: ProcessDeviceEvent (exevents.c:1716)


  936 bytes in 13 blocks are possibly lost in loss record 4,692 of 5,833
     at 0x4C2DA60: calloc (vg_replace_malloc.c:711)
     by 0x54926C2: UnknownInlinedFun (wayland-private.h:229)
     by 0x54926C2: proxy_create.isra.0 (wayland-client.c:336)
     by 0x5492B64: create_outgoing_proxy (wayland-client.c:546)
     by 0x5492B64: wl_proxy_marshal_array_constructor_versioned (wayland-client.c:631)
     by 0x5492E89: wl_proxy_marshal_constructor (wayland-client.c:720)
     by 0x4236FC: wl_surface_frame (wayland-client-protocol.h:1873)
     by 0x4236FC: xwl_screen_post_damage (xwayland.c:401)
     by 0x4236FC: block_handler (xwayland.c:489)
     by 0x43E414: BlockHandler (dixutils.c:394)
     by 0x4787BA: WaitForSomething (WaitFor.c:216)
     by 0x4398FD: Dispatch (dispatch.c:359)
     by 0x43DA62: dix_main (main.c:300)
     by 0x7068730: (below main) (libc-start.c:289)


  576 bytes in 8 blocks are definitely lost in loss record 4,239 of 5,833
     at 0x4C2DA60: calloc (vg_replace_malloc.c:711)
     by 0x54926C2: UnknownInlinedFun (wayland-private.h:229)
     by 0x54926C2: proxy_create.isra.0 (wayland-client.c:336)
     by 0x5492B64: create_outgoing_proxy (wayland-client.c:546)
     by 0x5492B64: wl_proxy_marshal_array_constructor_versioned (wayland-client.c:631)
     by 0x5492E89: wl_proxy_marshal_constructor (wayland-client.c:720)
     by 0x425C34: wl_surface_frame (wayland-client-protocol.h:1873)
     by 0x425C34: xwl_seat_set_cursor (xwayland-cursor.c:158)
     by 0x4C941A: miPointerUpdateSprite (mipointer.c:456)
     by 0x4C5366: mieqProcessInputEvents (mieq.c:648)
     by 0x4398EE: Dispatch (dispatch.c:355)
     by 0x43DA62: dix_main (main.c:300)
     by 0x7068730: (below main) (libc-start.c:289)
Comment 19 Olivier Fourdan 2016-08-02 09:01:47 UTC
(forgot to mention, I captured this data using weston as a compositor, not gnome-shell)
Comment 20 Olivier Fourdan 2016-08-02 09:23:04 UTC
Created attachment 125476 [details] [review]
[PATCH xserver] xwayland: Plug memleak in frame callbacks

The frame callback set up via wl_surface_frame() needs to be freed with
wl_callback_destroy() or we'll leak memory.
Comment 21 Hussam Al-Tayeb 2016-08-02 21:03:04 UTC
Ok, I tried the patch. Using one application doesn't result with as much leaks as before. But then open a second application and firefox, and then switch between applications causes random increases the memory used by Xwayland. They seem to accelerate with time as well.
Comment 22 Michel Dänzer 2016-08-03 01:16:11 UTC
(In reply to Hussam Al-Tayeb from comment #21)
> But then open a second application and firefox, and then switch between
> applications causes random increases the memory used by Xwayland.

Note that Xwayland can't do its work without allocating memory :), some of which on behalf of its clients.

If you think there's still a leak, can you get another valgrind log corresponding to it?
Comment 23 Hussam Al-Tayeb 2016-08-03 01:21:57 UTC
(In reply to Michel Dänzer from comment #22)
> (In reply to Hussam Al-Tayeb from comment #21)
> > But then open a second application and firefox, and then switch between
> > applications causes random increases the memory used by Xwayland.
> 
> Note that Xwayland can't do its work without allocating memory :), some of
> which on behalf of its clients.
> 
> If you think there's still a leak, can you get another valgrind log
> corresponding to it?

I understand so. However, shouldn't allocated memory be freed when applications are closed?

I'll post a new valgrind log. I'll see if I can check if the behavior is different under Weston as well.
Comment 24 Pekka Paalanen 2016-08-03 11:21:56 UTC
(In reply to Hussam Al-Tayeb from comment #23)
> I understand so. However, shouldn't allocated memory be freed when
> applications are closed?

I think it depends whether it will actually show up freed in memory consumption numbers. Valgrind OTOH is very good providing information.

How do you check the memory consumption exactly?
Comment 25 Hussam Al-Tayeb 2016-08-06 17:55:24 UTC
(In reply to Pekka Paalanen from comment #24)
> (In reply to Hussam Al-Tayeb from comment #23)
> > I understand so. However, shouldn't allocated memory be freed when
> > applications are closed?
> 
> I think it depends whether it will actually show up freed in memory
> consumption numbers. Valgrind OTOH is very good providing information.
> 
> How do you check the memory consumption exactly?

I was using gnome-terminal and top but I was told how to read it as well from /proc/`pidof Xwayland`/status
Comment 26 Pekka Paalanen 2016-08-08 07:22:13 UTC
(In reply to Hussam Al-Tayeb from comment #25)
> I was using gnome-terminal and top but I was told how to read it as well
> from /proc/`pidof Xwayland`/status

Which column in 'top'?
Which field(s) in the status file?

This is anyway just for additional information, Michel's request for a new Valgrind log is still the most important part.
Comment 27 Olivier Fourdan 2016-08-08 07:31:51 UTC
FWIW, I use the following script to capture the valgrind data with weston (easier than gnome-shell because weston has an option to specify the command to run to launch Xwayland):

1. A bash shell script named "Xwayland-valgrind" with the following content:

#!/bin/bash

exec valgrind --leak-check=full --leak-resolution=high --track-origins=yes --log-file=xwayland.%p $(which Xwayland)  "$@"

2. Make sure the script is executable

3. Edit ~/.config/weston.ini and make the path for "[xwayland]" section points to the script above:

[xwayland]
path=/path/to/Xwayland-valgrind

4. Reproduce the leak using weston as your Wayland compositor (you can even run weston embedded from within a gnome-shell Wayland session, how neat is that ;-) )
Comment 28 Hussam Al-Tayeb 2016-08-08 08:34:51 UTC
Created attachment 125583 [details]
Valgrind log with xwayland patch

New valgrind log.
Sorry for the delay. I opted to wait till I had kernel 4.7 to check for kernel driver improvements.
Comment 29 Olivier Fourdan 2016-08-08 08:44:55 UTC
(In reply to Hussam Al-Tayeb from comment #28)
> Created attachment 125583 [details]
> Valgrind log with xwayland patch
> 
> New valgrind log.
> Sorry for the delay. I opted to wait till I had kernel 4.7 to check for
> kernel driver improvements.

Unfortunately, without the debugging symbols it's hard to make sense of the log, like this:

==1056== 62,400 (208 direct, 62,192 indirect) bytes in 1 blocks are definitely lost in loss record 3,425 of 3,429
==1056==    at 0x4C2CA6A: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1056==    by 0x4AC094: ??? (in /usr/bin/Xwayland)
==1056==    by 0x4AC2EB: ??? (in /usr/bin/Xwayland)
==1056==    by 0x4ABEB2: ??? (in /usr/bin/Xwayland)
==1056==    by 0x4AB2F9: ??? (in /usr/bin/Xwayland)
==1056==    by 0x428042: InitExtensions (in /usr/bin/Xwayland)
==1056==    by 0x556AB6: ??? (in /usr/bin/Xwayland)
==1056==    by 0x6F0D290: (below main) (in /usr/lib/libc-2.24.so)
==1056== 
==1056== 577,728 (560,448 direct, 17,280 indirect) bytes in 17,514 blocks are definitely lost in loss record 3,429 of 3,429
==1056==    at 0x4C2ABDE: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1056==    by 0x586B89: TimerSet (in /usr/bin/Xwayland)
==1056==    by 0x4F1B19: ??? (in /usr/bin/Xwayland)
==1056==    by 0x4F1650: ??? (in /usr/bin/Xwayland)
==1056==    by 0x4F2424: ??? (in /usr/bin/Xwayland)
==1056==    by 0x552C3E: ??? (in /usr/bin/Xwayland)
==1056==    by 0x556C92: ??? (in /usr/bin/Xwayland)
==1056==    by 0x6F0D290: (below main) (in /usr/lib/libc-2.24.so)
==1056== 

Could you please make sure to have the debugging symbols installed and recapture the valgrind logs?
Comment 30 Hussam Al-Tayeb 2016-08-08 09:00:55 UTC
Sure. I can recompile xorg-server (this includes xwayland) with debug symbols enabled.
Is there anything else I should recompile with debug symbols that may be of help as well?
Thank you.
Comment 31 Olivier Fourdan 2016-08-08 12:02:56 UTC
(In reply to Hussam Al-Tayeb from comment #30)
> [...]
> Is there anything else I should recompile with debug symbols that may be of
> help as well?

I'd say xorg-server, Wayland and mesa-dri (not sure exactly how those are called on arch linux)
Comment 32 Hussam Al-Tayeb 2016-08-08 13:55:39 UTC
Created attachment 125592 [details]
valgrind log with debug version of mesa, wayland, and patched xorg-server

A few questions:
1) This mesa installation is 280MB larger on disk that the one without debug symbols. Xorg is only 40MB larger. Is there any performance hit from the debug builds? I would like to keep those.
2) Also is said "glamor: EGL version 1.4 (DRI2):" when I started the second xwayland instance. Can I use DRI3 instead? I heard it is better.

Thanks again.
Comment 33 Olivier Fourdan 2016-08-08 14:28:06 UTC
Created attachment 125596 [details] [review]
[PATCH xserver] present: Free the fake_present OsTimerPtr

Can you try with this patch as well?
Comment 34 Hussam Al-Tayeb 2016-08-08 15:23:26 UTC
Created attachment 125601 [details]
valgrind log with both patches

new valgrind log with both patches.
Comment 35 Olivier Fourdan 2016-08-08 15:25:09 UTC
(In reply to Hussam Al-Tayeb from comment #34)
> new valgrind log with both patches.

That looks good to me, seems that plugs the leak you were seeing...
Comment 36 Hussam Al-Tayeb 2016-08-08 15:30:09 UTC
(In reply to Olivier Fourdan from comment #35)
> (In reply to Hussam Al-Tayeb from comment #34)
> > new valgrind log with both patches.
> 
> That looks good to me, seems that plugs the leak you were seeing...

Yes. Thank you very much!
One more question. Can I use DRI3 on xwayland or is that irrelevant?
Comment 37 Michel Dänzer 2016-08-09 01:00:03 UTC
(In reply to Hussam Al-Tayeb from comment #36)
> One more question. Can I use DRI3 on xwayland or is that irrelevant?

Xwayland clients can use DRI3 (and are, otherwise the leak plugged by Olivier's last patch wouldn't trigger). The DRI2 you saw is about something else and doesn't indicate any problem.
Comment 38 Adam Jackson 2016-08-15 19:19:00 UTC
commit de5291c04b05772e6da599a475baa1b19dcae07a
Author: Olivier Fourdan <ofourdan@redhat.com>
Date:   Mon Aug 8 17:25:35 2016 +0200

    present: Free the fake_present OsTimerPtr

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.