Bug 99164 - Xorg double free in FlushClient
Summary: Xorg double free in FlushClient
Status: RESOLVED DUPLICATE of bug 99887
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/General
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Xorg Project Team
QA Contact: Xorg Project Team
Reported: 2016-12-20 20:28 UTC by Nick Sarnie
Modified: 2017-03-02 02:39 UTC

Attachments
bt (6.67 KB, text/plain)
2016-12-20 20:28 UTC, Nick Sarnie
Xorg-Valgrind (736.41 KB, text/plain)
2017-01-10 18:33 UTC, Nick Sarnie
Xorg-valgrind-modesetting (297.80 KB, text/plain)
2017-01-11 18:40 UTC, Nick Sarnie
Xorg-valgrind-modesetting-noltc (313.14 KB, text/plain)
2017-01-13 18:36 UTC, Nick Sarnie

Description Nick Sarnie 2016-12-20 20:28:55 UTC
Created attachment 128597 [details]
bt

Hi.

I've been experiencing this issue for a while now. Xorg will crash randomly, and I finally got the backtrace for it. It is attached. Please let me know if you need anything else.

Xorg version: Git
Input driver: Happens both with libinput and evdev, but using libinput now
Mesa version: Git
Kernel: 4.9
Distro: Gentoo

Thanks,
Sarnex
Comment 1 Michel Dänzer 2017-01-10 07:59:15 UTC
Any chance you can reproduce this while running Xorg in valgrind? That should tell us where the memory was already freed.
Comment 2 Olivier Fourdan 2017-01-10 10:53:17 UTC
I suspect this is the same issue reported here:

https://lists.x.org/archives/xorg-devel/2016-October/051557.html
https://lists.x.org/archives/xorg-devel/2016-December/052004.html

At least, it's a memory corruption in the same code path.
Comment 3 Nick Sarnie 2017-01-10 18:33:52 UTC
Created attachment 128866 [details]
Xorg-Valgrind

Hi Michel,

I think I got it. Let me know if you find anything.

Thanks,
Sarnex
Comment 4 Michel Dänzer 2017-01-11 09:57:09 UTC
Looks like there are some invalid writes in intel driver code. Is this problem reproducible with the modesetting driver?
Comment 5 Nick Sarnie 2017-01-11 18:22:16 UTC
Hi Michel,

First off, my main GPU uses amdgpu and the Intel GPU is the integrated graphics. I switched the integrated GPU to the modesetting driver and have not been able to reproduce the X crash since. The only thing I've seen is plasmashell crashing and restarting when trying to reproduce this issue, which I had never seen before. Maybe plasmashell was bringing down X with it when the intel DDX was in use. The crash is sometimes hard to reproduce, so I'll let you know if it comes back, but it seems like that was it.
Comment 6 Nick Sarnie 2017-01-11 18:40:24 UTC
Created attachment 128895 [details]
Xorg-valgrind-modesetting

Classic: right after I post that it's working, I try one last time and get the crash. Sorry.

Thanks,
Sarnex
Comment 7 Michel Dänzer 2017-01-13 09:40:36 UTC
The part below looks like WriteToClient retrieves a ConnectionOutputPtr from FreeOutputs after it was freed by CloseDownConnection -> FreeOsBuffers, but I can't see how that could happen... The only thing that looks a bit suspicious is that FreeOsBuffers doesn't set oc->output = NULL, but if that was the problem, I'd expect valgrind to complain about another place first.

BTW, Nick, it looks like your binaries were built with some kind of LTO? Does it also happen without that, and if so, can you get another valgrind output without LTO?

==6926== Invalid read of size 8
==6926==    at 0x58ECDC: WriteToClient (io.c:705)
==6926==    by 0x440C81: WriteEventsToClient (events.c:6000)
==6926==    by 0x440E82: TryClientEvents (events.c:2021)
==6926==    by 0x4445C0: DeliverEventToInputClients (events.c:2170)
==6926==    by 0x4448BB: DeliverEventToWindowMask (events.c:2213)
==6926==    by 0x4448BB: DeliverEventsToWindow (events.c:2277)
==6926==    by 0x445065: DeliverEvents (events.c:2826)
==6926==    by 0x464182: DeleteWindow (window.c:1096)
==6926==    by 0x459C11: doFreeResource (resource.c:880)
==6926==    by 0x45ADBB: FreeClientResources (resource.c:1146)
==6926==    by 0x434ADE: CloseDownClient (dispatch.c:3464)
==6926==    by 0x58FDE0: ospoll_wait (ospoll.c:412)
==6926==    by 0x589112: WaitForSomething (WaitFor.c:226)
==6926==  Address 0x19eb2960 is 0 bytes inside a block of size 24 free'd
==6926==    at 0x4C2D12B: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6926==    by 0x58C58E: CloseDownConnection (connection.c:919)
==6926==    by 0x434C25: CloseDownClient (dispatch.c:3438)
==6926==    by 0x58FDE0: ospoll_wait (ospoll.c:412)
==6926==    by 0x589112: WaitForSomething (WaitFor.c:226)
==6926==    by 0x4354F0: Dispatch (dispatch.c:412)
==6926==    by 0x4397F7: dix_main (main.c:287)
==6926==    by 0x6ADA67F: (below main) (libc-start.c:289)
==6926==  Block was alloc'd at
==6926==    at 0x4C2BEFF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6926==    by 0x58ED1B: AllocateOutputBuffer (io.c:975)
==6926==    by 0x58ED1B: WriteToClient (io.c:707)
==6926==    by 0x43536A: SendConnSetup (dispatch.c:3668)
==6926==    by 0x43536A: ProcEstablishConnection (dispatch.c:3706)
==6926==    by 0x4356BA: Dispatch (dispatch.c:469)
==6926==    by 0x4397F7: dix_main (main.c:287)
==6926==    by 0x6ADA67F: (below main) (libc-start.c:289)
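
For readers following the analysis above, here is a minimal sketch of the defensive pattern it hints at: clear the buffer pointer when it is freed, and refuse writes to a connection that has already been closed. Every name in it (`struct conn`, `struct out_buf`, `conn_free_buffers`, `conn_close`, `conn_write`) is hypothetical and chosen for illustration only. None of this is taken from the xserver source, and it is not claimed to be the fix for this bug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT the real xserver types. */
struct out_buf {
    size_t count;            /* bytes currently queued */
    unsigned char data[64];  /* fixed-size payload area */
};

struct conn {
    struct out_buf *output;  /* owned by the connection while non-NULL */
    int closed;              /* set once the connection is torn down */
};

/* Free the connection's buffer and clear the pointer, so a later code
 * path can neither read the stale block nor free it a second time. */
static void conn_free_buffers(struct conn *c)
{
    free(c->output);         /* free(NULL) is a no-op */
    c->output = NULL;
}

/* Tear down a connection. Safe to call more than once. */
static void conn_close(struct conn *c)
{
    c->closed = 1;
    conn_free_buffers(c);
}

/* Queue len bytes for the connection. Returns the number of bytes
 * queued, or -1 if the connection is closed, allocation fails, or the
 * data does not fit. */
static int conn_write(struct conn *c, const void *buf, size_t len)
{
    if (c->closed)
        return -1;           /* late events for a dead client are dropped */

    if (c->output == NULL) {
        c->output = calloc(1, sizeof(*c->output));
        if (c->output == NULL)
            return -1;
    }

    if (len > sizeof(c->output->data) - c->output->count)
        return -1;

    memcpy(c->output->data + c->output->count, buf, len);
    c->output->count += len;
    return (int)len;
}
```

With this shape, a write that arrives after `conn_close` (for example, an event delivered while another client's resources are being freed) returns -1 instead of touching freed memory, and a second `conn_close` on the same connection does nothing harmful. Whether the real server can take this approach depends on details not visible in the trace.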
Comment 8 Nick Sarnie 2017-01-13 18:36:21 UTC
Created attachment 128939 [details]
Xorg-valgrind-modesetting-noltc

Hi Michel,

Thanks for taking a look. My compile flags are "-O2 -pipe -march=native", and I didn't see anything about LTO in the compile log, but I added flags to disable it anyway. I added the following flags to my builds of xserver, xf86-video-amdgpu and xf86-video-intel:

CFLAGS="${CFLAGS} -fno-lto -fno-use-linker-plugin"
CXXFLAGS="${CXXFLAGS} -fno-lto -fno-use-linker-plugin"
LDFLAGS="${LDFLAGS} -fno-lto -fno-use-linker-plugin"


I still get the crash, and I've attached the log.
Comment 9 Michel Dänzer 2017-03-01 06:36:51 UTC
If you're using the xf86-input-wacom driver, this is a duplicate of bug 99887. Otherwise, running current xserver Git master or server-1.19-branch might reveal the culprit in the Xorg log file.
Comment 10 Nick Sarnie 2017-03-02 02:39:43 UTC
(In reply to Michel Dänzer from comment #9)
> If you're using the xf86-input-wacom driver, this is a duplicate of bug
> 99887. Otherwise, running current xserver Git master or server-1.19-branch
> might reveal the culprit in the Xorg log file.

Hi Michel,

I am indeed using xf86-input-wacom. I've marked this bug as a duplicate.

Thanks!

*** This bug has been marked as a duplicate of bug 99887 ***

