Bug 25343

Summary: Some fonts corrupted and X crashes (occasionally).
Product: xorg
Component: Server/Acceleration/EXA
Status: RESOLVED FIXED
Severity: critical
Priority: medium
Version: git
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Reporter: David Ronis <David.Ronis>
Assignee: Xorg Project Team <xorg-team>
QA Contact: Xorg Project Team <xorg-team>
CC: adf.lists, madman2003, simon.thum
Attachments:
Screenshot (flags: none)
revert problematic changes (flags: none)
redo one change (flags: none)

Description David Ronis 2009-11-29 09:46:22 UTC
I just upgraded xserver & pixman to today's git master (I last did this a few days ago).  Something is really broken: fonts in menus and in some apps (e.g., qt-based, firefox, gnome-terminal) are garbled, but not in others (xterm).

I've also had a crash on startup (not 100% reproducible).  The log shows the following backtrace:


Backtrace:
0: /usr/bin/X (xorg_backtrace+0x39) [0x80b4af5]
1: /usr/bin/X (0x8048000+0x5e6d0) [0x80a66d0]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0xffffe40c]
3: /usr/lib/libpixman-1.so.0 (0xb74de000+0x563d) [0xb74e363d]
4: /usr/lib/libpixman-1.so.0 (0xb74de000+0x4db15) [0xb752bb15]
5: /usr/lib/libpixman-1.so.0 (0xb74de000+0x4dce3) [0xb752bce3]
6: /usr/lib/libpixman-1.so.0 (0xb74de000+0x18ca9) [0xb74f6ca9]
7: /usr/lib/libpixman-1.so.0 (0xb74de000+0x4286f) [0xb752086f]
8: /usr/lib/libpixman-1.so.0 (0xb74de000+0x4ef18) [0xb752cf18]
9: /usr/lib/libpixman-1.so.0 (0xb74de000+0x4f077) [0xb752d077]
10: /usr/lib/libpixman-1.so.0 (0xb74de000+0x42a5b) [0xb7520a5b]
11: /usr/lib/libpixman-1.so.0 (0xb74de000+0x19972) [0xb74f7972]
12: /usr/lib/libpixman-1.so.0 (0xb74de000+0x47513) [0xb7525513]
13: /usr/lib/libpixman-1.so.0 (0xb74de000+0x19972) [0xb74f7972]
14: /usr/lib/libpixman-1.so.0 (0xb74de000+0xb8fcf) [0xb7596fcf]
15: /usr/lib/libpixman-1.so.0 (0xb74de000+0x19972) [0xb74f7972]
16: /usr/lib/libpixman-1.so.0 (0xb74de000+0x102098) [0xb75e0098]
17: /usr/lib/libpixman-1.so.0 (0xb74de000+0x19972) [0xb74f7972]
18: /usr/lib/libpixman-1.so.0 (pixman_image_composite+0x21f) [0xb7520efe]
19: /usr/lib/xorg/modules/libfb.so (fbComposite+0x1c0) [0xb6f2604e]
20: /usr/lib/xorg/modules/libexa.so (0xb6ef5000+0x12ba7) [0xb6f07ba7]
21: /usr/lib/xorg/modules/libexa.so (0xb6ef5000+0x1084b) [0xb6f0584b]
22: /usr/bin/X (0x8048000+0xd8cc2) [0x8120cc2]
23: /usr/bin/X (CompositePicture+0x18b) [0x81102ac]
24: /usr/bin/X (0x8048000+0xccea0) [0x8114ea0]
25: /usr/bin/X (0x8048000+0xcf2c6) [0x81172c6]
26: /usr/bin/X (0x8048000+0x47b87) [0x808fb87]
27: /usr/bin/X (0x8048000+0x1b673) [0x8063673]
28: /lib/libc.so.6 (__libc_start_main+0xe0) [0xb732b390]
29: /usr/bin/X (0x8048000+0x1b0d1) [0x80630d1]
Segmentation fault at address 0xa54ae774

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting

I've got a lot of xorg built with debugging symbols, so I should be able to get a better backtrace, but for some reason, I can't seem to get X to drop a core file.
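
The unresolved frames in the backtrace above each carry a module path and an offset from that module's load base, which is usually enough to look up symbols offline even without a core file. What follows is only a minimal sketch, assuming binutils' addr2line is installed and that the binaries on disk are the same debug-symbol builds that crashed. The helper names (parse_frame, addr2line_command) are hypothetical, and the rule "offset for shared objects, absolute address for the main executable" is an assumption that holds for a non-PIE /usr/bin/X as the 0x8048000 base suggests.

```python
import os
import re

# Matches unresolved frames such as:
#   3: /usr/lib/libpixman-1.so.0 (0xb74de000+0x563d) [0xb74e363d]
# Frames that already name a symbol (e.g. "xorg_backtrace+0x39") do not match.
FRAME_RE = re.compile(
    r"^\s*(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\((?P<base>0x[0-9a-fA-F]+)\+(?P<offset>0x[0-9a-fA-F]+)\)\s+"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]\s*$"
)


def parse_frame(line):
    """Return the fields of an unresolved frame, or None if the line differs."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    return {
        "index": int(m.group("index")),
        "module": m.group("module"),
        "base": int(m.group("base"), 16),
        "offset": int(m.group("offset"), 16),
        "addr": int(m.group("addr"), 16),
    }


def addr2line_command(frame):
    """Build an addr2line invocation (argument list) for one parsed frame."""
    # Assumption: shared objects are looked up by offset from their load base,
    # while a non-PIE main executable is looked up by absolute address.
    is_shared_object = ".so" in os.path.basename(frame["module"])
    address = frame["offset"] if is_shared_object else frame["addr"]
    # -f prints function names, -C demangles, -e selects the binary.
    return ["addr2line", "-f", "-C", "-e", frame["module"], hex(address)]


if __name__ == "__main__":
    sample = "3: /usr/lib/libpixman-1.so.0 (0xb74de000+0x563d) [0xb74e363d]"
    frame = parse_frame(sample)
    print(" ".join(addr2line_command(frame)))
```

Feeding each line of the Xorg log through parse_frame and running the resulting commands would give function names and source lines for the pixman frames, provided the installed library has not been rebuilt since the crash.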

Finally, I'm not sure that this is an xserver issue (it could be pixman).  

I'm running on an i686, using an up to date gnome, and the ati driver (also updated to the git master).
Comment 1 David Ronis 2009-11-29 10:31:41 UTC
Actually, make that the radeon driver.
Comment 2 David Ronis 2009-11-29 11:29:14 UTC
Created attachment 31553
Screenshot

Look at the window borders and/or menu bar.  The text is garbled.
Comment 3 Søren Sandmann Pedersen 2009-11-29 13:17:14 UTC
I'd be interested in seeing that backtrace with pixman symbols. 

Thanks,
Comment 4 David Ronis 2009-11-29 16:23:50 UTC
I don't know how to trigger the crash other than randomly at startup, and, in any event, a core file isn't left behind.  Any suggestions how to get gdb to run from the get-go?   Dropping a core file should be the right way to do this.

I've also been having other issues with pixman (I think) and was able to get a detailed backtrace for it.   Have a look at bug 25270.

One more thing: any reason why Xorg is reporting such a meager backtrace?  gdb's contains much more information (and all the key modules have been built with symbols).


Comment 5 Michel Dänzer 2009-11-30 02:58:20 UTC
(In reply to comment #4)
> I don't know how to trigger the crash other than randomly at startup, and, in
> any event, a core file isn't left behind.

Option "NoTrapSignals" may help.

> One more thing: any reason why Xorg is reporting such a meager backtrace? 

AFAIK it's a limitation of the glibc backtrace facility.

Assuming you have xserver commits 342f3689d17256c92cbfee079d24501d27aa1153, a54c23fe647cb4d610d871094193ae5959606008 and 99d88ef69d5f7dbf99ca605eceb92f42230a89f4, at least the visual corruption might be a regression from one of those. It would be great if you could confirm that and isolate the one which caused the problem.
Comment 6 David Ronis 2009-11-30 21:42:39 UTC
Two things:

1.  Option "NoTrapSignals" doesn't result in a core file.

2.  Reverting all 3 commits "fixes" the problem.   I will try to narrow down which commit is responsible in a day or so.
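
Narrowing this down means testing each of the three suspect commits with only that one reverted. A rough sketch of such a plan follows, assuming a git checkout of xserver whose master contains all three commits. The function name isolation_plan and the branch naming are hypothetical, and the sketch only builds the command lists; it does not run git, rebuild, or test anything. A single revert may also fail to apply cleanly if the commits touch the same code.

```python
# Suspect commits named in comment #5.
SUSPECTS = [
    "342f3689d17256c92cbfee079d24501d27aa1153",
    "a54c23fe647cb4d610d871094193ae5959606008",
    "99d88ef69d5f7dbf99ca605eceb92f42230a89f4",
]


def isolation_plan(suspects, base="master"):
    """Return, per suspect, the git commands for a branch with only it reverted."""
    plan = []
    for sha in suspects:
        branch = "test-revert-" + sha[:8]  # hypothetical branch naming
        plan.append([
            # Start a fresh branch from the known-bad base.
            ["git", "checkout", "-b", branch, base],
            # Revert just this commit, keeping the default commit message.
            ["git", "revert", "--no-edit", sha],
        ])
    return plan


if __name__ == "__main__":
    for steps in isolation_plan(SUSPECTS):
        for argv in steps:
            print(" ".join(argv))
        print("# rebuild, restart X, check whether the fonts are still garbled")
```

The branch on which the corruption disappears points at the responsible commit.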

Comment 7 Daniel Stone 2009-11-30 22:17:48 UTC
On Mon, Nov 30, 2009 at 09:42:39PM -0800, bugzilla-daemon@freedesktop.org wrote:
> 1.  Option "NoTrapSignals" doesn't result in a core file.

To get core dumps you'll have to do two things: first, make sure you're
running it from a root shell -- not with sudo, not running a suid binary
as a user -- and secondly, run 'ulimit -c unlimited'.
Comment 8 Michel Dänzer 2009-12-01 00:09:41 UTC
Maarten, any ideas so far what could be up? Maybe the last change doesn't mix well with classic drivers after all?
Comment 9 Maarten Maathuis 2009-12-01 00:48:40 UTC
Isolating it to a single commit would be nice. One (maybe unrelated) thing I saw is that exaModifyPixmapHeader_classic() doesn't pin a pixmap, which seems very odd.
Comment 10 Julien Cristau 2009-12-01 01:57:01 UTC
> --- Comment #7 from Daniel Stone <daniel@fooishbar.org>  2009-11-30 22:17:48 PST ---
> On Mon, Nov 30, 2009 at 09:42:39PM -0800, bugzilla-daemon@freedesktop.org
> wrote:
> > 1.  Option "NoTrapSignals" doesn't result in a core file.
> 
> To get core dumps you'll have to do two things: first, make sure you're
> running it from a root shell -- not with sudo, not running a suid binary
> as a user -- and secondly, run 'ulimit -c unlimited'.

The -core command line switch does all of that, iirc.
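
The 'ulimit -c unlimited' step from comment #7 raises the core-file size limit for the shell and everything started from it. As an illustration only, the same limit change can be made programmatically before launching a process; this sketch uses Python's standard resource module, and the names raise_core_limit and run_with_core_dumps are hypothetical. It raises the soft limit to the hard limit, which equals "unlimited" only when the hard limit itself is unlimited, and it does nothing about the root-shell requirement mentioned above.

```python
import resource
import subprocess


def raise_core_limit():
    """Raise the soft core-file size limit to the hard limit; return the new pair."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Raising the soft limit up to the hard limit needs no extra privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def run_with_core_dumps(argv):
    """Run argv as a child process that inherits the raised core limit."""
    raise_core_limit()
    return subprocess.call(argv)


if __name__ == "__main__":
    soft, hard = raise_core_limit()
    unlimited = soft == resource.RLIM_INFINITY
    print("core limit is unlimited" if unlimited else "core limit: %d bytes" % soft)
    # Hypothetical usage, from a root shell as comment #7 advises:
    #   run_with_core_dumps(["/usr/bin/X", ":0"])
```

If the printed limit is not unlimited, the hard limit is capped and has to be lifted by root before a full core file can be written.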
Comment 11 Michel Dänzer 2009-12-01 03:08:24 UTC
*** Bug 25372 has been marked as a duplicate of this bug. ***
Comment 12 Michel Dänzer 2009-12-01 03:10:40 UTC
Maarten, looks like it's indeed commit 99d88ef69d5f7dbf99ca605eceb92f42230a89f4 which triggered it. Was there any problem or something in particular this change was supposed to address?
Comment 13 Maarten Maathuis 2009-12-01 04:47:45 UTC
It was supposed to address the odd situation (which doesn't necessarily mean a bug) in which you prepare access, but do not have the right pitch set. It made sense to restrict this to one place related to fallbacks and one for acceleration. I can make something else, but I do wonder why this is causing problems (in the worst case I would expect it to be redundant). Do you know if the problem is caused in acceleration or fallback code?
Comment 14 Michel Dänzer 2009-12-01 04:55:10 UTC
(In reply to comment #13)
> Do you know if the problem is caused in acceleration or fallback code?

Not sure.
Comment 15 Michel Dänzer 2009-12-01 08:26:06 UTC
Maarten, I see two basic ways forward:

* You work with the bug submitters to fix the problem quickly.

or

* You submit a patch to revert this change on master. If you want to submit a similar change again, you have it tested by the bug submitters first to make sure it doesn't break again.
Comment 16 Maarten Maathuis 2009-12-01 12:50:42 UTC
Created attachment 31634
revert problematic changes
Comment 17 Maarten Maathuis 2009-12-01 12:51:19 UTC
Created attachment 31635
redo one change
Comment 18 Maarten Maathuis 2009-12-01 12:54:20 UTC
I was able to reproduce it myself as well (I forgot this driver still had a classic mode). Any serious hunting will have to wait a bit, until the weekend maybe.
Comment 19 Andy Furniss 2009-12-02 03:17:19 UTC
(In reply to comment #17)
> Created an attachment (id=31635) [details]
> redo a one change
> 

It's working OK for me with the two patches.
Comment 20 David Ronis 2009-12-02 07:29:50 UTC
The git master with the 2 patches seems to work, although I did get some new crashes in mesa/drm applications.   This may be a compiler issue, as I had some ICEs when building mesa at high levels of optimization.

Comment 21 Michel Dänzer 2009-12-02 15:03:15 UTC
*** Bug 25397 has been marked as a duplicate of this bug. ***
Comment 22 Michel Dänzer 2009-12-04 00:09:18 UTC
The fix has finally landed on xserver master.
