The xorg server crashes reproducibly within the intel driver. To reproduce, use firefox 11 and go to any twitter page, such as http://twitter.com/#!/therealcwilson

Program received signal SIGSEGV, Segmentation fault.
0x00007f1211bec837 in inplace_row (width=86, row=0x7fff17f0fbb0 "\377", active=0x7fff17f0f7c0) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:1503
1503 ../xf86-video-intel-9999/src/sna/sna_trapezoids.c: No such file or directory.
in ../xf86-video-intel-9999/src/sna/sna_trapezoids.c
(gdb) bt
#0 0x00007f1211bec837 in inplace_row (width=86, row=0x7fff17f0fbb0 "\377", active=0x7fff17f0f7c0) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:1503
#1 tor_inplace (converter=0x7fff17f0eea0, buf=0x7fff17f0fbb0 "\377", scratch=<optimized out>, mono=<optimized out>) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:1729
#2 0x00007f1211bef9b4 in trapezoid_mask_converter (op=3 '\003', src=<optimized out>, dst=<optimized out>, maskFormat=<optimized out>, src_x=<optimized out>, src_y=<optimized out>, ntrap=16, traps=0x2e4b50c) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:3547
#3 0x00007f1211bf616d in sna_composite_trapezoids (op=<optimized out>, src=<optimized out>, dst=0x2e38b40, maskFormat=0x194f368, xSrc=<optimized out>, ySrc=<optimized out>, ntrap=16, traps=0x2e4b50c) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:4484
#4 0x00000000004fb921 in ProcRenderTrapezoids (client=0x2a4ee70) at /usr/src/debug/x11-base/xorg-server-1.12.0/xorg-server-1.12.0/render/render.c:777
#5 0x0000000000437a91 in Dispatch () at /usr/src/debug/x11-base/xorg-server-1.12.0/xorg-server-1.12.0/dix/dispatch.c:439
#6 0x00000000004264ca in main (argc=<optimized out>, argv=0x7fff17f0ff48, envp=<optimized out>) at /usr/src/debug/x11-base/xorg-server-1.12.0/xorg-server-1.12.0/dix/main.c:287
Doesn't trigger the issue here. Can you please tell me which version of the driver you are currently using?
Xorg.log would be useful for the identification of your system, and valgrind would help identify the bug and/or recompiling with no optimisation and printing the locals.
The crash happens on xorg-server 1.12 (compiled from source on Gentoo), with commit 63c0d10faee3c7cca050505c2e81c416119e57e9 of the xf86-video-intel driver. 3D acceleration is provided by Mesa (git-c079574). This is a 64-bit kernel (3.2.11 + tuxonice patches). The desktop environment is KDE, with KWin compositing enabled. Firefox 11 (also compiled from source) seems to trigger this bug on many different sites.
Created attachment 58574 [details] Xorg log
Created attachment 58575 [details] Xorg log with crash backtrace
I've tried this on gen2-6 with firefox 11/12 and with cairo-1.10.2/cairo-1.11.4. And still not reproduced your crash. Can you please just start X under valgrind and launch firefox? And perhaps provide a disassembly of your inplace_row()?
This crash may very well be compiler optimization related. I cannot reproduce it with these flags:

CFLAGS="-ggdb -O0 -march=core2 -pipe"
CXXFLAGS=${CFLAGS}

These flags exhibit the crash behaviour on gcc-4.6.2:

CFLAGS="-ggdb -O2 -march=core2 -mssse3 -msse4.1 -mno-sse4.2 -fno-builtin-memcmp -funit-at-a-time -pipe -ftree-vectorize -floop-interchange -floop-strip-mine -floop-block -pipe"
CXXFLAGS="${CFLAGS}"

I have been able to bisect the code and the first bad commit appears to be http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=fba49e1bb8e5b6b0e3ceace2dbddb5796ece954e

    sna/traps: Fix off-by-one for filling vertical segments in tor_inplace

    If the last solid portion was exactly 4-pixels wide, we would miss filling in the mask.
Can you try:

commit e31d9dacafe060dc86de801114b475fdd0142eb6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Mar 17 09:21:00 2012 +0000

    sna/traps: Align indices for unrolled memset in row_inplace()

    The compiler presumes that the uint64_t write is naturally aligned and so may emit code that crashes with an unaligned moved. To workaround this, make sure the write is so aligned.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=47418
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
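As a rough illustration of what "aligning the indices" could look like, here is a minimal sketch. The helper names `align_up8` and `head_bytes` are hypothetical and are not taken from the driver; the sketch also assumes that the row base pointer is itself 8-byte aligned, since rounding an index alone says nothing about the final address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the driver): round a byte index up to the
 * next multiple of 8. A wide store at row + align_up8(lix) is only aligned
 * if row itself is 8-byte aligned, which is an assumption of this sketch. */
static int align_up8(int index)
{
    assert(index >= 0);
    return (index + 7) & ~7;
}

/* Number of single-byte writes needed before the first 8-byte store. */
static int head_bytes(int lix)
{
    return align_up8(lix) - lix;
}
```

With this approach a span starting at index 5 would get three single-byte writes, after which 8-byte stores can begin at index 8.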
With commit e31d9dacafe060dc86de801114b475fdd0142eb6 I can no longer reproduce the crash. Many thanks!
Something is still not right in inplace_row (sna_trapezoids.c). With the latest version I get the following crash when starting gimp:

Program received signal SIGSEGV, Segmentation fault.
0x00007fe86258e98f in inplace_row (width=25, row=0x2137a0c "\377\377\377\377\377\377\377\377", active=0x7fffe3ff39a0) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:1515
1515 ../xf86-video-intel-9999/src/sna/sna_trapezoids.c: No such file or directory.
in ../xf86-video-intel-9999/src/sna/sna_trapezoids.c
(gdb) bt
#0 0x00007fe86258e98f in inplace_row (width=25, row=0x2137a0c "\377\377\377\377\377\377\377\377", active=0x7fffe3ff39a0) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:1515
#1 tor_inplace (converter=0x7fffe3ff3180, buf=0x0, scratch=<optimized out>, mono=<optimized out>) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:1752
#2 0x00007fe862593f12 in trapezoid_span_fallback (op=3 '\003', src=<optimized out>, dst=<optimized out>, maskFormat=<optimized out>, src_x=<optimized out>, src_y=<optimized out>, ntrap=9, traps=0x20bb654) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:4377
#3 0x00007fe86259815d in sna_composite_trapezoids (op=<optimized out>, src=<optimized out>, dst=0x1c36e60, maskFormat=0x19973c8, xSrc=<optimized out>, ySrc=<optimized out>, ntrap=9, traps=0x20bb654) at ../xf86-video-intel-9999/src/sna/sna_trapezoids.c:4570
#4 0x00000000004fb921 in ProcRenderTrapezoids (client=0x1c38de0) at /usr/src/debug/x11-base/xorg-server-1.12.0-r1/xorg-server-1.12.0/render/render.c:777
#5 0x0000000000437a91 in Dispatch () at /usr/src/debug/x11-base/xorg-server-1.12.0-r1/xorg-server-1.12.0/dix/dispatch.c:439
#6 0x00000000004264ca in main (argc=<optimized out>, argv=0x7fffe3ff40d8, envp=<optimized out>) at /usr/src/debug/x11-base/xorg-server-1.12.0-r1/xorg-server-1.12.0/dix/main.c:287
Just to check, this is still with -O3 etc? I think the culprit this time is that the row itself is not 8-byte aligned going into the function: row=0x2137a0c. Let me change the alignment preamble to take that into account.
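The alignment observation can be checked by hand from the two backtraces. The sketch below uses a hypothetical helper, `misalignment8`, that is not part of the driver; it simply reports how far an address sits past the previous 8-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the driver): distance in bytes of an
 * address from the previous 8-byte boundary. Zero means 8-byte aligned. */
static unsigned misalignment8(uint64_t addr)
{
    return (unsigned)(addr & 7);
}

/* Convenience predicate built on the helper above. */
static int is_aligned8(uint64_t addr)
{
    return misalignment8(addr) == 0;
}
```

Applied to the values in this thread, the row pointer from the gimp crash (0x2137a0c) is 4 bytes past an 8-byte boundary, whereas the row pointer from the original firefox crash (0x7fff17f0fbb0) is 8-byte aligned.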
Can you try...?

commit ee075ced844350785685a0f93f88f1dc310bcc73
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Mar 30 19:09:30 2012 +0100

    sna/traps: Align the pointer not the indices

    Magnus found that inplace_row was still crashing on his setup when it tried to perform an 8-byte aligned write to an unaligned pointer. This time it looks like the row pointer itself was not 8-byte aligned, so instead of assuming that and fixing up the indices, ensure that the (index+row) results in an 8-byte aligned value.

    Reported-by: Magnus Kessler <Magnus.Kessler@gmx.net>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47418
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
No luck. The crash still happens at line 1515 (i.e. right after the changes made in this commit). And yes, this is still with the optimizations mentioned previously.
Hmm, can you attach gdb and paste another backtrace along with a disassembly and as many locals as gdb can find?
Created attachment 59289 [details] [review]
Align the row[index]

I've proven myself an imbecile too often tonight, so let's test this patch first. :(
This patch fixes gimp crashing on startup. For that, it's a

Tested-by: Magnus Kessler <Magnus.Kessler@gmx.net>

However, the drop-down menus in gimp show severe rendering artefacts, both in their text and icons. It looks mostly like every second row and column is missing there, but sometimes multiple rows or columns are left blank. The menu entries get their normal look back once the mouse moves over them. I'm not sure if this is a consequence of your fix, or a completely different issue. Other GTK+ applications (notably Firefox) look OK.
Thanks.

commit 6f2814db6f7b89e94e54b8d73c7e176ab7d1c469
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Mar 30 20:45:55 2012 +0100

    sna/traps: Align the pointer+index

    It's the location of the pixels within the row that matter for alignment!

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47418
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Magnus Kessler <Magnus.Kessler@gmx.net>

I had a quick look at the gimp, poking various menus and drop-down lists, nothing struck me out of the ordinary. First can you try running without optimisations and see if still occurs? Are you able to capture it in a screenshot (just so I can be sure I was poking in the right places!)?
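The idea of aligning the pointer+index can be sketched as follows. This is not the driver's inplace_row(); `fill_span` is a hypothetical stand-alone function that sets row[lix..rix) to 0xff, writing single bytes until the actual write address is 8-byte aligned and only then switching to 8-byte stores. The sketch uses memcpy for the wide store to stay within defined behaviour, whereas the commit messages above describe a direct uint64_t write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch (not the driver code): set row[lix..rix) to 0xff. */
static void fill_span(uint8_t *row, int lix, int rix)
{
    uint8_t *p = row + lix;     /* first byte to write */
    uint8_t *end = row + rix;   /* one past the last byte to write */
    const uint64_t ones = UINT64_MAX;

    assert(lix <= rix);

    /* Head: single bytes until the write address itself (row + index),
     * not merely the index, is 8-byte aligned. */
    while (p < end && ((uintptr_t)p & 7) != 0)
        *p++ = 0xff;

    /* Body: p is now 8-byte aligned, so 8-byte stores are safe. */
    while (end - p >= 8) {
        memcpy(p, &ones, sizeof(ones));
        p += 8;
    }

    /* Tail: whatever is left, one byte at a time. */
    while (p < end)
        *p++ = 0xff;
}
```

Because the head loop tests the address rather than the index, the same code works whether the row base is aligned (as in the first backtrace) or not (as in the gimp backtrace).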
Created attachment 59303 [details]
Rendering artefacts in gimp menu

The rendering artefacts appear with or without optimization. They come in a variety of patterns, some of which appear in the attached screenshot. I have observed horizontal, vertical, and even diagonal stripes of missing pixels.
That's definitely a different level of corruption. And the damage is persistent (until re-rendered), so data loss on upload I'd guess. Can you grab a whole screenshot so I can try to work out which areas are most affected (and so any pattern behind the corruption)? Also can you compile with --enable-debug=full and attach the full Xorg.0.log for a gimp session (or the last 1 MiB I guess will do)? And if you could compile http://cgit.freedesktop.org/~ickle/linux-2.6/ using the vmap branch and xf86-video-intel with --enable-sna --enable-vmap that would help with one query. (I'll see if I can reproduce this on a stock kernel as well.)
I went back to a stock Ubuntu kernel (3.2.0) without vmap on 965gm (which should be close enough to your gm45) and nothing unusual happened. Can I ask you to dig a little deeper and see if you can find the trigger? Try different WM, different themes and different kernels.
Can you please retest with

commit 7f0bede3e7e3f92a637d1c886304b16afc0e34f2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Apr 9 10:48:08 2012 +0100

    sna/traps: Use a temporary variable for the write pointer

Though as I have not reproduced the corruption you've seen, I can't be certain if this is related. Hopefully it is if the corruption only started after the introduction of tor_inplace().
(In reply to comment #21)
> Can you please retest with
>
> commit 7f0bede3e7e3f92a637d1c886304b16afc0e34f2
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Mon Apr 9 10:48:08 2012 +0100
>
> sna/traps: Use a temporary variable for the write pointer

No, this commit doesn't make any difference regarding the corruption I'm seeing.
*** Bug 48619 has been marked as a duplicate of this bug. ***
Looks like the common element here is KDE? If you try a different WM (and DE) does the issue persist?
As an aside, having chased yet another bug due to an interaction with gcc, can either of you confirm if this bug is still present if you recompile the xserver and the xf86-video-intel (maybe even pixman) with -O0?
The problems with font rendering appear indeed to be related to the use of compositing in KDE's kwin window manager. If I use no window manager, or even turn compositing off in kwin, the font corruption no longer appears. The optimization level in gcc makes no difference, and xorg-server, xf86-video-intel and pixman compiled with -O0 have the same issue.
Hmm, can either of you reproduce with ./configure --enable-sna --enable-debug=full and attach the resulting Xorg.0.log (it will be huge)?
Gentoo bug https://bugs.gentoo.org/show_bug.cgi?id=409593 suggests that the font corruption is due to some changes in cairo after 1.11.2. And indeed, after downgrading cairo to 1.11.2, I no longer observe the problem, even with compositing enabled in kwin. The Gentoo bug points to https://bugs.freedesktop.org/show_bug.cgi?id=47266#c142, which claims to have bisected to cairo commit af9fbd176b145f042408ef5391eef2a51d7531f8 ("Introduce a new compositor architecture").
(In reply to comment #28)
> Gentoo bug https://bugs.gentoo.org/show_bug.cgi?id=409593 suggests, that the
> font corruption is due to some changes in cairo after 1.11.2. And indeed, after
> downgrading cairo to 1.11.2, I no longer observe the problem, even with
> compositing enabled in kwin.

They are not bugs in cairo, but do suggest which upload path is going wrong.
I did make some tweaks to the upload buffers and idle detection which I feel at least touch the implicated code paths here, so I'd appreciate it if you could give me a status update on the occurrence of this bug. Thanks.
All menus in gimp-2.6.x now render correctly with current versions of xf86-video-intel, libdrm and mesa.
There are a number of standout commits in the interval between tests. However, I'm going to tentatively take this as finally fixed. Thanks for the bug report and all the testing! Keep your eyes peeled for further issues...