Bug 50554 - [snb sna] Artifacts while using Matlab
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Xorg Project Team
Reported: 2012-05-31 12:29 UTC by Christoph Reiter
Modified: 2012-10-24 11:28 UTC

Attachments
library-blocks (81.99 KB, image/png), 2012-05-31 12:29 UTC, Christoph Reiter
text garbage (40.15 KB, image/png), 2012-05-31 12:29 UTC, Christoph Reiter
X failed (6.33 KB, text/plain), 2012-06-01 00:59 UTC, Christoph Reiter
dmesg linux 3.4 (58.16 KB, text/plain), 2012-06-01 00:59 UTC, Christoph Reiter
gdm log (5.11 KB, text/plain), 2012-06-02 00:38 UTC, Christoph Reiter
gdm log 2 (6.94 KB, text/plain), 2012-06-02 01:35 UTC, Christoph Reiter
full backtrace (4.68 KB, text/plain), 2012-06-02 03:22 UTC, Christoph Reiter

Description Christoph Reiter 2012-05-31 12:29:18 UTC
Created attachment 62345
library-blocks

UXA is OK, SNA is not. See attachments.

Two kinds of artifacts:
 - Single color blocks in the library browser
 - Text garbage with long lines in the code editor

While editing some text in the (Matlab) editor, the X server crashed twice today and Matlab crashed once.

 - Debian unstable
 - Metacity with Compositing enabled
 - Sandy Bridge: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
 - 3.3 kernel
 - xf86-video-intel: dcc7ba8ccf95db1c265b (Thu May 31 17:48:40 2012)
Comment 1 Christoph Reiter 2012-05-31 12:29:51 UTC
Created attachment 62346
text garbage
Comment 2 Chris Wilson 2012-05-31 12:41:47 UTC
A few things: can you try and grab a 3.4 kernel, and attach an Xorg.0.log with --enable-debug=full and a dmesg?
Comment 3 Christoph Reiter 2012-06-01 00:59:04 UTC
Created attachment 62361
X failed

Linux debian 3.4.0 #2 SMP Fri Jun 1 09:05:11 CEST 2012 x86_64 GNU/Linux

X doesn't come up with --enable-debug=full
Comment 4 Christoph Reiter 2012-06-01 00:59:54 UTC
Created attachment 62362
dmesg linux 3.4
Comment 5 Chris Wilson 2012-06-01 01:28:09 UTC
(In reply to comment #3)
> X doesn't come up with --enable-debug=full

From the truncated log file, I guess a miscompilation that pulled in a missing symbol. If you look at the stderr, it should be reported there.

At the moment, the only way I can hope to have a clue here is through that log file since I have no way of running matlab locally.
Comment 6 Chris Wilson 2012-06-01 11:34:01 UTC
Can you also add testing without a compositing WM to the list?
Comment 7 Christoph Reiter 2012-06-01 13:56:13 UTC
Do you mean stderr of the compilation? (I always build a debian package, maybe there is something missing there?)

Without compositing the blocks are gone, but the text is still weird.
Comment 8 Christoph Reiter 2012-06-01 13:58:26 UTC
Also no blocks with mutter + compositing.
Comment 9 Chris Wilson 2012-06-01 14:28:39 UTC
The failed X log mentions VT 52 which is also weird. Can you also attach your login manager's logfile (e.g. /var/log/gdm/:0.log or /var/log/xdm.log)?
Comment 10 Christoph Reiter 2012-06-02 00:38:49 UTC
Created attachment 62406
gdm log
Comment 11 Chris Wilson 2012-06-02 00:56:59 UTC
You're trying to start an Xserver from within an Xserver? The initial sanity checks that we can perform modesetting are not going to like that...
Comment 12 Christoph Reiter 2012-06-02 01:35:23 UTC
Created attachment 62408
gdm log 2

Here is another one.

There are about 4000 log files just for that one start.

There are some more errors: "no free VT", "I/O error".. but I guess they come from X trying to start a few hundred times.
Comment 13 Chris Wilson 2012-06-02 01:45:52 UTC
Go to a VT and start X by hand:
$ sudo service gdm3 stop
$ sudo gdb --args X -ac -noreset
Comment 14 Christoph Reiter 2012-06-02 02:39:09 UTC
Program received signal SIGSEGV, Segmentation fault.
0x00007fe8375ec476 in sna_output_attach_edid (output=0x7fe83c8f5850) at ../../../src/sna/sna_display.c:1108
1108			mon = xf86InterpretEDID(output->scrn->scrnIndex,
Comment 15 Chris Wilson 2012-06-02 02:49:05 UTC
'bt full' please :)
Comment 16 Christoph Reiter 2012-06-02 02:56:05 UTC
I can't access a terminal after starting X in gdb.
That is all I got from "set logging on" and a sysrq reboot after a few seconds.

Never used gdb, I'll give it another try.
Comment 17 Chris Wilson 2012-06-02 03:09:58 UTC
Ah, you could either use a second machine and ssh in, or dump core instead of attaching gdb (ulimit -c 1000000000; X -core; gdb X $coredump).

As far as I can tell, this is a local issue to your build, something is seriously broken.
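
For reference, the core-dump route above can be scripted so that no interactive gdb session is needed. This is only a rough sketch: the helper name `backtrace_from_core` is made up for illustration, and both the server binary path and the core file location are placeholders that depend on the local setup (the core location is governed by /proc/sys/kernel/core_pattern).

```shell
#!/bin/sh
# Hypothetical helper: print a full backtrace from an X server core file
# using gdb in batch mode. Both arguments are placeholders to fill in.
backtrace_from_core() {
    binary=$1   # the X server binary that produced the core
    core=$2     # path of the core file written by the kernel
    # -batch exits after running the commands; -ex runs one gdb command.
    gdb -batch -ex 'bt full' "$binary" "$core"
}

# Usage sketch, as root on a VT with the display manager stopped:
#   ulimit -c unlimited          # lift the core size limit for this shell
#   X -core                      # -core asks the server to dump core on fatal errors
#   backtrace_from_core "$(command -v X)" ./core > backtrace.txt
```

The resulting backtrace.txt is what 'bt full' would have printed interactively, so it can be attached to the bug as-is.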
Comment 18 Christoph Reiter 2012-06-02 03:22:06 UTC
Created attachment 62415
full backtrace

"handle SIGSEGV nostop" did it

If you think this is a problem on my side, you can close it, no prob. I'll just keep using uxa.

Thanks for investigating and your fast responses.
Comment 19 Chris Wilson 2012-06-02 03:49:11 UTC
I've pushed a variation on the sna_output_attach_edid() that'll at least print another DBG. Do you mind giving that another go?
Comment 20 Chris Wilson 2012-06-02 05:39:24 UTC
Spotted that silliness: an uninitialized value on the stack led to memory corruption. Now definitely worth trying again.
Comment 21 Christoph Reiter 2012-06-02 06:39:20 UTC
Yes, worked now.

Xorg.0.log [1]: The last 30 seconds or so was clicking around and generating black and grey boxes every ~0.5 seconds.

[1] https://www.dropbox.com/s/uh6o7g8owx2waiv/Xorg.0.log.tar.xz
Comment 22 Chris Wilson 2012-06-04 03:29:47 UTC
Just so that I can be clear here, can you confirm the results with compositing:

metacity - no compositing - OK
metacity - compositing - BAD
mutter - compositing - OK

And to complete the picture, could you quickly try xfce4 with and without compositing enabled?
Comment 23 Christoph Reiter 2012-06-04 10:44:06 UTC
Correct.

With latest trunk again:

metacity - compositing - BAD
xfwm4 - compositing - BAD

metacity - no compositing - OK
mutter - compositing - OK
xfwm4 - no compositing - OK
Comment 24 Chris Wilson 2012-06-19 05:34:59 UTC
Given the number of bugs fixed to make Zdenek happy, please can you retest with the latest code?
Comment 25 Christoph Reiter 2012-06-19 07:58:26 UTC
The text corruption seems to be gone. The block artifacts are still the same.

This is on a 3.2 kernel and current trunk (0a43d425670b883b0), hope that's OK.
Comment 26 Chris Wilson 2012-07-01 11:22:18 UTC
Ok, still none the wiser for tracking down the origin of those grey rectangles - I have not yet spotted any operations that would seem to correspond. In the meantime, I have tweaked the code to hopefully reduce the amount of ping-pong being caused by Matlab, and so it should render more smoothly now. I would appreciate it if you could take the time to upload a new debug=full backtrace.
Comment 27 Christoph Reiter 2012-07-01 12:29:00 UTC
https://www.dropbox.com/s/t95itrndqqjal09/Xorg.0.log.xz

With debug full I couldn't get the big boxes, but the last 5-10 seconds have some redraw problems combined with lots of debug output (and freezes because the disk couldn't keep up writing).

Without debug I get the same as above + the big boxes all over the place.

Also this is the second time I'm writing this since Xorg just segfaulted (in intel_drv.so)...
Comment 28 Chris Wilson 2012-07-01 13:14:57 UTC
Do you mind pasting the contents of /sys/kernel/debug/dri/0/i915_fbc_status and also the backtrace of the recent crash (if you can translate the addresses using addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x... that would be most useful)?
Comment 29 Christoph Reiter 2012-07-01 14:29:52 UTC
"FBC disabled: disabled per module param (default off)"

The Xorg.log is gone now.. I'll remember it for the next time.
Comment 30 Chris Wilson 2012-07-02 10:47:31 UTC
I tweaked the heuristics for CopyArea to hopefully avoid the spurious upload in your trace.

commit e7b31b6d0a32f76db4a8aef64c77d4afe808fb6c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jul 2 14:01:36 2012 +0100

    sna: Consolidate CopyArea with the aim of reducing migration ping-pong

Do you mind once again updating and grabbing a new debug=full log? All the while keeping track of any regressions. :)
Comment 31 Christoph Reiter 2012-07-04 14:42:06 UTC
Haven't had the time to test, but it just crashed again:

[ 46747.153] 4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7f3ffdf25000+0x34e6e) [0x7f3ffdf59e6e]
[ 46747.153] 5: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7f3ffdf25000+0x53e4e) [0x7f3ffdf78e4e]

addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x34e6e 0x53e4e

/build/src/sna/../../../src/sna/blt.c:212
/build/src/sna/../../../src/sna/sna_accel.c:3219
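
Note that the values given to addr2line here are the module-relative offsets after the `+` in each frame, not the absolute addresses in square brackets. A small sketch for pulling those offsets out of a log automatically, assuming the frames look like the two lines quoted above; the function name `frame_offsets` is made up for illustration, and /var/log/Xorg.0.log is only the usual default log location:

```shell
#!/bin/sh
# Hypothetical helper: read Xorg log lines on stdin and print, one per line,
# the module-relative offset (the part after '+') of every backtrace frame
# that belongs to intel_drv.so. Lines that do not match are skipped.
frame_offsets() {
    sed -n 's/.*intel_drv\.so (0x[0-9a-fA-F]*+\(0x[0-9a-fA-F]*\)).*/\1/p'
}

# Usage sketch (needs binutils and a driver built with debug symbols):
#   frame_offsets < /var/log/Xorg.0.log \
#       | xargs addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so
```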
Comment 32 Chris Wilson 2012-07-04 14:52:27 UTC
Which checkout? There were a few regressions from e7b31b6d.
Comment 33 Christoph Reiter 2012-07-04 14:55:50 UTC
Current HEAD. I can reproduce it now; here is the log with debug full:

https://www.dropbox.com/s/h8o61fo2xiymd4m/Xorg-crash.xz
Comment 34 Chris Wilson 2012-10-21 18:09:10 UTC
I can hope one of the many fixes since then has helped...

Christoph, can you still reproduce? Do you mind generating a new debug log?
Comment 35 Christoph Reiter 2012-10-24 11:28:37 UTC
Can't reproduce anymore. Thanks.

