Bug 70924 - Xorg memory leak with >=xf86-video-intel-2.99.901 and above
Summary: Xorg memory leak with >=xf86-video-intel-2.99.901 and above
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-10-27 17:04 UTC by Ognian Tenchev
Modified: 2013-11-11 14:18 UTC
0 users

See Also:
i915 platform:
i915 features:


Attachments
xresetop (16.00 KB, text/plain)
2013-10-27 17:24 UTC, Ognian Tenchev
beore (51.22 KB, image/png)
2013-10-28 20:17 UTC, Ognian Tenchev
after (50.97 KB, image/png)
2013-10-28 20:17 UTC, Ognian Tenchev
after2 (47.07 KB, image/png)
2013-10-28 20:18 UTC, Ognian Tenchev
after3 (3.08 KB, image/png)
2013-10-28 20:18 UTC, Ognian Tenchev
gkrellm artefacts (9.78 KB, image/png)
2013-10-28 20:22 UTC, Ognian Tenchev
dmesg after X was killed because out of memory (35.80 KB, text/plain)
2013-10-31 16:53 UTC, Ognian Tenchev
Xorg.log after crash (78.67 KB, text/plain)
2013-10-31 16:54 UTC, Ognian Tenchev
half OK image (1.74 MB, image/png)
2013-11-01 16:34 UTC, Ognian Tenchev
Xorg.log after crash (24.29 KB, text/plain)
2013-11-01 21:41 UTC, Ognian Tenchev
Xorg.log after crash (24.24 KB, text/plain)
2013-11-02 19:17 UTC, Ognian Tenchev
gdm log (2.16 KB, text/plain)
2013-11-02 23:39 UTC, Ognian Tenchev
gdm log with full debug (57.97 KB, text/plain)
2013-11-04 11:55 UTC, Ognian Tenchev
Xorg log with full debug (56.30 KB, text/plain)
2013-11-04 11:55 UTC, Ognian Tenchev
gdm log after X oom (12.33 KB, text/plain)
2013-11-04 19:46 UTC, Ognian Tenchev
Xorg log after update to dc61705a6e425952de4c81c2320382af07cf948a (61.18 KB, text/plain)
2013-11-05 11:28 UTC, Ognian Tenchev
gdm log after update to dc61705a6e425952de4c81c2320382af07cf948a (40.27 KB, text/plain)
2013-11-05 11:29 UTC, Ognian Tenchev
firefox titlebar (16.10 KB, image/png)
2013-11-05 13:37 UTC, Ognian Tenchev
wrong image (148.44 KB, image/png)
2013-11-05 13:46 UTC, Ognian Tenchev
top of the image is displayed wrong (326.09 KB, image/png)
2013-11-05 13:49 UTC, Ognian Tenchev
large image scaled with distortion (944.03 KB, image/png)
2013-11-06 10:47 UTC, Ognian Tenchev
gdm log after X crash (2.12 KB, text/plain)
2013-11-06 15:36 UTC, Ognian Tenchev
no full image displayed (680.66 KB, image/png)
2013-11-06 16:02 UTC, Ognian Tenchev
another part of image is missing (686.88 KB, image/png)
2013-11-06 16:03 UTC, Ognian Tenchev
and some black rectangles (1.12 MB, image/png)
2013-11-06 16:03 UTC, Ognian Tenchev
orphaned symbols on Geany editor (41.29 KB, image/png)
2013-11-08 17:43 UTC, Ognian Tenchev

Description Ognian Tenchev 2013-10-27 17:04:17 UTC
Somewhere between
xf86-video-intel-2.21.15
and
xf86-video-intel-2.99.901
my Xorg started to use a lot of RAM. With 2.21 and below, RES memory stays below 30MB (22-26MB), but with 2.99.901 and above, RES memory starts to climb and can reach 500-600MB (maybe more, but I restart X at that point).

It happens when I open a big JPEG image in Firefox - something more than 4000-5000px wide. With 2.99.901, large images are also distorted below half height; with 2.99.905 the images are OK, apart from the memory use :) Sometimes usage drops back to around 100MB, sometimes it just eats more and more RAM.

00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)

I don't use xorg.conf
Comment 1 Chris Wilson 2013-10-27 17:06:12 UTC
xrestop?
Comment 2 Ognian Tenchev 2013-10-27 17:10:01 UTC
(In reply to comment #1)
> xrestop?

Do you need xrestop output from 2.99.905?
Comment 3 Chris Wilson 2013-10-27 17:14:52 UTC
I need xrestop and /sys/kernel/debug/dri/0/i915_gem_objects from the supposed leak.
Comment 4 Ognian Tenchev 2013-10-27 17:24:24 UTC
(In reply to comment #3)
> I need xrestop and /sys/kernel/debug/dri/0/i915_gem_objects from the
> supposed leak.

cat /sys/kernel/debug/dri/0/i915_gem_objects
416 objects, 373006336 bytes
62 [62] objects, 112177152 [112177152] bytes in gtt
  0 [0] active objects, 0 [0] bytes
  62 [62] inactive objects, 112177152 [112177152] bytes
199 unbound objects, 57036800 bytes
0 purgeable objects, 0 bytes
23 pinned mappable objects, 24096768 bytes
16 fault mappable objects, 48234496 bytes
268435456 [268435456] gtt total

X: 414 objects, 372871168 bytes (0 active, 43479040 inactive, 57036800 unbound)
xfwm4: 0 objects, 0 bytes (0 active, 0 inactive, 0 unbound)
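
For comparing several of these snapshots, the per-client lines can be pulled out with a short script. This is only a sketch: the regular expression is written against the output pasted above, the helper name `parse_clients` is made up for illustration, and other kernel versions may print the file differently.

```python
import re

# Matches per-client lines such as:
#   X: 414 objects, 372871168 bytes (0 active, 43479040 inactive, 57036800 unbound)
CLIENT_RE = re.compile(
    r"^(?P<name>\S+): (?P<objects>\d+) objects, (?P<bytes>\d+) bytes "
    r"\((?P<active>\d+) active, (?P<inactive>\d+) inactive, (?P<unbound>\d+) unbound\)$"
)

def parse_clients(text):
    """Return {client name: {field: int}} for every per-client line in the dump."""
    clients = {}
    for line in text.splitlines():
        m = CLIENT_RE.match(line.strip())
        if m:
            fields = m.groupdict()
            name = fields.pop("name")
            clients[name] = {key: int(value) for key, value in fields.items()}
    return clients

# Lines copied from the dump above; the summary line is skipped by the regex.
SAMPLE = """\
416 objects, 373006336 bytes
X: 414 objects, 372871168 bytes (0 active, 43479040 inactive, 57036800 unbound)
xfwm4: 0 objects, 0 bytes (0 active, 0 inactive, 0 unbound)
"""

clients = parse_clients(SAMPLE)
print(round(clients["X"]["bytes"] / 2**20, 1), "MiB held by X")
```

On a live system the text would come from reading /sys/kernel/debug/dri/0/i915_gem_objects as root.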
Comment 5 Ognian Tenchev 2013-10-27 17:24:49 UTC
Created attachment 88187
xresetop
Comment 6 Chris Wilson 2013-10-27 17:33:09 UTC
The number of objects here does not seem disproportionate to the amount of pixmaps allocated by the clients.
Comment 7 Ognian Tenchev 2013-10-27 17:37:04 UTC
(In reply to comment #6)
> The number of objects here does not seem disproportionate to the amount of
> pixmaps allocated by the clients.

I have no idea what that means, but RES for X is now 400MB:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                         
 4155 root      20   0  489552 400068 385772 S   4.8 12.9   1:05.22 X                                                             

I can also see rendering glitches: in LibreOffice, the highlighted menu item is missing its background :) And in Geany, text is drawn again below the last row :D
Comment 8 Ognian Tenchev 2013-10-28 19:38:37 UTC
Hm ... I switched to UXA with 2.99.905 in xorg.conf, and now Xorg RES memory stays around 22-26MB, just like with 2.21.15.

2.99.x uses SNA by default, and 2.21.x probably uses UXA by default ... but why is there so much difference in Xorg memory usage between UXA and SNA? And why does SNA just keep eating RAM ...

I worked with SNA all day today and Xorg RES memory stayed around 80MB, but I never opened a large JPEG. Then I opened one large JPEG in Firefox, and memory jumped to 300MB and kept growing to 500MB ...
Comment 9 Chris Wilson 2013-10-28 19:42:34 UTC
Because the system doesn't report the memory that UXA uses to the process, whereas SNA uses CPU mappings of the bo that do show up in RES.

cat /sys/kernel/debug/dri/0/i915_gem_objects
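
Since RES on its own cannot separate a leak from mapped buffer objects, it may help to log it next to each i915_gem_objects snapshot. A minimal sketch, assuming a Linux /proc filesystem; the function names are illustrative and the sample text is hand-written to mirror the RES figure from comment 7, not captured from the reporter's machine.

```python
def rss_kib(status_text):
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None

def read_rss_kib(pid):
    """Read the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return rss_kib(f.read())

# Hand-written sample in the /proc/<pid>/status layout (illustrative values).
SAMPLE = "Name:\tX\nVmSize:\t  489552 kB\nVmRSS:\t  400068 kB\n"
print(rss_kib(SAMPLE))
```

`read_rss_kib(4155)` would return the same number top shows in the RES column for that PID, in KiB.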
Comment 10 Ognian Tenchev 2013-10-28 20:17:03 UTC
OK - thanks for the explanation.

But I still think that something is broken.

I switched back to SNA in xorg.conf and:

# SNA - before
cat /sys/kernel/debug/dri/0/i915_gem_objects
481 objects, 185946112 bytes
245 [241] objects, 248385536 [245698560] bytes in gtt
  0 [0] active objects, 0 [0] bytes
  245 [241] inactive objects, 248385536 [245698560] bytes
165 unbound objects, 60375040 bytes
36 purgeable objects, 25972736 bytes
23 pinned mappable objects, 24096768 bytes
35 fault mappable objects, 22155264 bytes
268435456 [268435456] gtt total

X: 479 objects, 185810944 bytes (0 active, 123117568 inactive, 60375040 unbound)
xfwm4: 0 objects, 0 bytes (0 active, 0 inactive, 0 unbound)

Then I started Firefox and opened this image: http://dna-bucket-dna.cf.rawcdn.com/files/116016/original/_14P2868.jpg

# SNA - after
cat /sys/kernel/debug/dri/0/i915_gem_objects
362 objects, 208642048 bytes
57 [57] objects, 106934272 [106934272] bytes in gtt
  0 [0] active objects, 0 [0] bytes
  57 [57] inactive objects, 106934272 [106934272] bytes
127 unbound objects, 33849344 bytes
0 purgeable objects, 0 bytes
23 pinned mappable objects, 24096768 bytes
10 fault mappable objects, 41943040 bytes
268435456 [268435456] gtt total

X: 360 objects, 208506880 bytes (0 active, 43421696 inactive, 33849344 unbound)
xfwm4: 0 objects, 0 bytes (0 active, 0 inactive, 0 unbound)

Now I start to see artifacts on my screen. I will attach screenshots to this bug.
before - LibreOffice normal menu before opening the large image in Firefox
after - same menu after opening the large image in Firefox (missing background on the selected item)
after2 - two extra rows of text are drawn below the actual last line of text - Geany editor :)
after3 - buttons in a LibreOffice dialog with strange borders ...

So something clearly breaks after I open a large image ...
Comment 11 Ognian Tenchev 2013-10-28 20:17:32 UTC
Created attachment 88250
beore
Comment 12 Ognian Tenchev 2013-10-28 20:17:50 UTC
Created attachment 88251
after
Comment 13 Ognian Tenchev 2013-10-28 20:18:08 UTC
Created attachment 88252
after2
Comment 14 Ognian Tenchev 2013-10-28 20:18:25 UTC
Created attachment 88253
after3
Comment 15 Ognian Tenchev 2013-10-28 20:22:08 UTC
Created attachment 88254
gkrellm artefacts

This is gkrellm, which also shows rendering artefacts ...
Comment 16 Chris Wilson 2013-10-30 15:39:04 UTC
I've spent a couple of days valgrinding X running various workloads and not found a leak.
Comment 17 Ognian Tenchev 2013-10-30 15:58:06 UTC
(In reply to comment #16)
> I've spent a couple of days valgrinding X running various workloads and not
> found a leak.

OK. Let's assume that there is no memory leak.

What about the screen corruption then? I have no problems with screen artefacts until I open a large image.

My X has been running for a day now and memory stays around 90MB with SNA. But then again, I have not opened a large image. If I open even one, memory starts to grow and the artefacts are back ... I have to restart X to get rid of the screen corruption.

I will switch to UXA for now. And maybe some day ... it will be fixed :)
Comment 18 Chris Wilson 2013-10-30 16:11:13 UTC
My gut feeling was that they were GPU allocation failures, which we handle by dropping the operation (though it should also try to fall back to using the CPU) rather than crashing.
Comment 19 Ognian Tenchev 2013-10-30 16:16:27 UTC
The weird thing is that X doesn't crash and there is no error anywhere ... dmesg shows nothing, Xorg.log shows nothing.

Thanks for the response btw
Comment 20 Ognian Tenchev 2013-10-30 17:57:37 UTC
OK, this seems to be a Firefox-specific problem with SNA.

I'm now using Google Chrome with SNA and 2.99.905, and I can open as many large images (>5000px wide) as I like (actually 5, but ...) and X never uses more than 50MB of RES memory.

(btw the driver at the latest git, ed282456240cc0a7ae9a235ea8aea14a8b8a54ef, corrupts xfwm4 title bars)
Comment 21 Chris Wilson 2013-10-30 18:15:05 UTC
The difference there is that Firefox stores images in X and uses GPU acceleration; Chromium does not.
Comment 22 Chris Wilson 2013-10-30 18:56:10 UTC
And the recent corruption in the titlebar is from:

commit c6b0e3fe0c299488932ba0392847f1faf298d079
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Oct 30 11:52:05 2013 +0000

    sna: Detect and handle mi recursion

Now reverted and the issue with the potential recursion fixed differently.
Comment 23 Chris Wilson 2013-10-31 14:00:08 UTC
What's the current status? Are you able to reproduce the apparent leak or any of the corruption on the latest tip of xf86-video-intel [82e6d41c2f4f343bd1854d3d8ee4b624b5d68971] ?
Comment 24 Ognian Tenchev 2013-10-31 14:28:12 UTC
(In reply to comment #23)
> What's the current status? Are you able to reproduce the apparent leak or
> any of the corruption on the latest tip of xf86-video-intel
> [82e6d41c2f4f343bd1854d3d8ee4b624b5d68971] ?

I did a checkout after comment 22 and the borders are OK. I can't comment on the leak (and the other corruption) yet because I have avoided opening large images while I finish my work :) But later tonight I will test again with Firefox, because I'm pretty sure the problem doesn't exist with Chrome. I have also turned off acceleration in Firefox, so I will report back later.
Comment 25 Ognian Tenchev 2013-10-31 16:04:16 UTC
OK, I have some news. I updated to 82e6d41c2f4f343bd1854d3d8ee4b624b5d68971 and started using Firefox to open large images. X memory goes up to 400+ MB, and then X crashes :) But until then there is no corruption, so it's better than before - I can open more than one image before the crash, with no corruption.

Then I tried again. I opened five or six images with Firefox, then X froze for a couple of seconds. The mouse did not move, but I could switch to the console. When I switched back to X it was alive and had not crashed. But the background was messed up with white and silver rows :) Also, just one window has the same border corruption as with ed282456240cc0a7ae9a235ea8aea14a8b8a54ef. Memory grows, but after closing Firefox it drops to around 100MB.

I can't see any error messages in dmesg or Xorg.log. I remember that some time ago I saw errors in dmesg when X crashed, but now there is no message - nothing.
Comment 26 Chris Wilson 2013-10-31 16:09:19 UTC
(In reply to comment #25)
> OK I have some news. Update to 82e6d41c2f4f343bd1854d3d8ee4b624b5d68971 and
> start use Firefox to open large images. So X memory goes up to 400+ MB. And
> then X crash :) But until that time no corruption so it's better than before
> - I can open more than one image before crash with no corruption.

That's even worse. Please give me a backtrace for the crash.
Comment 27 Ognian Tenchev 2013-10-31 16:14:27 UTC
(In reply to comment #26)
> (In reply to comment #25)
> > OK I have some news. Update to 82e6d41c2f4f343bd1854d3d8ee4b624b5d68971 and
> > start use Firefox to open large images. So X memory goes up to 400+ MB. And
> > then X crash :) But until that time no corruption so it's better than before
> > - I can open more than one image before crash with no corruption.
> 
> That's even worse. Please give me a backtrace for the crash.

I can follow the steps to produce a backtrace, but first I need to know what they are :) Sorry, I'm just a user here ... and give me some time first. I changed some image-related settings in Firefox yesterday and will revert them before I confirm the X crash for sure. Maybe it's my fault ...
Comment 28 Chris Wilson 2013-10-31 16:20:48 UTC
X should never, ever crash. If it does, please grab the Xorg.0.log and file a bug report.
Comment 29 Ognian Tenchev 2013-10-31 16:52:25 UTC
Haha :) It's a crash for sure. I restored the Firefox settings, but this time X froze and I couldn't switch to the console. The mouse was actually working, but very, very slowly ... I just left the laptop, and after a while X went down; it was actually killed because it used too much RAM :)

I will attach dmesg and Xorg.log
Comment 30 Ognian Tenchev 2013-10-31 16:53:44 UTC
Created attachment 88415
dmesg after X was killed because out of memory
Comment 31 Ognian Tenchev 2013-10-31 16:54:13 UTC
Created attachment 88416
Xorg.log after crash
Comment 32 Chris Wilson 2013-10-31 21:10:09 UTC
Oh dear, looks like it recursed into itself. Can you please run 'addr2line -i -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x1f420 0x3c4f9 0x41133 0x71a65 0x411ca'
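
When there are many frames to resolve, the same addr2line call can be assembled from a list of offsets. A small sketch; `addr2line_cmd` is an illustrative helper, and the module path and offsets are simply the ones from this comment.

```python
import shlex

def addr2line_cmd(module, offsets):
    """Build an addr2line invocation (-i: include inlined frames, -e: binary to inspect)."""
    return ["addr2line", "-i", "-e", module] + [hex(offset) for offset in offsets]

cmd = addr2line_cmd(
    "/usr/lib/xorg/modules/drivers/intel_drv.so",
    [0x1f420, 0x3c4f9, 0x41133, 0x71a65, 0x411ca],
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Output consisting only of `??:0` lines means the binary carries no debug information for those addresses.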
Comment 33 Ognian Tenchev 2013-10-31 22:04:58 UTC
addr2line -i -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x1f420 0x3c4f9 0x41133 0x71a65 0x411ca
??:0
??:0
??:0
??:0
??:0
Comment 34 Chris Wilson 2013-11-01 09:30:34 UTC
Sigh. No debug symbols. Please compile with debug symbols and avoid stripping on installation.
Comment 35 Ognian Tenchev 2013-11-01 11:13:42 UTC
I built 82e6d41c2f4f343bd1854d3d8ee4b624b5d68971 with --enable-debug, but X doesn't even start.

The gdm login screen comes up, and then when the desktop is shown X crashes and the login screen is shown again. There is no information about the crash in Xorg.log.

With startx I see a glimpse of the desktop, then a black screen, and I can't even switch to the console. I have to reboot at this stage with magic SysRq because ctrl+alt+del does not work.

I will try again later to build with debug - I'm a little busy right now ... sorry
Comment 36 Chris Wilson 2013-11-01 11:22:33 UTC
Probably the assertion failures fixed with

commit 6b1a6f32179f7bff8503c6b8b38351a7cf1d08b7
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Nov 1 10:48:06 2013 +0000

    sna: Scale uses of aperture_mappable by PAGE_SIZE
    
    After converting aperture_mappable to count in pages, there were a few
    residual users expecting a byte count.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71117
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

which is also a good candidate to explain the recursion you hit.
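
The class of bug that commit message describes (a count kept in pages compared against a size in bytes) is easy to show in isolation. This is an illustrative sketch, not the driver code: the function names are hypothetical, and it assumes 4096-byte pages and that the whole 256 MiB aperture is mappable.

```python
PAGE_SIZE = 4096  # bytes per page (assumed)

def fits_mappable(size_bytes, aperture_mappable_pages):
    """Compare like with like: convert the page count to bytes first."""
    return size_bytes <= aperture_mappable_pages * PAGE_SIZE

def fits_mappable_buggy(size_bytes, aperture_mappable_pages):
    """The unit mix-up: a byte size compared directly against a page count."""
    return size_bytes <= aperture_mappable_pages

mappable_pages = 268435456 // PAGE_SIZE  # 256 MiB aperture, counted in pages
size = 35999744                          # an object of roughly 34 MiB

print(fits_mappable(size, mappable_pages))        # True: the object fits
print(fits_mappable_buggy(size, mappable_pages))  # False: wrongly judged too large
```

With the units mixed up, almost every object looks too large to map, which pushes normal operations onto fallback paths.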
Comment 37 Ognian Tenchev 2013-11-01 15:11:29 UTC
I updated to 5da329735ca79517a326aee002685bf33e8db861

The driver is now built with --enable-debug. X runs this time, but crashes after I open a couple of large images - not at the same time: open one, close the tab, open another one, close it, and so on. X crashed the first time on the first image, then after the third ... just at random.

The strange thing is that there is no error in Xorg.log.
Comment 38 Chris Wilson 2013-11-01 15:14:37 UTC
More likely an assertion failure in that case: check stderr, often captured in /var/log/gdm/:0.log or similar.
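
Assertion failures printed by glibc follow a fixed layout, so a captured stderr log can be scanned for them. A sketch under that assumption; the regular expression and the `find_assertions` helper are illustrative, and the sample line is the kgem.c assertion reported in this bug.

```python
import re

# glibc prints a failed assert() as:
#   <program>: <file>:<line>: <function>: Assertion `<expr>' failed.
ASSERT_RE = re.compile(
    r"^(?P<prog>[^:]+): (?P<file>.+?):(?P<line>\d+): (?P<func>\w+): "
    r"Assertion `(?P<expr>.*)' failed\.$"
)

def find_assertions(log_text):
    """Return one dict per assertion-failure line found in the log text."""
    matches = (ASSERT_RE.match(line) for line in log_text.splitlines())
    return [m.groupdict() for m in matches if m]

SAMPLE = """\
Loading extension GLX
X: kgem.c:333: __kgem_bo_map__gtt: Assertion `kgem_bo_can_map(kgem, bo)' failed.
"""

for hit in find_assertions(SAMPLE):
    print(hit["file"], hit["line"], hit["func"], hit["expr"])
```

Feeding it the contents of the gdm log (for example /var/log/gdm/:0.log) lists every assertion that fired, which helps keep separate failures apart.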
Comment 39 Ognian Tenchev 2013-11-01 15:26:19 UTC
last row from gdm log:
X: kgem.c:333: __kgem_bo_map__gtt: Assertion `kgem_bo_can_map(kgem, bo)' failed.
Comment 40 Chris Wilson 2013-11-01 15:31:34 UTC
Hmm, you can safely delete that assertion. Its purpose is to warn me of dubious mappings - the kernel will reject it if it truly is unmappable.
Comment 41 Ognian Tenchev 2013-11-01 15:34:52 UTC
I have no problem with that, but my X keeps crashing this way :) I can't find anything other than this:

:0.log.1:
Initializing built-in extension XFree86-VidModeExtension
Initializing built-in extension XFree86-DGA
Initializing built-in extension XFree86-DRI
Initializing built-in extension DRI2
Loading extension GLX
X: kgem.c:333: __kgem_bo_map__gtt: Assertion `kgem_bo_can_map(kgem, bo)' failed.

If I can do something more to help, I'm happy to :)
Comment 42 Chris Wilson 2013-11-01 16:03:27 UTC
Found one candidate that could trigger your assertion:

commit 6cb84c8d55f2f7cbb087a479c1dbc8bc58e97183
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Nov 1 15:57:56 2013 +0000

    sna: Guard the replace-with-xor fallback path
    
    Before attempting to map the destination for uploading into after a
    failure to use the BLT, we need to recheck that it is indeed mappable.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 43 Ognian Tenchev 2013-11-01 16:29:00 UTC
Sorry ... it crashes again with 6cb84c8d55f2f7cbb087a479c1dbc8bc58e97183

Now I see this in Xorg.log:
[  3714.087] batch[3/0]: 58 58 65528, nreloc=14, nexec=5, nfence=2, aperture=52570, fenced=32768, high=24576: errno=28
[  3714.087] exec[0] = handle:518, presumed offset: a255000, size: 35999744, tiling 1, fenced 1, snooped 0, deleted 0
[  3714.087] exec[1] = handle:516, presumed offset: 4000000, size: 35999744, tiling 1, fenced 1, snooped 0, deleted 0
[  3714.087] exec[2] = handle:520, presumed offset: 3e00000, size: 71663616, tiling 0, fenced 1, snooped 0, deleted 0
[  3714.087] exec[3] = handle:521, presumed offset: 2200000, size: 71663616, tiling 0, fenced 1, snooped 0, deleted 0
[  3714.087] exec[4] = handle:3, presumed offset: 6dd000, size: 4096, tiling 0, fenced 0, snooped 0, deleted 0
[  3714.087] reloc[0] = pos:16, target:0, delta:0, read:2, write:2, offset:a255000
[  3714.087] reloc[1] = pos:28, target:1, delta:0, read:2, write:0, offset:4000000
[  3714.087] reloc[2] = pos:48, target:2, delta:0, read:2, write:2, offset:3e00000
[  3714.087] reloc[3] = pos:60, target:0, delta:0, read:2, write:0, offset:a255000
[  3714.087] reloc[4] = pos:80, target:2, delta:0, read:2, write:2, offset:3e00000
[  3714.087] reloc[5] = pos:92, target:3, delta:0, read:2, write:0, offset:2200000
[  3714.088] reloc[6] = pos:112, target:2, delta:0, read:2, write:2, offset:3e00000
[  3714.088] reloc[7] = pos:124, target:3, delta:0, read:2, write:0, offset:2200000
[  3714.088] reloc[8] = pos:144, target:2, delta:0, read:2, write:2, offset:3e00000
[  3714.088] reloc[9] = pos:156, target:3, delta:0, read:2, write:0, offset:2200000
[  3714.088] reloc[10] = pos:176, target:2, delta:0, read:2, write:2, offset:3e00000
[  3714.088] reloc[11] = pos:188, target:3, delta:0, read:2, write:0, offset:2200000
[  3714.088] reloc[12] = pos:208, target:2, delta:0, read:2, write:2, offset:3e00000
[  3714.088] reloc[13] = pos:220, target:3, delta:0, read:2, write:0, offset:2200000
[  3714.088] Aperture size 268435456, available 244338688

I got rid of gdm and used startx. This time, after X crashed, everything went black. Nothing on screen, and no input works. I can't reboot with ctrl+alt+del. Just a pure black screen :)
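
As a quick sanity check on a dump like the one above, the object sizes in the exec[] lines can be added up and compared with the reported aperture. This is a sketch with regular expressions written against this particular log output; `summarize` is an illustrative name.

```python
import re

EXEC_RE = re.compile(r"exec\[\d+\] = handle:\d+, .*?size: (\d+), tiling \d+, fenced (\d+)")
APERTURE_RE = re.compile(r"Aperture size (\d+), available (\d+)")

def summarize(log_text):
    """Return (sum of exec object sizes, aperture size, aperture available) in bytes."""
    total = sum(int(m.group(1)) for m in EXEC_RE.finditer(log_text))
    aperture = APERTURE_RE.search(log_text)
    return total, int(aperture.group(1)), int(aperture.group(2))

# exec[] and aperture lines copied from the Xorg.log excerpt above.
SAMPLE = """\
[  3714.087] exec[0] = handle:518, presumed offset: a255000, size: 35999744, tiling 1, fenced 1, snooped 0, deleted 0
[  3714.087] exec[1] = handle:516, presumed offset: 4000000, size: 35999744, tiling 1, fenced 1, snooped 0, deleted 0
[  3714.087] exec[2] = handle:520, presumed offset: 3e00000, size: 71663616, tiling 0, fenced 1, snooped 0, deleted 0
[  3714.087] exec[3] = handle:521, presumed offset: 2200000, size: 71663616, tiling 0, fenced 1, snooped 0, deleted 0
[  3714.087] exec[4] = handle:3, presumed offset: 6dd000, size: 4096, tiling 0, fenced 0, snooped 0, deleted 0
[  3714.088] Aperture size 268435456, available 244338688
"""

total, aperture_size, available = summarize(SAMPLE)
print(total, aperture_size, available)
```

The raw sizes add up to less than the available figure, so a plain byte count does not account for errno=28 on its own; whatever placement or alignment constraints apply to these objects are not modelled here.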
Comment 44 Ognian Tenchev 2013-11-01 16:33:35 UTC
Ah, I forgot ... the images I opened were only half OK ... the lower half was distorted ... I will attach one here
Comment 45 Ognian Tenchev 2013-11-01 16:34:01 UTC
Created attachment 88491
half OK image
Comment 46 Chris Wilson 2013-11-01 21:36:47 UTC
Please always attach the Xorg.log with the crash info - even without the debug symbols available, I can often work out where the crash is likely to be, and it helps avoid treating multiple issues as one.
Comment 47 Ognian Tenchev 2013-11-01 21:41:10 UTC
Created attachment 88515
Xorg.log after crash
Comment 48 Chris Wilson 2013-11-01 21:56:59 UTC
That we encounter ENOSPC when submitting the batch is indeed worrying and needs to be resolved, but does it actually crash after the error? It should disable acceleration (losing the batch in the process and causing corruption) but it should work thereafter.
Comment 49 Ognian Tenchev 2013-11-01 22:04:49 UTC
I have encountered the black screen with nothing on it twice (this log is actually from the second black screen, not the first one that I reported earlier, but the log looks the same).

No mouse, and typing on the keyboard, for example to reboot or to restart X (I assume it has dropped to the console), does nothing. No ctrl+alt+backspace either. So I have to reboot with magic SysRq.

Maybe it's not a crash but a lockup ... I don't know what you would call it :)
Comment 50 Chris Wilson 2013-11-02 08:45:11 UTC
Ok, that sounds like you hit a page-fault-of-doom on the fallback path. The objects are just big enough for it to try mapping both of them at once, but it is unable to fit both simultaneously into the aperture. The result is that it has to swap both objects in and out of the aperture around every single byte. That is slow enough for the computer to appear to be unresponsive.

Sigh. It is meant to fall back to CPU mappings to prevent this.
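
The thrashing described here can be reproduced with a toy model: two objects accessed alternately through a window that can hold only one of them at a time. This is a simplified sketch with made-up sizes and a plain least-recently-used eviction rule, not the kernel's actual policy.

```python
from collections import OrderedDict

def count_faults(accesses, sizes, capacity):
    """Count misses when objects are bound into a fixed-size window with LRU eviction."""
    resident = OrderedDict()  # object name -> size, least recently used first
    faults = 0
    for name in accesses:
        if name in resident:
            resident.move_to_end(name)  # already bound: just mark as recently used
            continue
        faults += 1
        # Evict least recently used objects until the new one fits.
        while resident and sum(resident.values()) + sizes[name] > capacity:
            resident.popitem(last=False)
        resident[name] = sizes[name]
    return faults

sizes = {"src": 150, "dst": 150}  # arbitrary units
copy = ["src", "dst"] * 1000      # read a little, write a little, repeat

print(count_faults(copy, sizes, capacity=256))  # 2000: every access faults
print(count_faults(copy, sizes, capacity=512))  # 2: both stay resident
```

When both objects cannot be resident together, each access evicts the other object, so the cost of a fault is paid on every step of the copy.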
Comment 51 Ognian Tenchev 2013-11-02 19:17:38 UTC
Created attachment 88534
Xorg.log after crash

today Xorg.log after crash with 6cb84c8d55f2f7cbb087a479c1dbc8bc58e97183
Comment 52 Chris Wilson 2013-11-02 20:44:57 UTC
That's very unusual and very unexpected. Hmm, trying to retrace it locally gives an invalid line number. Is there no chance you can make gentoo install the debug symbols for the ddx (as you have for the Xserver)?
Comment 53 Ognian Tenchev 2013-11-02 22:36:05 UTC
(In reply to comment #52)
> That's very unusual and very unexpected. Hmm, trying to retrace it locally
> gives an invalid line number. Is there no chance you can make gentoo install
> the debug symbols for the ddx (as you have for the Xserver)?

ddx = xf86-video-intel? 

What option do I have to add to configure? This trace is from a driver built with --enable-debug.
Comment 54 Chris Wilson 2013-11-02 22:45:39 UTC
Yes, ddx here is xf86-video-intel. There is no specific option, the default cflags include the debug symbols. So all that needs to be done is be sure that those flags are not overridden and the driver is not stripped upon install. If you build by hand (e.g. ./autogen --prefix=/usr && make install) it should install the debug symbols. (Unless your /bin/install strips those by default...)

I've some patches being tested that should improve the earlier symptoms (ENOSPC). Watch this space.
Comment 55 Ognian Tenchev 2013-11-02 23:15:27 UTC
Ah, I see: "strip" ... OK, I built the driver without stripping it. It crashed twice, but Xorg.log has no information about the crash. I only found this in the gdm log:
X: /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-intel-9999/src/sna/sna_blt.c:441: sna_blt_copy_one: Assertion `(src_y + height) * blt->bo[0]->pitch <= kgem_bo_size(blt->bo[0])' failed.
Comment 56 Chris Wilson 2013-11-02 23:25:03 UTC
The ENOSPC issue should be fixed now. I need a full stacktrace or debug log to be able to work out how you hit that assertion though. :|
Comment 57 Ognian Tenchev 2013-11-02 23:28:27 UTC
... and I need to know how to make that log/trace :(
Comment 58 Ognian Tenchev 2013-11-02 23:38:53 UTC
Checked out 4a7217b05c232484a80abc7bd67494996dd32057

It crashed again the first time I opened a large image. Xorg.log is clean - no error message.

The gdm log ends with the same message, at a different line:
X: /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-intel-9999/src/sna/sna_blt.c:500: sna_blt_copy_one: Assertion `(src_y + height) * blt->bo[0]->pitch <= kgem_bo_size(blt->bo[0])' failed.

I will attach full log here
Comment 59 Ognian Tenchev 2013-11-02 23:39:13 UTC
Created attachment 88540
gdm log
Comment 60 Ognian Tenchev 2013-11-03 18:50:32 UTC
I have another one:
X: /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-intel-9999/src/sna/sna_accel.c:3735: sna_pixmap_move_to_gpu: Assertion `priv->gpu_bo->proxy == ((void *)0)' failed.

This happened twice when I visited a site which I believe has no big images on it. Sometimes I can see the web page ... and sometimes X crashes :)
Comment 61 Chris Wilson 2013-11-04 10:24:20 UTC
Ugh. Do you have a list of specific webpages that cause the most trouble on your machines? Not sure if I have a surviving 945gm, most of my gen3 are g33/pnv which have a different aperture and not quite so limited, or as likely to hit the same paths as your machine. :|
Comment 62 Ognian Tenchev 2013-11-04 10:50:41 UTC
Actually, X has crashed this way on only one page (twice), when I visited it with Firefox:
http://www.f1fanatic.co.uk/2013/11/03/no-penalty-for-alonso-over-vergne-incident/

error was:
X: /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-intel-9999/src/sna/sna_accel.c:3735: sna_pixmap_move_to_gpu: Assertion `priv->gpu_bo->proxy == ((void *)0)' failed.

I don't have any trouble with other pages, only with big images (which I open in a new tab in Firefox). Then the error is:
X: /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-intel-9999/src/sna/sna_blt.c:500: sna_blt_copy_one: Assertion `(src_y + height) * blt->bo[0]->pitch <= kgem_bo_size(blt->bo[0])' failed.

I'm not sure if it is only a driver problem, or a Firefox problem ...
Comment 63 Chris Wilson 2013-11-04 11:15:28 UTC
X dying here is a problem with xf86-video-intel. Afaict everything is ok atm on my gen2 devices (which are more memory & aperture constrained than your gen3) and on which I was proving the aperture space checks. As I try to reproduce this locally, can you please see if you can reproduce the issues with full debugging enabled (it will dramatically slow X down and generate voluminous log files) with ./configure --enable-debug=full (I think there is also a USE option if you prefer)? The log files will be massive, but should compress well with xz.
Comment 64 Ognian Tenchev 2013-11-04 11:43:12 UTC
Do you need the whole gdm and Xorg logs?
The gdm log is 372MB uncompressed, 8MB compressed
The Xorg log is 8.2MB uncompressed, 250KB compressed
Comment 65 Chris Wilson 2013-11-04 11:53:14 UTC
The last 1000 lines should be enough... I hope.
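
Cutting a very large debug log down to its tail and compressing it with xz can be done in one step. A sketch using only the Python standard library; `tail_to_xz` and the file names in the usage note are illustrative.

```python
import lzma
from collections import deque

def tail_to_xz(src_path, dst_path, lines=1000):
    """Write the last `lines` lines of src_path to an xz-compressed dst_path."""
    with open(src_path, "rb") as src:
        # A bounded deque keeps only the final `lines` lines in memory,
        # so a log of several hundred MB is streamed rather than loaded whole.
        tail = deque(src, maxlen=lines)
    with lzma.open(dst_path, "wb") as dst:
        dst.writelines(tail)
    return len(tail)
```

For example, `tail_to_xz("/var/log/gdm/:0.log", "gdm-tail.log.xz")` would produce a small file suitable for attaching; the return value is the number of lines actually written.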
Comment 66 Ognian Tenchev 2013-11-04 11:55:03 UTC
Created attachment 88601
gdm log with full debug
Comment 67 Ognian Tenchev 2013-11-04 11:55:54 UTC
Created attachment 88602
Xorg log with full debug
Comment 68 Chris Wilson 2013-11-04 12:59:42 UTC
Hmm, I think that should explain the half-ok image:

commit 4734354209897448af61b7c3fcb35ef1ced8b11f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Nov 4 12:57:01 2013 +0000

    sna: Apply the BLT source offset for individual copies
    
    Folloinw a complex path through multiple layers of indirections and
    tiling fallbacks, resulted in hitting a path where the source offset was
    subsequently ignored. This leads to the operation reading from invalid
    memory (or hitting the assert warning about the same).
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks a lot for the traces. That leaves the proxy assert still unresolved, but can you please check if that at least fixes one issue?
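
The sna_blt_copy_one assertion quoted earlier in this bug is a bounds check on the source rows. The arithmetic can be sketched as follows; the pitch and row count are hypothetical values chosen for illustration, not taken from the logs.

```python
def copy_in_bounds(src_y, height, pitch, bo_size):
    """Same arithmetic as the assertion: the last source row must end inside the buffer."""
    return (src_y + height) * pitch <= bo_size

pitch = 8192            # bytes per row (hypothetical)
rows = 1024             # rows in the source buffer (hypothetical)
bo_size = pitch * rows

print(copy_in_bounds(512, 512, pitch, bo_size))  # True: rows 512..1023 fit
print(copy_in_bounds(768, 512, pitch, bo_size))  # False: rows 768..1279 run past the end
```

A source position that is off by some offset, as the commit message describes, makes the copy read from the wrong rows; with assertions compiled in, this check fires when that range runs past the end of the buffer.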
Comment 69 Ognian Tenchev 2013-11-04 13:53:44 UTC
hm ... I can't see that commit. Last commit which I can see is: 82b646a42f5a6271c8518ad454f1603714276caf
Comment 70 Chris Wilson 2013-11-04 13:57:16 UTC
Sorry, forgot to push. Should be there now.
Comment 71 Ognian Tenchev 2013-11-04 14:13:56 UTC
OK, I have good and bad news :)
The good news is that X is still running after I opened maybe 10 or more large images. The bad news is that the corruption (i.e. menus in LibreOffice) is still there.
I will try to reproduce the crash on the f1fanatic web site now. The problem is that yesterday it didn't crash every time ...
Comment 72 Chris Wilson 2013-11-04 14:51:41 UTC
The libreoffice menu corruption is the same as https://bugs.freedesktop.org/attachment.cgi?id=88252 and https://bugs.freedesktop.org/attachment.cgi?id=88253 ?
Comment 73 Ognian Tenchev 2013-11-04 15:28:09 UTC
(In reply to comment #72)
> The libreoffice menu corruption is the same as
> https://bugs.freedesktop.org/attachment.cgi?id=88252 and
> https://bugs.freedesktop.org/attachment.cgi?id=88253 ?

yes
Comment 74 Ognian Tenchev 2013-11-04 19:46:12 UTC
I have a question.

Can kernel update from 3.12-rc7 to 3.12 correct corruption? Or can it be something between kernel and driver to cause corruption?

I'm still running driver from commit 8f6e227ba8127a2ca034271f2a660c24abbe056f, which before kernel upgrade and reboot produce corruption after I open large image. 

Now I can't reproduce this. After opening couple of large images (one by one - open one, close it, open another one, close it) there is no corruption.

But if I open let's  say 5 or 6 large images, X freeze and then is killed with out of memory :)
Comment 75 Ognian Tenchev 2013-11-04 19:46:40 UTC
Created attachment 88647
gdm log after X oom
Comment 76 Chris Wilson 2013-11-04 21:05:05 UTC
(In reply to comment #74)
> I have a question.
> 
> Can kernel update from 3.12-rc7 to 3.12 correct corruption? Or can it be
> something between kernel and driver to cause corruption?

3.12-rc7 to 3.12, not that I am aware of - I didn't send any fixes.
 
> I'm still running driver from commit
> 8f6e227ba8127a2ca034271f2a660c24abbe056f, which before kernel upgrade and
> reboot produce corruption after I open large image. 
> 
> Now I can't reproduce this. After opening couple of large images (one by one
> - open one, close it, open another one, close it) there is no corruption.

That sounds like an issue with memory fragmentation for a long running system, would be my first guess. And difficult to test.
 
> But if I open let's  say 5 or 6 large images, X freeze and then is killed
> with out of memory :)

Meh, still hitting recursion. Any chance you can capture that with debug symbols?
Does 'addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so -i 0x1f62d' resolve to anything useful?
Comment 77 Ognian Tenchev 2013-11-05 02:58:42 UTC
(In reply to comment #76)
> (In reply to comment #74)
> > I'm still running driver from commit
> > 8f6e227ba8127a2ca034271f2a660c24abbe056f, which before kernel upgrade and
> > reboot produce corruption after I open large image. 
> > 
> > Now I can't reproduce this. After opening couple of large images (one by one
> > - open one, close it, open another one, close it) there is no corruption.
> 
> That sounds like an issue with memory fragmentation for a long running
> system, would be my first guess. And difficult to test.

It was up for no more than a day, but I guess it was "damaged" by the old driver which crashed X. Or something like that, maybe :)

> > But if I open let's  say 5 or 6 large images, X freeze and then is killed
> > with out of memory :)
> 
> Meh, still hitting recursion. Any chance you can capture that with debug
> symbols?
> Does 'addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so -i 0x1f62d'
> resolve to anything useful?

Unfortunately no (:?) ... the driver is not stripped and was built with debug, but not with full debug. I will build it with full debug and post the last messages here like before.
Comment 78 Ognian Tenchev 2013-11-05 04:25:56 UTC
Full logs from before X got killed by OOM

GDM log:
http://www.jeckyll.net/gdm.log.xz

Xorg.log
http://www.jeckyll.net/Xorg.0.log.xz
Comment 79 Chris Wilson 2013-11-05 09:25:55 UTC
This should prevent it from taking the recursive path in the first place:

commit dc61705a6e425952de4c81c2320382af07cf948a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 5 08:49:28 2013 +0000

    sna: Use an inplace exchange for large untiled BO
    
    On older architectures, large BO have to be untiled and so we can reuse
    an existing CPU bo by adjusting its caching mode.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

and this should fix the recursion:

commit f3225fcb38686f3b9701725bf3a11ecf1c100c3f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 5 08:38:22 2013 +0000

    sna: Be move conservative with tiling sizes for older fenced gen
    
    The older generations have stricter requirements for alignment of fenced
    GPU surfaces, so accommodate this by reducing our estimate available
    space for the temporary tile.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 80 Ognian Tenchev 2013-11-05 11:28:12 UTC
Created attachment 88685 [details]
Xorg log after update to dc61705a6e425952de4c81c2320382af07cf948a

X was again killed by OOM (I think), but this time gdm couldn't recover and I had to reboot, so I'm not absolutely sure it was OOM.
Comment 81 Ognian Tenchev 2013-11-05 11:29:13 UTC
Created attachment 88686 [details]
gdm log after update to dc61705a6e425952de4c81c2320382af07cf948a
Comment 82 Ognian Tenchev 2013-11-05 11:37:11 UTC
Also, after the update to dc61705a6e425952de4c81c2320382af07cf948a I started to see letters that are missing, distorted, or replaced with underscores, but I can't take a screenshot for now because they get repainted ... I will try harder! :)
Comment 83 Ognian Tenchev 2013-11-05 13:37:40 UTC
Created attachment 88691 [details]
firefox titlebar

Distorted; after a couple of seconds, or after moving the mouse over it, it is restored correctly.
Comment 84 Ognian Tenchev 2013-11-05 13:46:15 UTC
Created attachment 88692 [details]
wrong image

You can see the wrong image on the left and, in the red circle, the correct image. I have to right-click on the wrong image and select "View Image"; it is shown wrong again, and then I have to click reload to see the actual image. I can also see a couple of distortions, i.e. "skewed" images and other funny "effects" :)
And I'm not sure whether it is a Firefox problem or not. It is Firefox 25 from their site so ...
Text is sometimes also not displayed properly, but it is impossible for me to take a screenshot: it is redrawn every time I bring up a terminal or the gnome-screenshot window :)
Comment 85 Ognian Tenchev 2013-11-05 13:49:16 UTC
Created attachment 88693 [details]
top of the image is displayed wrong
Comment 86 Chris Wilson 2013-11-05 18:38:59 UTC
But definitely not oom now? Minor victories!

Now I suspect that the transformation from CPU to GPU is not entirely cache coherent. I've patched a couple of bugs in the ddx:

commit 723f17ca4f9c120be5fe667bf2c3e35c7ee687be
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 5 18:36:45 2013 +0000

    sna: Submit execution on the bo before changing its caching status
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

commit 10b573c5084cabcc1bae70c8d35311fa5ec0a245
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 5 18:29:46 2013 +0000

    sna: Clear snoop flag after converting from a CPU bo
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

if that doesn't fix things, there is also a possibility of a kernel bug.
Comment 87 Ognian Tenchev 2013-11-05 18:43:54 UTC
In comment 80 X was killed by OOM, in my view ... but if you see anything different in the logs, OK.

I will test these commits later since I have to finish some work ... sorry about that
Comment 88 Ognian Tenchev 2013-11-05 22:22:19 UTC
I have to be the most annoying man on earth ... but X crashed again:
X: /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-intel-9999/src/sna/sna_accel.c:1990: _sna_pixmap_move_to_cpu: Assertion `(flags & 0x2) == 0 || priv->cpu_damage == ((void *)0)' failed.

This is with 723f17ca4f9c120be5fe667bf2c3e35c7ee687be commit.

And these are full debug logs:
http://www.jeckyll.net/X/201311060018/Xorg.0.old.xz
http://www.jeckyll.net/X/201311060018/gdm.log.xz

I'm still not sure whether the last commits fix the corruption - I have not seen any so far, but I will confirm that later.
Comment 89 Chris Wilson 2013-11-06 09:07:18 UTC
Fixed the assertion failure - that could also have caused some corruption with debug disabled.

commit f2f9019bae5f6f03b5e23da759d3871fc18dd9f4
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 5 22:41:06 2013 +0000

    sna: Only operate inplace if no existing CPU damage for a read


Hopefully,

commit ef842d2ceee4d1ccf8a0f8a81530dc8be8e18b44
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Nov 6 08:56:01 2013 +0000

    sna: Be more pessimistic for tiling sizes on older gen

is the right fix for the oom.
Comment 90 Ognian Tenchev 2013-11-06 10:47:29 UTC
Created attachment 88744 [details]
large image scaled with distortion

OK, I have good news.

Updated to c3d5b1d8fcb1b65c35827d38bf5b309e433d0907.

1. No X crash after opening and closing maybe 20 or more large images.

2. No X lockup or OOM after opening 12 large images at the same time. X RES memory climbs to 500MB+, but after closing the images and waiting about 10-15 seconds, RES memory is back to around 150MB.

3. Sometimes, let's say for 3 of 12 images, I can see distortion on the scaled image (see the attached image). With no zoom the images are OK. Sometimes an image is displayed entirely in black for a moment before it is fully opened, and then it is displayed OK once it is fully loaded. This could be a Firefox issue. I use Firefox 24 from Gentoo portage. I will test with 25 from the Mozilla FTP site after a while.
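
The RES figures in point 2 can also be sampled from a script instead of being read off top. A minimal sketch, assuming a Linux /proc filesystem; the helper names rss_kib and read_rss_kib are made up for illustration, and the X server's PID has to be supplied by the caller.

```python
import re


def rss_kib(status_text):
    """Return the VmRSS value (in KiB) found in /proc/<pid>/status text, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def read_rss_kib(pid):
    """Read the resident set size of a running process from /proc."""
    with open(f"/proc/{pid}/status") as f:
        return rss_kib(f.read())


if __name__ == "__main__":
    # Illustrative status text, not taken from the reporter's machine.
    sample = "Name:\tXorg\nVmRSS:\t  153600 kB\nThreads:\t1\n"
    print(rss_kib(sample))
```

Polling read_rss_kib() every few seconds while opening and closing images shows whether the value returns to its baseline, as it does above, or keeps growing.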
Comment 91 Chris Wilson 2013-11-06 10:55:35 UTC
Whilst testing can you keep assertions enabled? Hopefully that will help to catch these errors earlier. 

Thanks for your patience - let's hope this is the light at the end of the tunnel.
Comment 92 Ognian Tenchev 2013-11-06 13:02:05 UTC
"Whilst testing can you keep assertions enabled?" how?

Firefox 25 is not playing nice with me. X was again killed by OOM.
full debug:
http://www.jeckyll.net/X/201311061457/Xorg.0.log.old.xz
http://www.jeckyll.net/X/201311061457/gdm.log.xz

It is strange because 24 was OK ... I will try to build 25 tonight on my PC and will test again, maybe tomorrow.
Comment 93 Chris Wilson 2013-11-06 13:05:38 UTC
(In reply to comment #92)
> "Whilst testing can you keep assertions enabled?" how?

Default to using --enable-debug when you can risk X crashing.

> Firefox 25 it now playing nice with me. X again was killed with OOM.
> full debug:
> http://www.jeckyll.net/X/201311061457/Xorg.0.log.old.xz
> http://www.jeckyll.net/X/201311061457/gdm.log.xz

Thanks. :|
Comment 94 Chris Wilson 2013-11-06 15:06:28 UTC
I've pushed yet another workaround to try and prevent falling down into the blackhole - but I haven't spotted exactly what is wrong there, though oddities abound.

commit ae380a960df6b3a9714d78eb6cb42249764488ba
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Nov 6 14:51:42 2013 +0000

    sna: Use tiling BLT fallback for BLT composite operations
    
    This avoid a circuituous route through the render pathways and multiple
    levels of tiling fallbacks to accomplish the same copy.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

and

commit 7578809ddcb244ad78ebf86359b7ee2a61e27ff6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Nov 6 13:42:27 2013 +0000

    sna: Trim create flags if tiled sizes are too large
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

commit 073465817f54507ab6b7f801c5dfab2c06f678c0
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Nov 6 13:41:39 2013 +0000

    sna: Fences are power-of-two sizes
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
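
To illustrate why power-of-two fences matter for the tiling size estimates, here is a rough sketch of the rounding. It is not the driver's code: the function name fence_size and the 1 MiB minimum are assumptions chosen for the example, and the real minimum depends on the hardware generation.

```python
def fence_size(size_bytes, min_fence=1 << 20):
    """Round a surface size up to the next power-of-two fence region.

    min_fence is an assumed minimum region size (1 MiB here, for illustration).
    """
    fence = min_fence
    while fence < size_bytes:
        fence <<= 1
    return fence


if __name__ == "__main__":
    five_mib = 5 * 1024 * 1024
    # A 5 MiB surface ends up occupying an 8 MiB fenced region.
    print(fence_size(five_mib) // (1024 * 1024))
```

Because a surface just over a power of two takes nearly twice its size in fenced space, any estimate of how much room is left for a temporary tile has to be made against the rounded size, not the raw one.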
Comment 95 Ognian Tenchev 2013-11-06 15:36:20 UTC
Created attachment 88764 [details]
gdm log after X crash

This was a fast crash :)

Update to 7a9c1e153a9208e8cd7680e478fde18e051beaa9, restart X, open first large image - crash :)

X: /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-intel-9999/src/sna/sna_tiling.c:980: sna_tiling_blt_composite: Assertion `op->op == 1' failed.

full logs to follow shortly
Comment 97 Chris Wilson 2013-11-06 15:45:26 UTC
(In reply to comment #95)
> Created attachment 88764 [details]
> gdm log after X crash
> 
> This was a fast crash :)
> 
> Update to 7a9c1e153a9208e8cd7680e478fde18e051beaa9, restart X, open first
> large image - crash :)
> 
> X:
> /mnt/storage/tmp/portage/x11-drivers/xf86-video-intel-9999/work/xf86-video-
> intel-9999/src/sna/sna_tiling.c:980: sna_tiling_blt_composite: Assertion
> `op->op == 1' failed.

Invalid assertion deleted.
Comment 98 Ognian Tenchev 2013-11-06 16:02:31 UTC
Created attachment 88765 [details]
no full image displayed

X is still up! :) X doesn't OOM! :)

OK, but now images are displayed only half, or a big portion of them is missing. Strangely, this is more likely to happen on Firefox 25. Firefox 24 manages to display them more often, but it fails too ...
Comment 99 Ognian Tenchev 2013-11-06 16:03:07 UTC
Created attachment 88766 [details]
another part of image is missing
Comment 100 Ognian Tenchev 2013-11-06 16:03:40 UTC
Created attachment 88767 [details]
and some black rectangles
Comment 101 Chris Wilson 2013-11-06 16:11:36 UTC
Can you please send me a full debug log of loading a few images? If you can capture a failure that would be useful as well - but not essential as I don't expect that it will be immediately apparent in the logs, except perhaps as notable by its absence.
Comment 102 Ognian Tenchev 2013-11-06 16:15:27 UTC
Sure. I will do full debug log later, but how can I capture failure?
Comment 103 Chris Wilson 2013-11-06 16:18:53 UTC
(In reply to comment #102)
> Sure. I will do full debug log later, but how can I capture failure?

Just a debug log from the session containing the rendering error. Not essential, but it might have a clue. At the moment, I just want to check through and make sure that the code does actually behave the way I think it should... As you have probably noticed, these paths are quite tricky as we are trying to work on images larger than the GPU can natively handle.
Comment 104 Ognian Tenchev 2013-11-06 17:43:48 UTC
Full debug logs:
http://www.jeckyll.net/X/201311061908/Xorg.log.xz
http://www.jeckyll.net/X/201311061908/gdm.log.xz

BUT ... maybe because X with full debug is slow, the images are displayed correctly.

Actually I can see how they are drawn: first the top half of an image, then some delay, and then the bottom half is displayed.

With no debug (and fast X) images are drawn in a similar manner, but some portions are just not drawn. These portions are often under the mouse cursor, and if I move the mouse over an image while it is still loading there is a big chance of a portion of the image going missing.

With full debug I tried moving the mouse over the image and opening the next image quickly while the first was still loading, but couldn't reproduce the missing portion :(

I tried with both Firefox 25 and Firefox 24 and couldn't reproduce the same black missing portions of the image with full debug.

Then I restarted X and made another try, and again couldn't reproduce it. The logs above are from the first attempt.

These are from the second attempt:
http://www.jeckyll.net/X/201311061908/Xorg2.log.xz
http://www.jeckyll.net/X/201311061908/gdm2.log.xz
Comment 105 Chris Wilson 2013-11-07 22:02:40 UTC
Interesting: we don't hit the new tiling shortcut paths in the full-debug logs, though due to the earlier assertion we know that they do get utilized. So either there is a residual resource issue that causes transient rendering to be dropped, or the new paths have a bug. I consider both quite likely.
Comment 106 Chris Wilson 2013-11-08 09:57:34 UTC
One very minor tweak as I noticed in your traces that upload buffers were being retained for longer than intended:

commit 84d667b94a97ad5fde68d730d57a19e1f4241ed5
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Nov 8 08:53:55 2013 +0000

    sna: Always schedule upload buffers for retirement after use
    
    Even if they are multiply referenced due to cached references.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 107 Ognian Tenchev 2013-11-08 17:43:11 UTC
Created attachment 88902 [details]
orphaned symbols on Geany editor

I have run b796c33411218aeaf4daaeff41a1bc442b5f945f for some time now and have to say that it doesn't crash and doesn't OOM.

I had image corruption, but I made a mistake and was running kernel 3.10, which I thought was 3.12 (a grub config error on my side) :(

So when I asked what changed between 3.12-rc7 and 3.12, I was actually running 3.10 and not 3.12!

I found the mistake just this morning. So today I was running 3.12 all day and I didn't see image corruption.

I only found orphaned symbols in the Geany editor. You can see the screenshot. If you are interested I can try to make full debug logs again with the Geany editor. Maybe this time I can reproduce the corruption.

Now I updated to abf1a16914d993cc150005879375d4bb17fdccf3 - still orphaned symbols in the Geany editor.
Comment 108 Chris Wilson 2013-11-08 17:55:09 UTC
(In reply to comment #107)
> Created attachment 88902 [details]
> orphaned symbols on Geany editor
> 
> I have run b796c33411218aeaf4daaeff41a1bc442b5f945f for some time now and
> have to say that it don't crash and don't OOM. 
> 
> I have image corruption, but I make a mistake and run kernel 3.10 which I
> tough is 3.12 (grub config error by my side) :( 
> 
> So when I ask what changed between 3.12-rc7 and 3.12 actually I was running
> 3.10 and not 3.12! 
> 
> I found mistake just this morning. So today all day I was running 3.12 and I
> don't see image corruption. 

Let me just clarify:

3.10 - corruption
3.12 - no corruption

There was definitely one major corruption fixed in late 3.10 / 3.11. So I think you just rediscovered that bug.


> I just find orphaned symbols in Geany editor. You can see Screenshot. If you
> are interested I can try to make again full debug logs with Geany editor.
> May be this time I can simulate corruption.
> 
> Now I update to abf1a16914d993cc150005879375d4bb17fdccf3 - still orphaned
> symbols in Geany editor.

Is this 3.12 or 3.10?
Comment 109 Ognian Tenchev 2013-11-08 22:47:56 UTC
(In reply to comment #108)
> (In reply to comment #107)
> > Created attachment 88902 [details]
> > orphaned symbols on Geany editor
> > 
> > I have run b796c33411218aeaf4daaeff41a1bc442b5f945f for some time now and
> > have to say that it don't crash and don't OOM. 
> > 
> > I have image corruption, but I make a mistake and run kernel 3.10 which I
> > tough is 3.12 (grub config error by my side) :( 
> > 
> > So when I ask what changed between 3.12-rc7 and 3.12 actually I was running
> > 3.10 and not 3.12! 
> > 
> > I found mistake just this morning. So today all day I was running 3.12 and I
> > don't see image corruption. 
> 
> Let me just clarify:
> 
> 3.10 - corruption
> 3.12 - no corruption
> 
> There was definitely one major corruption fixed in late 3.10 / 3.11. So if I
> think you just rediscovered that bug.

Yes

So far 3.12 shows images correctly. The half-displayed images and black rectangles on them were with 3.10. I feel really stupid about this mistake on my side ...

> > I just find orphaned symbols in Geany editor. You can see Screenshot. If you
> > are interested I can try to make again full debug logs with Geany editor.
> > May be this time I can simulate corruption.
> > 
> > Now I update to abf1a16914d993cc150005879375d4bb17fdccf3 - still orphaned
> > symbols in Geany editor.
> 
> Is this 3.12 or 3.10?

3.12, but these symbols were here in 3.10 also.
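
Since the confusion came from booting a different kernel than intended, a quick check of the running kernel before each test run may help. A minimal sketch; the helper name kernel_version is made up for illustration, and it only parses the leading major.minor of the release string.

```python
import platform
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '3.12.0-gentoo'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    return (int(m.group(1)), int(m.group(2))) if m else None


if __name__ == "__main__":
    # platform.release() reports the release of the kernel that is actually running.
    running = kernel_version(platform.release())
    print("running kernel:", running)
    if running is not None and running < (3, 12):
        print("warning: older than 3.12, results may include the old corruption")
```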
Comment 110 Chris Wilson 2013-11-11 14:18:12 UTC
As a rough guess the geany issue is the bug 71191 oddity. So assuming that everything else is finally working, let's concentrate the remaining discussion there.

