Bug 42718 - Audacious' track-info text garbled after resizing window
Summary: Audacious' track-info text garbled after resizing window
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: Chris Wilson
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-08 10:45 UTC by Clemens Eisserer
Modified: 2012-01-23 03:24 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
screenshot (23.32 KB, image/png)
2011-11-08 10:45 UTC, Clemens Eisserer
no flags Details
screenshot with manual flush (329.63 KB, image/png)
2011-11-08 13:48 UTC, Clemens Eisserer
no flags Details
audacious text with manual flush (5.29 KB, image/png)
2011-11-08 13:59 UTC, Clemens Eisserer
no flags Details
small stand-alone testcase (1.36 KB, text/plain)
2012-01-20 14:04 UTC, Clemens Eisserer
no flags Details

Description Clemens Eisserer 2011-11-08 10:45:25 UTC
Created attachment 53301 [details]
screenshot

When resizing Audacious main window, sometimes the track info text is garbled, as shown on the screenshot.

This is a bit hard to trigger, on my system I saw it after switching virtual desktops, but it seems reproduceable when resizing audacious' main window, probably minimizing/maximizing it a few times.

intel i945GM
intel-git
libdrm-2.4.27
pixman-0.23.8
linux-3.1.0
xorg 1.11.1
Comment 1 Chris Wilson 2011-11-08 11:06:52 UTC
Tried a few times, nothing yet. I would say that that was uninitialised garbage in the temporary pixmap for the string. But not sure why or how.
Comment 2 Clemens Eisserer 2011-11-08 11:30:58 UTC
ah, it seems this has something to do with the fade in/out effect of the text.

1. After starting audacious, and clicking on a file everthing works fine.
2. Now clicking another file, results in a fade effect of the info text
3. When resizing the window now, results in garbled text

If I now choose "View->Show info bar" twice, to I get a "fresh" info bar the problem doesn't occur until the fade effect was triggered.
Comment 3 Clemens Eisserer 2011-11-08 11:31:39 UTC
"clicking on a file" should have been "double clicking on a file" so that audacity starts to play it.
Comment 4 Chris Wilson 2011-11-08 11:53:53 UTC
Clemens, thanks for narrowing down the reproduction steps. I'm still not seeing the garbage, but at least now I can start looking at the steps leading up to the corruption.
Comment 5 Chris Wilson 2011-11-08 11:59:12 UTC
Xfce4 compositing on or off? (tried both)
Comment 6 Clemens Eisserer 2011-11-08 12:25:35 UTC
Composition doesn't seem to influence it, happens with compositor on and off.

Recorded a video, no idea wether its helpful: http://youtu.be/5aczJYh1Fv0
Comment 7 Chris Wilson 2011-11-08 12:33:18 UTC
Ta, that was quite useful actually. It tells me the bug is persistent and occurs every time that string is re-rendered. It should be easier to spot...

My first guess is that it is a flushing issue.

Either of

diff --git a/src/sna/sna.h b/src/sna/sna.h
index d79e9a8..e3e107e 100644
--- a/src/sna/sna.h
+++ b/src/sna/sna.h
@@ -96,8 +96,8 @@ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #define DEBUG_NO_BLT 0
 #define DEBUG_NO_IO 0
 
-#define DEBUG_FLUSH_CACHE 0
-#define DEBUG_FLUSH_BATCH 0
+#define DEBUG_FLUSH_CACHE 1
+#define DEBUG_FLUSH_BATCH 1
 #define DEBUG_FLUSH_SYNC 0
 
 #define TEST_ALL 0

should be enough to test this theory.
Comment 8 Clemens Eisserer 2011-11-08 12:41:13 UTC
Jep, DEBUG_FLUSH_BATCH 1 solves the problem, whereas DEBUG_FLUSH_CACHE 1 doesn't help:

DEBUG_FLUSH_CACHE 1
DEBUG_FLUSH_BATCH 1 -> works

DEBUG_FLUSH_CACHE 1
DEBUG_FLUSH_BATCH 0 -> garbled
Comment 9 Chris Wilson 2011-11-08 13:05:24 UTC
Hmm.

diff --git a/src/sna/gen3_render.c b/src/sna/gen3_render.c
index c8ad209..935ca9d 100644
--- a/src/sna/gen3_render.c
+++ b/src/sna/gen3_render.c
@@ -2487,6 +2487,7 @@ gen3_render_composite(struct sna *sna,
                           NULL))
                kgem_submit(&sna->kgem);
 
+       kgem_emit_flush(&sna->kgem);
        if (kgem_bo_is_dirty(tmp->src.bo) || kgem_bo_is_dirty(tmp->mask.bo)) {
                if (tmp->src.bo == tmp->dst.bo || tmp->mask.bo == tmp->dst.bo) {
                        kgem_emit_flush(&sna->kgem);

I actually expect that to have no effect (if it does it means DEBUG_FLUSH_CACHES isn't as effective as I thought). In which case I'm actually back to square one, as flushing the batch alone suggests it can be a bug in many different hiding places.
Comment 10 Clemens Eisserer 2011-11-08 13:43:30 UTC
strange, with the explicit flush and DEBUG_FLUSH_CACHE=1 I now get corruptions all over the place - very similar to those seen only in audacious previously.
However it could be there is some kind of timing involved here, sometimes the corruptions are all over the place and sometimes everything is fine.
Screenshot attached.
Comment 11 Clemens Eisserer 2011-11-08 13:48:20 UTC
Created attachment 53306 [details]
screenshot with manual flush
Comment 12 Clemens Eisserer 2011-11-08 13:59:23 UTC
With explicit flushing and DEBUG_FLUSH_CACHE=0 I don't get full-screen corruptions, however with audacious now the corruption can be seen (sometimes) for the other strings too.
Comment 13 Clemens Eisserer 2011-11-08 13:59:52 UTC
Created attachment 53307 [details]
audacious text with manual flush
Comment 14 Chris Wilson 2011-11-08 14:19:53 UTC
As you can probably guess, I didn't expect that!

An aside, are you using a bitmap font (a1 glyphs)? Does changing glyph filtering (none, grayscale or LCD subpixel), or the font itself make a difference.

I'm wondering if this is not in fact related to the 1bpp rendering issue from earlier. :|
Comment 15 Clemens Eisserer 2011-11-08 14:21:54 UTC
No, I am using TT-Fonts with LCD antialiasing enabled.
I try a few settings and report back soon.
Comment 16 Clemens Eisserer 2011-11-08 14:24:53 UTC
yes, antialiasing settings do have an effect.

I get the garbled text rendering only with lcd aa, greyscale aa works as expected, as does font rendering without aa.
Comment 17 Chris Wilson 2011-11-08 14:38:46 UTC
I wonder then if you need:

diff --git a/src/sna/gen3_render.c b/src/sna/gen3_render.c
index c8ad209..dc07448 100644
--- a/src/sna/gen3_render.c
+++ b/src/sna/gen3_render.c
@@ -1452,6 +1452,8 @@ static void gen3_magic_ca_pass(struct sna *sna,
        DBG(("%s(%d)\n", __FUNCTION__,
             sna->render.vertex_index - sna->render.vertex_start));
 
+       OUT_BATCH(_3DSTATE_MODES_5_CMD | PIPELINE_FLUSH_RENDER_CACHE);
+
        OUT_BATCH(_3DSTATE_LOAD_STATE_IMMEDIATE_1 | I1_LOAD_S(6) | 0);
        OUT_BATCH(gen3_get_blend_cntl(PictOpAdd, TRUE, op->dst.format));
        gen3_composite_emit_shader(sna, op, PictOpAdd);

or maybe the very heavy hammer of OUT_BATCH(MI_FLUSH);
I hope all that is required is just a non-pipelined op, in which case simply emitting the drawrect again will be enough. However, I suspect that if we need it here we need a flush before every blend control update...
Comment 18 Clemens Eisserer 2011-11-08 14:46:37 UTC
unfourtunatly neither OUT_BATCH(MI_FLUSH) nore OUT_BATCH(_3DSTATE_MODES_5_CMD | PIPELINE_FLUSH_RENDER_CACHE) did work :(
Comment 19 Clemens Eisserer 2011-11-22 13:49:15 UTC
just tested recent git with linux-3.1.1 and the problem seems to be gone :)
Comment 20 Chris Wilson 2011-11-23 03:09:34 UTC
I haven't intentionally fixed this, so I expect it will reappear when we least expect it. I'm still wondering if there might be some link with the missing text in skype.
Comment 21 Clemens Eisserer 2011-12-13 12:36:56 UTC
jep, it appeared quite a few times - seems to depend on the length of the string rendered.
Comment 22 Clemens Eisserer 2012-01-20 14:04:59 UTC
Created attachment 55871 [details]
small stand-alone testcase
Comment 23 Clemens Eisserer 2012-01-20 14:06:21 UTC
I was able to reduce the problem to a small stand-alone gtk3/cairo based testcase.
Resizing the resulting window immediatly leads to garbled glyphs on my i945GM.

The culprit is CAIRO_OPERATOR_ATOP, with CAIRO_OPERATOR_OVER everything works as expected.
Comment 24 Chris Wilson 2012-01-20 14:39:54 UTC
Thanks a lot for reducing the issue to a testcase. I would have never guessed ATOP would be involved. Time to watch it die miserably.
Comment 25 Chris Wilson 2012-01-20 15:05:56 UTC
As a sanity check, this is a CA fallback path and a lot has changed along that path over the last month or so. Does the current git still fail in the same manner, or is there a slight variation to the failure?
Comment 26 Clemens Eisserer 2012-01-20 15:25:06 UTC
I tested a git-snapshot from two days ago.

Trying to start the testcase with debug=full results in an assert:

Program received signal SIGABRT, Aborted.
0x005e2416 in __kernel_vsyscall ()
(gdb) bt
#0  0x005e2416 in __kernel_vsyscall ()
#1  0x49d4998f in __GI_raise (sig=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0x49d4b2d5 in __GI_abort () at abort.c:91
#3  0x49d426a5 in __assert_fail_base (
    fmt=0x49e82c48 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x52f89c "bo->base.exec || bo->base.refcnt > 1", 
    file=0x52f03c "kgem.c", line=1239, 
    function=0x530f4a "kgem_finish_partials") at assert.c:94
#4  0x49d42757 in __GI___assert_fail (
    assertion=0x52f89c "bo->base.exec || bo->base.refcnt > 1", 
    file=0x52f03c "kgem.c", line=1239, 
    function=0x530f4a "kgem_finish_partials") at assert.c:103
#5  0x0045b083 in kgem_finish_partials (kgem=0x9e01e50) at kgem.c:1239
#6  0x0045bef8 in _kgem_submit (kgem=0x9e01e50) at kgem.c:1491
#7  0x0049d10b in kgem_bo_submit (kgem=0x9e01e50, bo=0xa0d8d88) at kgem.h:242
#8  0x0049de6e in sna_read_boxes (sna=0x9e01d18, src_bo=0xa0e4e30, src_dx=0, 
    src_dy=0, dst=0x9ff8788, dst_dx=0, dst_dy=0, box=0x9fe1480, nbox=1)
    at sna_io.c:182
#9  0x0046502d in _sna_pixmap_move_to_cpu (pixmap=0x9ff8788, flags=2)
    at sna_accel.c:920
#10 0x00465820 in sna_drawable_move_region_to_cpu (drawable=0x9ff8788, 
    region=0xbfcaff44, flags=2) at sna_accel.c:1097
---Type <return> to continue, or q <return> to quit---
#11 0x0048a751 in sna_drawable_move_to_cpu (drawable=0x9ff8788, flags=2)
    at sna.h:465
#12 0x0048c32e in sna_composite (op=9 '\t', src=0xa182498, mask=0xa1b2320, 
    dst=0xa183e40, src_x=12, src_y=72, mask_x=0, mask_y=0, dst_x=12, dst_y=72, 
    width=49, height=23) at sna_composite.c:536
#13 0x0049ac15 in glyphs_via_mask (sna=0x9e01d18, op=9 '\t', src=0xa182498, 
    dst=0xa183e40, format=0x9e12e68, src_x=12, src_y=72, nlist=0, 
    list=0xbfcb0988, glyphs=0xbfcb0590) at sna_glyphs.c:917
#14 0x0049bbe8 in sna_glyphs (op=9 '\t', src=0xa182498, dst=0xa183e40, 
    mask=0x9e12e68, src_x=10, src_y=90, nlist=1, list=0xbfcb097c, 
    glyphs=0xbfcb057c) at sna_glyphs.c:1263
#15 0x08156cff in damageGlyphs (op=9 '\t', pSrc=0xa182498, pDst=0xa183e40, 
    maskFormat=0x9e12e68, xSrc=10, ySrc=90, nlist=1, list=0xbfcb097c, 
    glyphs=0xbfcb057c) at damage.c:647
#16 0x081437b1 in CompositeGlyphs (op=9 '\t', pSrc=0xa182498, pDst=0xa183e40, 
    maskFormat=0x9e12e68, xSrc=10, ySrc=90, nlist=1, lists=0xbfcb097c, 
    glyphs=0xbfcb057c) at glyph.c:604
#17 0x08150668 in ProcRenderCompositeGlyphs (client=0xa17ff08) at render.c:1440
#18 0x08149f74 in ProcRenderDispatch (client=0xa17ff08) at render.c:2063
#19 0x08076155 in Dispatch () at dispatch.c:447
#20 0x0806439a in main (argc=6, argv=0xbfcb0e14, envp=0xbfcb0e30) at main.c:287
P
Comment 27 Chris Wilson 2012-01-20 15:56:02 UTC
Hmm. A couple of possible explanations, either I've made a mistake in the handling of the dual-personality temporary string mask (which is meant to exist both in CPU and GPU domains) or I've made a mistake in the partial buffer management and left a stale object in the cache. The first explanation would also match the corruption you see as we would be replacing valid CPU data with uninitialised GPU data and then attempt to composite.

/me slaps himself.

I forgot I need to force ARGB text.
Comment 28 Chris Wilson 2012-01-20 16:29:30 UTC
I have to declare aside from the assertion failure, the handling of CA text fallback is just horrible of the worst order... I had dreamt of getting dual-source blending enabled by now, but that is still only gm45 and above. Oh well, after I find the cause of the assertion and the corruption, I'll kill this horrible render-via-gpu only to fallback.
Comment 29 Chris Wilson 2012-01-20 17:21:59 UTC
For once, the assertion was spot on!

commit 8a91c7d85740a5adc25d2a9b1972c367780ce714
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Jan 21 01:18:17 2012 +0000

    sna: Give kgem_create_buffer_2d a unique id
    
    The gen3, among others, backend uses the unique id of a buffer to track
    the currently attached buffer and uses 0 as the invalid value. Linear
    buffers as created by kgem_create_buffer_2d were not being assigned a
    unique id causing mayhem when they were then being passed to the
    backends as render targets and sources. In particular, gen3 did not
    notice the switch in render target and did not emit commands to change
    the GPU target nor attach the buffer to the batch, causing the
    sna_read_boxes to fail and for us to trigger an assertion for an
    uncomsumed read buffer.
    
    Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42718
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

That fixes the assertion and looks like it should fix the readback along the CA fallback path. (Triggering that readback is still evil.) gtkquirk seems to be happy enough that I'm calling it a night...
Comment 30 Clemens Eisserer 2012-01-23 03:00:50 UTC
Thanks!
gtkquirk and Audacious both seem happy now :)
Comment 31 Chris Wilson 2012-01-23 03:24:00 UTC
I'm marking this as closed. Be prepared to spot the regression when I optimize the readback! ;-)


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.