Bug 52299

Summary: [gm45 sna] loimpress slideshow hangs with kwin + fullscreen GL desktop effects
Product: xorg Reporter: sergio.callegari
Component: Driver/intel Assignee: Chris Wilson <chris>
Status: RESOLVED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description                     Flags
xorg log                        none
Test case                       none
Snapshot of LO initial screen   none
Snapshot of LO writer           none

Description sergio.callegari 2012-07-20 09:55:00 UTC
Hi,

I am encountering weird behavior with the latest intel drivers and SNA acceleration on an older chipset.

Hardware:
Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller

Software:
Kubuntu linux 64bit 12.04
Kernel 3.4.6 
(from ubuntu mainline PPA)

libdrm 2.4.37 + git 120716
mesa 8.1 + git 120608
xserver-video-intel 2.19 + git 120720
(this latter stuff from the oibaf graphics drivers PPA, aka Fabio Pedretti's PPA)

This bug gets exposed by libreoffice

To reproduce:
1) Put accelmethod sna in xorg.conf
2) Run libreoffice impress with its own hardware acceleration turned off (turning it on exposes a huge number of different issues that are probably due to libreoffice alone).
3) Play a presentation that has a fading effect between the slides
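
For step 1, a minimal sketch of the relevant xorg.conf stanza, generated from the shell so it can be reviewed before being installed. The "AccelMethod" option is the one documented for the intel driver; the identifier string and the helper name `intel_device_section` are assumptions made for this example.

```shell
# Print a minimal Device section that selects the given acceleration method.
# "Intel Graphics" is an arbitrary identifier chosen for this sketch.
intel_device_section() {
    cat <<EOF
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "$1"
EndSection
EOF
}

# Show the stanza for SNA; pass "uxa" to get the fallback configuration.
intel_device_section sna
```

Redirecting the output into /etc/X11/xorg.conf (as root) and restarting X should select SNA; passing uxa instead gives the setup that this report describes as working.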

Actual result:
1) Stepping from the 1st slide to the 2nd is OK, and at times so is stepping from the 2nd to the 3rd.
2) From a certain point on, the screen stops updating when one clicks to move to the next slide.
3) Interestingly, libreoffice /does/ move to the next slide. For instance, if the display stops updating at slide 3 and you click to move to slide 4 and then to slide 5, when you press esc to exit the full screen presentation mode, libreoffice /is/ at slide 5.

Reverting to uxa acceleration on the same driver fixes the issue.
Comment 1 Chris Wilson 2012-07-20 10:18:43 UTC
And to check, which version of cairo?
Comment 2 Chris Wilson 2012-07-20 10:20:34 UTC
Also if you can attach an example presentation, that will save me the trouble of creating one :)
Comment 3 Chris Wilson 2012-07-20 10:22:51 UTC
fading == 'Fade Smoothly'?
Comment 4 Chris Wilson 2012-07-20 10:35:24 UTC
Hmm, would also like to see your Xorg.log to clarify versions.
Comment 5 sergio.callegari 2012-07-20 10:44:18 UTC
Libcairo is 1.10.2-6.1ubuntu3 that is 1.10.2 with ubuntu patches.

WRT the sample presentation, I can try to set one up for you soon, but unfortunately not right now...

The effect is 'fade through black' (once I have set up the English user interface)

xorg log in the next msg as soon as I restart it with SNA
Comment 6 sergio.callegari 2012-07-20 10:50:44 UTC
Created attachment 64423 [details]
xorg log
Comment 7 sergio.callegari 2012-07-20 10:52:07 UTC
Created attachment 64424 [details]
Test case

Lunch break.... managed to put together the short test case too.
Comment 8 Chris Wilson 2012-07-20 11:18:14 UTC
Thanks for the presentation, it just rules out something peculiar to the slideshow.

So far it has worked flawlessly for me, with kwin compositing enabled and without a compositing wm. Anything else interesting in your configuration?

I take it you are not in a position to be able to install a debug driver?
Comment 9 sergio.callegari 2012-07-20 13:15:06 UTC
My config should not be particularly weird.

I have kde 4.8.4 and kwin with the opengl compositing on.  And I have an external screen attached to the VGA port, working with lvds off.  Libreoffice is 3.6RC1, but - as I said - it is fine with UXA.

When I finish some urgent work, I'll test without compositing.

And over the weekend, I should be able to test the debug edition of xserver-xorg-intel, since I have noticed that Fabio Pedretti is kind enough to package that too.
What should I look at?
Comment 10 Chris Wilson 2012-07-20 13:33:58 UTC
Doesn't appear to be GL effects either. And I'm using a similar setup (external DVI with LVDS off). Hmm.

And you are able to reproduce this almost at will..

If you get the chance to run with --enable-debug=full that will generate a huge logfile for me to look at and see if I can spot what is happening when it stops responding. A plain --enable-debug package turns on lots of assertions, so would also be useful to test if that's available.
Comment 11 sergio.callegari 2012-07-20 21:58:37 UTC
Made some additional tests, without debug package yet.

1) I have a video in case you want to see what happens. It's a bit big, though, so let me know if it might be useful before I upload.

2) Bug is triggered by kwin desktop effects (compositing). Without the effects, no bug.

3) Bug disappears if I configure kwin to disable desktop effects with fullscreen windows. Maybe this is your configuration.

4) Without desktop effects, or with desktop effects disabled for fullscreen windows, not only does the bug disappear, but the presentation quality improves a lot, becoming completely smooth. With the desktop effects on there is something that looks like tearing.
Comment 12 sergio.callegari 2012-07-20 22:28:30 UTC
Can reproduce also on:

EEEPC 1000H (Atom 32 bit)
with intel 945GM express (Gen3)

Ubuntu 12.04

Similar graphics stack as the bigger brother DELL E6500 (Fabio Pedretti's PPA with recent git stuff), but with the following differences:

1) Kernel 3.2 (stock ubuntu) instead of 3.4
2) Unity instead of KDE
3) libreoffice 3.5.4 instead of 3.6RC1

The bug just shows up a bit less frequently.
Comment 13 Chris Wilson 2012-07-20 23:00:38 UTC
(In reply to comment #11)
> Made some additional tests, without debug package yet.
> 
> 1) I have a video in case you want to see what happens. It's a bit big, though,
> so let me know if it might be useful before I upload.

If it is just the screen stops redrawing (even though the slides are advancing), then I think not, as a static image doesn't give much more information. :)
> 
> 2) Bug is triggered by kwin desktop effects (compositing). Without the effects,
> no bug.
> 
> 3) Bug disappears if I configure kwin to disable desktop effects with
> fullscreen windows. Maybe this is your configuration.

Indeed it was.

> 4) Without desktop effects, or with the disable desktop effects with fullscreen
> windows, not only does the bug disappear, but the presentation quality improves
> a lot becoming completely smooth. With the desktop effects on there is something
> looking like tearing.

With debugging enabled, there is an obvious black frame between the static image and the fade from A to black to B (i.e. it goes blank, then restores the original image and fades to B).

So it looks like I have the same setup now (kwin + fullscreen GL effects), but no bug as of yet...

Maybe your video will be useful after all, to try and reproduce your steps exactly.
Comment 14 Chris Wilson 2012-07-21 13:08:51 UTC
I still haven't encountered this. Any chance you can compile your own driver with debugging enabled (--enable-debug=full) and attach the Xorg.log?
Comment 15 sergio.callegari 2012-07-23 13:05:12 UTC
Sorry to ask questions that may be naive, but I have never done this before.

Do I need to rebuild the intel driver alone (namely the package xserver-xorg-video-intel on ubuntu) or that plus the xorg framework (namely xorg-server) to get the required debug info?
Comment 16 Chris Wilson 2012-07-23 13:39:38 UTC
Just the xf86-video-intel is all that is required.

Try:
$ sudo apt-get build-dep xserver-xorg-video-intel
$ git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-intel
$ cd xf86-video-intel
$ ./autogen.sh --prefix=/usr --enable-debug=full
$ make && sudo make install
Comment 17 sergio.callegari 2012-07-23 14:35:21 UTC
Ok, I've followed a slightly different path.

Rebuilt the ubuntu package into a new ubuntu package with --enable-debug=full.

This means that I have rebuilt the very same package that I was using, namely the intel driver with all the git stuff up to 20/7/2012.  And it also means that I can easily re-install the package with the debug on whenever I need it.

For those interested, my package is in the Sergio Callegari ubuntu PPA https://launchpad.net/~callegar/+archive/ppa/+packages.

The xorg log that I am attaching was obtained by starting X, starting libreoffice with the test presentation that is also attached here, activating the presentation, letting it show the problem, and closing libreoffice.

Here it is:

https://docs.google.com/open?id=0B0owM_i9wf0CSzZiSjRYOWhqXzg

I cannot attach it inside the bug tracker because it is quite big.

Does this help?
Comment 18 Chris Wilson 2012-07-23 15:13:10 UTC
Well I only see loimpress draw a single frame of that presentation (judging by the rendering commands I see locally). How many frames do you think it drew, and how many did it advance by?
Comment 19 sergio.callegari 2012-07-23 15:17:57 UTC
Started the presentation. LO made slide one appear on screen, then I clicked and nothing, then I clicked and nothing, then I clicked and nothing, then I pressed esc to exit the presentation and LO was at slide 4.

Tested again with kwin configured to disable effects on full-screen and LO is happy and draws all the slides.
Comment 20 sergio.callegari 2012-07-23 15:24:04 UTC
Tried Apache Openoffice 3.4 too.

It does almost the same. With kwin set up to disable the effects in fullscreen mode, it is happy. With kwin set up to leave the effects enabled in fullscreen mode it shows some slides (typically 1 or 2), then the screen stops updating. However, at this point when you press esc to exit openoffice it typically crashes.
Comment 21 sergio.callegari 2012-07-24 10:21:59 UTC
More funny bits...

I have discovered this:  if I

1) Let kwin keep the effects on in fullscreen so that the issue can manifest
2) Start libreoffice on the test presentation
3) Launch the presentation
4) Keep moving to the next slide until the bug manifests
5) When the slide does not update, press ALT-F2 so that kwin gives me a little command line, and use it to issue some xrandr command switching between the external monitor and the laptop screen or vice versa

then, after the screen switch has completed, the screen gets updated to the new slide while still staying in full screen presentation mode.
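
The xrandr switch in step 5 might look roughly like the following. The output names VGA1 and LVDS1 are assumptions (the names reported by `xrandr --query` on the affected machine should be used instead), and the hypothetical helper `switch_cmd` only prints the command so that it can be pasted into the ALT-F2 prompt.

```shell
# Hypothetical output names; list the real ones with: xrandr --query
EXTERNAL=${EXTERNAL:-VGA1}
INTERNAL=${INTERNAL:-LVDS1}

# Print (do not run) the xrandr command that enables output $1 at its
# preferred mode and turns output $2 off.
switch_cmd() {
    printf 'xrandr --output %s --auto --output %s --off\n' "$1" "$2"
}

switch_cmd "$INTERNAL" "$EXTERNAL"   # external monitor -> laptop panel
switch_cmd "$EXTERNAL" "$INTERNAL"   # laptop panel -> external monitor
```

Printing rather than executing keeps the sketch harmless on machines whose outputs are named differently.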

Do you think that this may be a bug in SNA, or maybe a bug in libreoffice that gets triggered by SNA (say, by slightly different timings on things that SNA is imposing)? Should I cross post this bug to the LO and AOO mailing lists?
Comment 22 Chris Wilson 2012-07-30 13:41:37 UTC
Hmm, I wonder...

commit e6cb5d93eaa01e7f4763f797bba341f3cc481d98
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jul 30 11:14:58 2012 +0100

    sna: Avoid overlapping gpu/cpu damage with IGNORE_CPU
    
    We cannot simply ignore the presence of CPU damage with IGNORE_CPU but
    must remember to discard it.

Can you please do a quick test with master?
Comment 23 sergio.callegari 2012-07-30 20:18:33 UTC
Hi,

I am now running a package of xserver-xorg-video-intel which is labelled

git1207301126.d3499c

Which I guess refers to commit d3499c dated 12/07/30.  Seems pretty recent, but I do not know if it precedes or follows the commit being mentioned in the latest post.

In any case, it still has the issue.

Probably I will not be able to do any test for the next week or so... I'll post some news right after that.
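
Whether a packaged snapshot already contains a given commit can be checked from a clone of the driver repository. A small sketch, assuming a git new enough to support `merge-base --is-ancestor`; the helper name `contains_fix` is made up for this example, and the abbreviated hash taken from the package label may need to be lengthened if git cannot resolve it.

```shell
# Succeeds (exit status 0) if commit $1 is an ancestor of commit $2,
# i.e. if a build made from $2 already includes the change in $1.
# Must be run inside a clone of the repository in question.
contains_fix() {
    git merge-base --is-ancestor "$1" "$2"
}

# Example, inside a clone of xf86-video-intel (hashes as quoted in this thread):
#   contains_fix e6cb5d93eaa01e7f4763f797bba341f3cc481d98 d3499c \
#       && echo "fix included" || echo "fix not included"
```

An exit status of 1 means the snapshot predates the commit; any other non-zero status usually means that one of the names could not be resolved.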
Comment 24 Chris Wilson 2012-07-31 08:37:01 UTC
It was a long shot, as I still can't see where it was even trying to render the missing updates...
Comment 25 sergio.callegari 2012-08-06 19:52:54 UTC
Managed to make a quick test on an EEEPC 1000H (gen3 with Mobile 945GSE Express Integrated Graphics Controller) after receiving an updated deb package with the intel video driver post 2.20.2 dated 3/8/12 at git commit 146959.

Apparently things have regressed even further.

With UXA everything is fine as usual.

With SNA and unity 2D things are still fine, but there is a lot of flickering in the presentation effects.

With SNA and unity 3D, libreoffice is now incapable of rendering even the first slide in presentation mode. The screen remains at some very light shade of gray with no image on it.
Comment 26 Chris Wilson 2012-08-08 10:42:51 UTC
Still not seeing the issue you describe in unity (3d); unity-2d's problems are of its own making, at least compared to either xfce4 or awesome.
Comment 27 sergio.callegari 2012-08-21 16:47:38 UTC
Started working on the gen4 machine again. Now on 2.20.4.

Lack of screen update now manifests also in other applications (e.g. firefox, thunderbird). The screen is partially rendered and the software is quiescent (no sign of it still downloading something). After switching to the console with CTRL+ALT+F1 and back to X11 with CTRL+ALT+F7, the screen is updated.

I do not know whether I am just paying more attention to the issue or whether it is new in 2.20.4.
Comment 28 Chris Wilson 2012-08-21 17:58:42 UTC
This is just really frustrating. It is one of those cases where if I could spend an afternoon playing with the bug, I could fix it. But until I have the opportunity to experiment, I'm none the wiser as to where exactly it is manifesting. :|
Comment 29 sergio.callegari 2012-08-22 18:09:12 UTC
It is also frustrating for me not to be able to provide a reproducible test case...  If I can find enough time for it, I'll try building a minimal setup like mine to see if the bug is present or not.
Comment 30 sergio.callegari 2012-08-23 17:39:28 UTC
I think that apart from the LO presentation, 2.20.4 indeed has some further regression wrt 2.20.2 when using SNA acceleration and Gen4 with Kwin and effects on.  In many apps, I am now experiencing missing screen updates:

1) Editors where pressing a key does not make the character appear, but pressing another key makes the current character and the previous one appear simultaneously.

2) Firefox where parts of the screen only get updated when one moves the mouse over them or scrolls.

3) Textboxes where the text cursor disappears

Before vacation I could work on SNA without problems (apart from the LO thing), while now I need to be back on UXA.

If I can find some test case where (at least on my machine) some action can invariably trigger rendering failure, I'll open another bug for that.  So far, I am only sending this to you as very preliminary information.
Comment 31 Chris Wilson 2012-08-23 17:56:31 UTC
Yes, a missing flush of the DRI pixmaps (for GL compositors like kwin) crept in:

commit fc6b7f564df88ca773ae245b1b4e278b47dffd59
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Aug 23 15:13:14 2012 +0100

    sna: Flush the batch if it contains any DRI pixmaps
    
    This fixes a regression from
    
    commit 02963f489b177d0085006753e91e240545933387
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Sun Aug 19 15:45:35 2012 +0100
    
        sna: Only submit the batch if flushing a DRI client bo
    
    which made the presumption that we called sna_add_flush_pixmap() for
    every DRI pixmap that we used. However, that is only called for the
    dirty pixmaps, any native exported pixmap only marks the batch as
    requiring a flush. So in those cases we always need to submit the batch
    if it contains an exported DRI pixmap.
    
    Reported-by: chr.ohm@gmx.net
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53967
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 32 Chris Wilson 2012-08-26 20:49:32 UTC
In the meantime, please do keep me informed of any testing you do. Any changes, good or bad, may give an insight as to where the root cause is.
Comment 33 sergio.callegari 2012-08-27 10:24:11 UTC
Downloaded a more recent xserver-xorg-video-intel. I am now at a version packaged on 24/8 and marked as git 454cc8 (sna: Submit the partial batch before throttling).

Missing updates on firefox/thunderbird and more that I reported recently are in fact gone.

However, I am still experiencing the missing update with the LO presentation.

Furthermore, I have a new rendering problem, again with Libreoffice and Openoffice (either the LO 3.6.0 from the libreoffice site or the Apache OO 3.4.0 from the apache site).

Again, it looks like a wrong/missing update. Icons are only partially rendered. Hovering with the mouse makes their appearance improve.

Please look at the two attachments.
Comment 34 sergio.callegari 2012-08-27 10:25:11 UTC
Created attachment 66169 [details]
Snapshot of LO initial screen

Some buttons/icons are rendered wrongly.
Comment 35 sergio.callegari 2012-08-27 10:25:54 UTC
Created attachment 66170 [details]
Snapshot of LO writer

Toolbars are rendered wrong
Comment 36 Chris Wilson 2012-08-27 10:28:25 UTC
That looks more like a missed cache flush. Quite possible since we no longer flush after every single rectangle that the original code was missing some required flushes.
Comment 37 Chris Wilson 2012-08-27 14:57:04 UTC
TIL that typing loimpress into the KDE start menu does different things than launching it through the menu.

Haven't seen the wholesale corruption you have in your toolbar, but I am catching the odd corrupt toolbar icon (like half is uninitialised garbage), only when using kwin opengl effects. That suggests an issue along the DRI serialization path, or some missing damage. That it is not trivially reproducible lends credence to it being a timing and/or damage flush issue between the gl compositor and X.
Comment 38 sergio.callegari 2012-08-27 15:15:28 UTC
Apparently, also launching from the command line with --nologo or even with --writer as an option reduces the probability of seeing the mis-rendered icons/toolbar. Yet, at times the issue manifests even with these options.
Comment 39 Chris Wilson 2012-08-27 17:36:24 UTC
I can see the icon corruption on both 965 and 945 (pineview), so it is unlikely to be anything chipset specific, i.e. not a missing flush inside the GPU. More likely it is some unpushed damage to the compositor.
Comment 40 Chris Wilson 2012-08-27 19:55:30 UTC
Found the issue with my broken icons:

commit 26c731efc2048663b6a19a7ed7db0e94243ab30f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Aug 27 20:50:08 2012 +0100

    sna: Ensure that we create a GTT mapping for the inplace upload buffer
    
    As the code will optimistically convert a request for a GTT mapping into
    a CPU mapping if the object is still in the CPU domain, we need to
    overrule that in this case where we explicitly want to write directly
    into the GTT and furthermore keep the buffer around in an upload cache.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=51422
    References: https://bugs.freedesktop.org/show_bug.cgi?id=52299
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Can you please test and see if that clears up any of your issues?
Comment 41 sergio.callegari 2012-08-28 11:37:03 UTC
Tested...  the driver is built in my ubuntu PPA for reference.

Unfortunately the issues (misrendered icons, misrendered toolbar, presentation not advancing) are still there.
Comment 42 Chris Wilson 2012-08-28 20:42:10 UTC
(In reply to comment #41)
> Tested...  the driver is built in my ubuntu PPA for reference.
> 
> Unfortunately the issues (misrendered icons, misrendered toolbar, presentation
> not advancing) are still there.

Yeah, after more testing I came to the conclusion that it was just the placebo effect. :(
Comment 43 Chris Wilson 2012-08-28 21:27:37 UTC
Ok, this seems to be better:

commit deaa1cac269be03f4ec44092f70349ff466d59de
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Aug 28 22:23:22 2012 +0100

    sna: Align active upload buffers to the next page for reuse
    
    If we write to the same page as it already active on the GPU then
    despite the invalidation performed at the beginning of each batch, we do
    not seem to correctly sample the new data.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=51422
    References: https://bugs.freedesktop.org/show_bug.cgi?id=52299
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 44 sergio.callegari 2012-08-29 11:51:24 UTC
Yes, I confirm that the latest git fixes the wrong startup icons and the toolbar in libreoffice/openoffice!

Thank you for the quick fix!

Now only the initial problem remains: the Libreoffice/openoffice presentation does not advance if the window manager is configured to have the desktop effects on and not to disable them in fullscreen mode.
Comment 45 Chris Wilson 2012-08-29 12:13:17 UTC
Meh, in reality it wasn't a quick fix, but that issue had been lurking since around January. Thanks for confirming the fix. And now back to trying to trigger the loimpress failure again.

Are all your packages, except for the gfx stack, from precise?
Comment 46 sergio.callegari 2012-08-30 20:59:48 UTC
All is from precise, but for the following:

1) Virtualbox - from virtual box deb repository
   this should not matter

2) Maxima, Xmaxima, WXMaxima - from the blahota-wxmaxima ppa
   this should not matter

3) Asymptote - from my callegar-asymptote ppa
   this should not matter

4) Samsung spp printer driver - from my cupsdrivers (oneiric) ppa
   this should not matter

5) Qxmledit - from my callegar-qxmledit (oneiric) ppa
   this should not matter

6) The Obnam backup tool and its dependencies - from chris-bigballofwax-obnam-ppa ppa
   this should not matter

7) Git from the git-core ppa
   this should not matter

8) kde 4.9 from the kubuntu-ppa-backports ppa
   this may matter

9) A few entries from the kubuntu-ppa
   this may matter - I can provide the detailed list if necessary

10) librecad from the librecad-dev-librecad-stable ppa
   this should not matter

11) the updated graphics stack from the oibaf-graphics-drivers ppa
   this surely matters

12) openshot from the openshot.developers ppa
   this should not matter

13) recoll from the recoll-backports-recoll-1.15-on ppa
   this should not matter

14) texlive 2012 from the texlive-backports ppa
   this should not matter

15) wine 1.5 from the ubuntu-wine ppa
   this should not matter

16) vala 0.16 from the vala-team ppa
   this should not matter

17) rekonq 1.0 from the yoann-laissus-rekonq ppa
   this should not matter

18) a few things that should not matter from the ubuntu official precise backports
   this should not matter

19) a couple of items from medibuntu
   this should not matter

20) xpra from winswitch
   this should not matter

21) google chrome and the talk plugin from google
   this should not matter

22) wuala from wuala
   this should not matter

23) jitsi from jitsi
   this should not matter

24) the draftsight cad (i386) from dassault systemes
   this should not matter

25) atlas libraries recompiled by myself to be optimized for my specific system
   this should not matter

26) libreoffice 3.6 from the libreoffice site
   this may matter - it is not libreoffice as distributed by ubuntu, but libreoffice from the libreoffice site

27) linux 3.5.3 from the ubuntu mainline ppa
   this may matter

28) nsp and scicoslab matlab clones from the scicoslab site
   this should not matter

29) nxclient from nomachine
   this should not matter

30) skype from skype
   this should not matter

ALL the above is in deb packages...

Furthermore I have some stuff in opt... but this really should not matter...

Nothing significant (libs) in local...

Let me know if you need more details on something...
Comment 47 sergio.callegari 2012-08-30 21:01:30 UTC
In the previous msg I forgot to mention that the machine has never been re-installed since the time of ubuntu hardy... only upgraded... so that there might be some cruft around...
Comment 48 sergio.callegari 2012-09-10 10:05:07 UTC
I think I finally have some news on this...

Looks like it is specific to libreoffice builds from libreoffice.org and openoffice builds from apache.

I tried downgrading to the ubuntu libreoffice and the presentation now works. Either with kwin having compositing on in fullscreen or not.

But libreoffice.org and apache builds do not.

I have also noticed that libreoffice and apache builds incorporate many libraries that get used in place of system libraries. Hence, probably all boils down to this. Either:

1) The libraries distributed by libreoffice/apache have some incompatibilities with the ubuntu precise mesa (but how does this happen?). Or

2) The libraries distributed by libreoffice/apache impose some slightly different timings of things and the stopping presentation is related to the timing of events.

Maybe it is 1.  As a matter of fact, I downgraded libreoffice to the ubuntu precise debs (even if they are very old and buggy), just to test since I was experimenting and I had noticed that the libreoffice from libreoffice.org cannot work with the soft renderer. Namely, LIBGL_ALWAYS_SOFTWARE=1 makes libreoffice complain that it cannot load opengl support...
Comment 49 Chris Wilson 2012-09-10 10:11:01 UTC
Can you check whether libreoffice bundles cairo in their build? In particular
there was a regression in cairo-1.12.0:

commit 9e81c5b737cda9dc539b2cf497c20ac48ddb91ac
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Apr 25 20:41:16 2012 +0100

    xlib: Allow applications to create 0x0 surfaces
    
    Although 0x0 is not a legimate surface size, we do allow applications
    the flexibility to reset the size before drawing. As we previously never
    checked the size against minimum legal constraints, applications expect
    to be able to create seemingly illegal surfaces, and so we must continue
    to provide backwards compatibility.
    
    Many thanks to Pauli Nieminen for trawling through the protocol traces,
    diving into the depths of libreoffice and identifying the regression.
    
    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=49118 (presentation
    mode in loimpress is blank).
    
    Reported-by: Eric Valette <eric.valette@free.fr>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

The symptoms didn't seem to match completely, but maybe...
Comment 50 sergio.callegari 2012-09-10 10:26:43 UTC
libcairo.so.2 is in...

I am not sure which version. I tried strings on it and a 1.10.2 comes out that might be it.
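
The `strings` check described above can be narrowed to version-like lines with a small filter. This is only a heuristic sketch: the helper name `version_strings` is hypothetical, the library path in the usage comment is an assumption about where the bundled copy lives, and a match such as 1.10.2 is a candidate version rather than proof.

```shell
# Read text lines on stdin and keep only those that consist solely of a
# dotted three-part number (e.g. 1.10.2), without duplicates.
version_strings() {
    grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' | sort -u
}

# Usage (path is an assumption; point it at the bundled library):
#   strings /opt/libreoffice3.6/program/libcairo.so.2 | version_strings
```

If several candidates come out, comparing them against the cairo release history is still a manual step.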
Comment 51 sergio.callegari 2012-09-10 10:45:06 UTC
Some more info... unfortunately possibly irrelevant...

by moving the libstdc++.so.6 distributed with libreoffice (using the libreoffice.org build) out of the way (so that the system one is used) libreoffice can now use the software pipe for opengl.

With this, no hangs.

Also, I have learned that there is an opengl debug option in libreoffice: LIBGL_DEBUG=verbose. With this during the presentation libreoffice complains all the time that

glGetError is no-op
glGetError is no-op
glGetError is no-op
glGetError is no-op
glGetError is no-op

either with the soft or the hard pipe.

moving the libcairo shipped with libreoffice out of the way makes no difference.
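
The experiments in this comment can be repeated along the following lines. LIBGL_ALWAYS_SOFTWARE and LIBGL_DEBUG are environment variables read by Mesa's libGL; the install prefix /opt/libreoffice3.6 and the helper name `bundled_libs` are assumptions made for this sketch and need adjusting to the actual installation.

```shell
# List bundled copies of a library below an install prefix.
#   $1: install prefix to search (hypothetical example: /opt/libreoffice3.6)
#   $2: file name glob, e.g. 'libstdc++.so*' or 'libcairo.so*'
bundled_libs() {
    find "$1" -name "$2" 2>/dev/null | sort
}

# Usage (not executed here):
#   bundled_libs /opt/libreoffice3.6 'libstdc++.so*'
#   # after moving a bundled copy aside, force the software renderer
#   # and ask libGL for verbose diagnostics:
#   LIBGL_ALWAYS_SOFTWARE=1 LIBGL_DEBUG=verbose soffice --impress
```

Moving a bundled library aside rather than deleting it makes the experiment easy to revert.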
Comment 52 sergio.callegari 2012-09-10 10:57:29 UTC
BTW... it is completely unclear to me why libreoffice is actually loading i965_dri.so or swrast_dri.so since I use it configured not to use hardware acceleration (that is anyway completely broken on linux).

As a matter of fact, one more difference between the libreoffice.org and the ubuntu builds of libreoffice is that in the former you can configure hardware acceleration, while in the latter it is permanently disabled at compile time.
Comment 53 Chris Wilson 2012-09-13 17:29:57 UTC
Interesting and very scary that it seems to boil down to a particular version of libstdc++. This is a library only used by libreoffice and not kwin etc? At the moment, I'm leaning towards this being a libreoffice build issue and perhaps they might know a little more. Can you raise a bug report in libreoffice and cross-link?
Comment 54 sergio.callegari 2012-09-13 17:46:37 UTC
Well, not exactly...

With the system library I can run the libreoffice.org libreoffice with the llvmpipe renderer, which gives no hang in the presentation. With the libreoffice.org packaged libstdc++ I can only run the hardware renderer... which hangs during presentations only when using sna, with either the system libstdc++ or the libreoffice.org one.

On the ubuntu libreoffice, where /all/ the libraries are system libraries, it looks like I can use sna with no hangs.

I have posted the thing to the libreoffice.org bug tracker.  I am worried that mine will look like a corner case to them...  Please see https://bugs.freedesktop.org/show_bug.cgi?id=54725

As a last note...

With the latest driver, the bug manifests in a slightly different way on my eeepc (the gen3 machine with the unity window manager). Now the hang does not occur when trying to pass to the next slide, but just before the rendering of the current slide is over (namely I often get the background forming through the slow gradient effect, and then the presentation hanging before all the text is rendered onto the slide). Unless something relevant has changed in the drivers, this seems to suggest that even minor changes may affect how the bug manifests, as if it were quite sensitive to timings or something like that.
Comment 55 sergio.callegari 2012-10-29 13:08:34 UTC
Quite amazing.

I have updated to ubuntu quantal (12.10) and this bug is gone.

It is gone both for gen4 (dell E6500) and gen3 (eeepc 1000H).

It is amazing, because since I used git versions of libdrm, the xorg intel graphics and mesa, I am actually running the same versions of those that I used to run before.  The same goes with the kernel.

I think that the only thing that has actually changed is the xorg infrastructure.

What remains of the older bug is the following:

- when compositing is switched off, or is off at least for the full screen windows, the transition between slides is OK

- when compositing is switched on (even for full screen windows), gen 4 with kde temporarily inserts a black frame at the beginning of the transition between the slides. Gen3 with unity does not (only a bit of flicker).

Obviously all this is just quite minor in comparison to the previous blocking presentation.
Comment 56 sergio.callegari 2012-11-19 15:08:30 UTC
Issue is back.

With Linux 3.6.7, and mesa 9.1 devel (git snapshot 19/11/2012) as in the oibaf ppa.

This looks like some timing problem, so that very minor things make the issue appear and disappear...

I am sure that it will go unnoticed by most... since kde now disables effects for fullscreen windows by default.
Comment 57 sergio.callegari 2012-11-19 15:09:46 UTC
I forgot to say... tested on gen4
Comment 58 Chris Wilson 2012-11-21 23:46:03 UTC
There is an outside chance this is related to:

commit 9ab1d1f94e502e5fde87e7c171f3502f8a55f22b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 20 18:42:58 2012 +0000

    sna/dri: Queue a vblank-continuation after flip-completion
    
    If a vblank request was delayed due to a pending flip, we need to make
    sure that we then queue it after that flip or else progress cease

Can you please test with -intel.git?
Comment 59 Chris Wilson 2012-11-26 13:44:16 UTC
patch is in 2.20.14 for testing
Comment 60 Chris Wilson 2012-12-16 14:14:29 UTC
Sergio, any news?
Comment 61 Chris Wilson 2012-12-30 10:48:55 UTC
Assuming fixed by the vblank queue fixes.
Comment 62 sergio.callegari 2013-01-10 15:58:36 UTC
Sorry for remaining silent... the spam filter all of a sudden decided that it did not like this thread and I missed the last messages. I only realized it today.

The latest git seems to fix the issue for me!!! I have tested with a couple of presentations with effects on and off for fullscreen windows and I could play all of them fine. Should it happen again I'll reopen the bug.

Many thanks and let me take the occasion to wish you a great 2013!
Comment 63 Chris Wilson 2013-01-10 16:11:20 UTC
Thanks for the update! Please do let me know if you encounter any other issues.
