Bug 81551

Summary: [dri3] Mesa does not support explicit fencing
Product: xorg
Reporter: Jan Alexander Steffens (heftig) <jan.steffens>
Component: Driver/intel
Assignee: Chris Wilson <chris>
Status: RESOLVED WONTFIX
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: a9016009, bugs, bugzilla, drago01, eero.t.tamminen, eugene.shalygin+bugzilla.FDO, fademind, felixonmars, frederic.romagne, hamer.mk, inform, jak, jana, jean-louis, joe, mail, mavoga, naelphin, pierre, sndirsch, tobias.johannes.klausmann
Version: git
Hardware: All
OS: Linux (All)
See Also: https://bugzilla.gnome.org/show_bug.cgi?id=733397
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 79629
Attachments: Neuter explicit fencing (flags: none)

Description Jan Alexander Steffens (heftig) 2014-07-19 20:24:54 UTC
From https://bugzilla.gnome.org/show_bug.cgi?id=733397:

gnome-shell 3.12.2
mutter 3.12.2
clutter 1.18.2
cogl 1.18.2
mesa 10.2.3
xorg-server 1.16.0
xf86-video-intel 2.99.912-246-g4ae346e
linux 3.15.6

Sometimes the screen isn't updated when it should be. I've seen this happen
when scrolling pagewise using the keyboard in gnome-terminal, gedit or Firefox.
The previous content is retained until something else (such as another scroll,
moving the window or visible activity in another window) causes an update.

This started happening since upgrading xorg-server from 1.15.2 to 1.16.0, which
consequently enabled DRI3. It happens in both mutter and gnome-shell. Disabling
DRI3 for the compositor using LIBGL_DRI3_DISABLE=1 makes the problem disappear.
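For reference, a minimal sketch of applying that workaround from a launcher script. Only the variable LIBGL_DRI3_DISABLE=1 comes from this report; the helper name env_without_dri3 and the "mutter --replace" command line in the comment are illustrative assumptions.

```python
import os


def env_without_dri3(base_env=None):
    """Return a copy of the environment with LIBGL_DRI3_DISABLE=1 set.

    env_without_dri3 is a hypothetical helper name, not part of any library.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LIBGL_DRI3_DISABLE"] = "1"  # the variable named in this report
    return env


# Example usage (not executed here; the command line is an assumption):
# import subprocess
# subprocess.Popen(["mutter", "--replace"], env=env_without_dri3())
```

The variable is set only for the compositor process, so other clients keep using DRI3.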

---

When switching from SNA to UXA or Glamor, the compositor won't even start without LIBGL_DRI3_DISABLE. It immediately gets an X BadAlloc error.

This is on a Thinkpad X220 (SNB).
Comment 1 Chris Wilson 2014-07-20 06:18:40 UTC
Have you implemented fences before reading from DRI3 objects yet?
Comment 2 drago01 2014-07-20 06:36:57 UTC
(In reply to comment #1)
> Have you implemented fences before reading from DRI3 objects yet?

No idea whom you mean by "you" but here is the IRC discussion from yesterday:

<drago01> keithp: hi, around?
<keithp> drago01: just for a few minutes
<drago01> keithp: https://bugzilla.gnome.org/show_bug.cgi?id=733397 
<drago01> keithp: seems like with dri3 we no longer have serialized x/gl drawing
<keithp> could be the video driver's fault
<drago01> keithp: so is there any way to fix it without x sync fences (that aren't implemented iirc)
<keithp> uh, x sync fences are required for DRI3
<drago01> does the driver / xserver support them?
<drago01> (no xserver 1.16 setup here to check)
<keithp> the mesa driver expects them, so it depends on whether the 2D driver does them correctly
<keithp> if there aren't sync fences, DRI3 won't work at all
<keithp> might be good to try UXA vs SNA and see if there's a difference
<keithp> in the intel driver
<keithp> I've only ever tested with UXA
<keithp> and gnome-shell was working last time I tried
<keithp> (which was a while ago)
<drago01> by x fences I mean GL_EXT_x11_sync_object which we have patches for but not applied yet
<keithp> oh, we "shouldn't" need those for intel as the 2D driver should be flushing stuff out before delivering damage events
<drago01> ok
<keithp> no difference from DRI2
<drago01> ok will ask the reporter to test uxa (or glamor?)
<keithp> glamor, not so much
<keithp> uxa is a better bet
<drago01> ok
<drago01> keithp: 
<drago01> <heftig> drago01: with glamor or uxa, mutter won't even start without LIBGL_DRI3_DISABLE
<drago01> <heftig> it will immediately get a X BadAlloc error
<drago01> (asked him to just file a driver bug for now)
<keithp> sounds like something is seriously busted; gnome works fine here
<drago01> Jasper has seen some similar redrawing issues too
<Jasper> drago01, keithp: it seems like damage events are sent before the pixmap is actually drawn to.
<drago01> Jasper: with uxa too?
<Jasper> drago01, haven't tested with UXA
Comment 3 Chris Wilson 2014-07-20 07:16:00 UTC
DRI3 relies on explicit fencing.

'<keithp> oh, we "shouldn't" need those for intel as the 2D driver should be flushing stuff out before delivering damage events'

is wrong, per usual.
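
A toy model of the race being discussed may help readers who are new to it. This is only a sketch under stated assumptions: ToyPixmap, x_draw, gpu_execute and composite are hypothetical names, and nothing here is the X server, driver or Mesa code. It shows a reader that samples the pixmap on the damage event getting stale content unless it first waits on a fence.

```python
class ToyPixmap:
    """Hypothetical stand-in for a pixmap shared between X and a compositor."""

    def __init__(self):
        self.visible = "old"        # what a reader would sample right now
        self.queued = None          # rendering queued but not yet executed
        self.fence_signaled = True  # no outstanding rendering initially


def x_draw(pixmap, content):
    """Queue rendering and (conceptually) deliver the damage event at once."""
    pixmap.queued = content
    pixmap.fence_signaled = False


def gpu_execute(pixmap):
    """Run the queued rendering and signal the fence."""
    if pixmap.queued is not None:
        pixmap.visible = pixmap.queued
        pixmap.queued = None
    pixmap.fence_signaled = True


def composite(pixmap, wait_for_fence):
    """Sample the pixmap, optionally waiting for the fence first."""
    if wait_for_fence and not pixmap.fence_signaled:
        gpu_execute(pixmap)  # stand-in for blocking until the fence signals
    return pixmap.visible
```

In this model, composite(p, False) right after x_draw(p, "new") returns the previous content, which matches the stale-screen symptom described in the report.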
Comment 4 drago01 2014-07-20 07:23:00 UTC
(In reply to comment #3)
> DRI3 relies on explicit fencing.
> 
> '<keithp> oh, we "shouldn't" need those for intel as the 2D driver should be
> flushing stuff out before delivering damage events'
> 
> is wrong, per usual.

You mean we need fencing in the compositor or can this be done inside the 3D driver? As for doing it in the compositor mesa does not seem to implement GL_EXT_x11_sync_object.
Comment 5 drago01 2014-07-20 07:23:13 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > DRI3 relies on explicit fencing.
> > 
> > '<keithp> oh, we "shouldn't" need those for intel as the 2D driver should be
> > flushing stuff out before delivering damage events'
> > 
> > is wrong, per usual.
> 
> You mean we need fencing in the compositor or can this be done inside the 3D
> driver? As for doing it in the compositor mesa does not seem to implement
> GL_EXT_x11_sync_object.

s/3D driver/2D driver/
Comment 6 Chris Wilson 2014-07-20 07:27:45 UTC
It could be completely neutered inside X by removing the explicit sync model in DRI3.
Comment 7 Chris Wilson 2014-07-22 07:43:19 UTC
Created attachment 103262
Neuter explicit fencing
Comment 8 Alex Deucher 2014-07-22 13:55:38 UTC
*** Bug 79949 has been marked as a duplicate of this bug. ***
Comment 9 Aliaksandr Stelmachonak 2014-07-23 04:05:59 UTC
I had refresh issues with GTK applications after updating to xorg 1.16 with DRI3 enabled. After applying this patch to xf86-video-intel-2.99.912, the issue is gone. Thanks!
Comment 10 Jan Alexander Steffens (heftig) 2014-07-23 07:24:42 UTC
The patch also resolves my issues.
Comment 11 Maarten Lankhorst 2014-07-23 07:58:25 UTC
When applying the workaround:

< RAOF> mlankhorst: ~ppa10 works fine, ta.
< RAOF> mlankhorst: And when I say “works fine” what I actually mean “is no longer frustratingly buggy, now merely sometimes and temporarily leaves stale rendering around”
< mlankhorst> RAOF: yeah try with dri 2?
< RAOF> Yeah, DRI2 worked fine.
< RAOF> To be honest, the stale rendering in DRI3 is now only really noticeable because I'm looking for it.
Comment 12 Tobias Klausmann 2014-07-23 13:41:50 UTC
*** Bug 81401 has been marked as a duplicate of this bug. ***
Comment 13 Chris Wilson 2014-07-24 14:29:59 UTC
*** Bug 80613 has been marked as a duplicate of this bug. ***
Comment 14 Tobias Jakobi 2014-08-14 13:40:19 UTC
Hmm, so bug #80613 (a duplicate of this one) mentions that the issues always appear after a suspend/resume cycle.

I was wondering if my bug #81548 is also a consequence of these fencing issues.

Anyway, is there going to be a permanent solution to this problem? Seems like the patch by Chris hasn't landed upstream, so I guess it's considered more like a workaround at the moment.
Comment 15 Andreas Kloeckner 2014-11-16 18:10:26 UTC
I'm having a hard time wrapping my head around this bug. It seems to entail that everyone who installs a reasonably recent (Xorg >= 1.16, video-intel >= 2.99) system with an otherwise unchanged configuration will get an unusable system by default. (Unusable in the sense of missing screen updates left and right.)

What I'm having trouble understanding is the combination of the following three:

- This seems kind of severe, as I described above--unless I'm misunderstanding something.
- There seems to be a bit of communication breakdown, as Chris called Keith "wrong as usual" (again, unless I'm misreading something here...)
- Nobody has paid attention to this since July/August.

Again, I'm an outsider and don't want to cause grief, but if someone could explain what the situation is, I'd very much appreciate it. Thanks!
Comment 16 Chris Wilson 2014-12-09 08:09:11 UTC
We lose. Mesa is not going to gain support for interclient fencing anytime soon.

commit fc984e8953d61901b255422c8f56eb79a2dd2a28
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jul 22 08:38:42 2014 +0100

    sna/dri3: Mesa relies upon implicit fences for X/Compositor synchronisation
    
    The decision has been made that DRI3/intel shall continue with DRI2-style
    implicit fencing for synchronisation between X and clients using pixmaps
    as texture sources. (The other way around uses explicit fencing!)
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=81551
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 17 Chris Wilson 2014-12-09 08:09:31 UTC
*** Bug 83101 has been marked as a duplicate of this bug. ***
Comment 18 Chris Wilson 2014-12-09 08:09:44 UTC
*** Bug 84108 has been marked as a duplicate of this bug. ***
Comment 19 Chris Wilson 2015-02-25 10:11:05 UTC
fwiw the overhead from implicit fencing is ~40% (on byt, bdw-u) in CPU-bound rendering workloads (as the implicit fencing prevents the driver from batching up rendering commands).
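
To illustrate why implicit fencing defeats batching, here is a toy model. It is a sketch only: ToyDriver and count_submissions are hypothetical names, the submission counts are illustrative, and it does not reproduce the ~40% figure. The assumption is that with implicit fencing every draw must be submitted before its damage is reported, while with explicit fencing commands can accumulate until a fence is triggered.

```python
class ToyDriver:
    """Hypothetical 2D driver that queues commands and submits them in batches."""

    def __init__(self, implicit_fencing):
        self.implicit_fencing = implicit_fencing
        self.pending = []
        self.submissions = 0

    def flush(self):
        """Submit all pending commands as one batch, if there are any."""
        if self.pending:
            self.submissions += 1
            self.pending = []

    def draw(self, command):
        """Queue a command; damage is reported after every draw."""
        self.pending.append(command)
        if self.implicit_fencing:
            # Readers have no fence to wait on, so submit before reporting damage.
            self.flush()

    def trigger_fence(self):
        """Explicit model: submit only when a client asks for a fence."""
        self.flush()


def count_submissions(implicit_fencing, draws):
    """Count batches submitted for `draws` commands followed by one fence."""
    driver = ToyDriver(implicit_fencing)
    for i in range(draws):
        driver.draw(f"cmd{i}")
    driver.trigger_fence()
    return driver.submissions
```

With 100 draws, the implicit model submits 100 batches and the explicit model submits one.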
Comment 20 William Enright 2016-09-08 16:27:47 UTC
Okay, so I'm a serious newbie. I see that a patch is posted for this bug, but as a newbie I have absolutely no idea how to apply it. If someone would be so kind as to help me out...

William
