Bug 88561 - [radeonsi][regression,bisected] Depth test/buffer issues in Portal
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2015-01-18 17:34 UTC by Daniel Scharrer
Modified: 2015-05-23 20:47 UTC

Bad frame (493.81 KB, image/jpeg)
2015-01-18 17:34 UTC, Daniel Scharrer
Good frame (492.23 KB, image/jpeg)
2015-01-18 17:36 UTC, Daniel Scharrer
Vertex issues in TF2 (310.53 KB, image/jpeg)
2015-02-05 07:26 UTC, Daniel Scharrer
possible fix (3.03 KB, patch)
2015-02-19 12:12 UTC, Marek Olšák
Junk rendered in The Talos Principle (136.22 KB, image/jpeg)
2015-05-08 23:09 UTC, Daniel Scharrer

Description Daniel Scharrer 2015-01-18 17:34:27 UTC
Created attachment 112428 [details]
Bad frame

With current Mesa master (commit 461103ef64858e9d81073fad1bd8222b70b2754b), some background geometry is randomly drawn in front of foreground geometry for some frames in Portal (also observed in the Portal mod Rexaura). It does not happen often, and the same scene may render fine depending on what was loaded before, or seemingly at random.

Bisecting blames commit 02ba7334d35cf8182048c17a149b16f18104c6bf
Author: Marek Olšák <marek.olsak@amd.com>
Date:   Sun Jan 4 22:16:53 2015 +0100

    radeonsi: don't use TC L2 for updating descriptors on SI
    It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA
    when updating the same memory.
    The solution for SI is to use uncached access here, because CP DMA doesn't
    support cached access.
    CIK will be handled in the next patch.
    Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

Simply reverting this commit on master causes other rendering issues.
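
For readers unfamiliar with the hazard the commit message describes, here is a minimal toy sketch in Python. It is not radeonsi code and does not model the real hardware paths. It assumes a generic write-back cache in front of memory, and the names `Memory`, `WriteBackCache`, and `uncached_write` are hypothetical. It only illustrates why mixing cached and uncached writes to the same address can leave a stale value visible to a reader that goes through the cache, and why using uncached writes throughout avoids that in this model.

```python
# Toy model only: not radeonsi code. All names here are hypothetical.

class Memory:
    """Backing store (stands in for memory holding a descriptor)."""
    def __init__(self):
        self.cells = {}


class WriteBackCache:
    """Generic write-back cache in front of Memory (assumed stand-in for a cache like TC L2)."""
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # addr -> cached value

    def write(self, addr, value):
        # Cached write: lands in the cache, not yet in memory.
        self.lines[addr] = value

    def read(self, addr):
        # A hit returns the cached copy even if memory has changed since.
        if addr not in self.lines:
            self.lines[addr] = self.memory.cells.get(addr)
        return self.lines[addr]


def uncached_write(memory, addr, value):
    # Uncached write: goes straight to memory; the cache is not informed.
    memory.cells[addr] = value


def mixed_update():
    mem = Memory()
    cache = WriteBackCache(mem)
    cache.write(0x10, "descriptor v1")           # cached path
    uncached_write(mem, 0x10, "descriptor v2")   # uncached path, newer data
    return cache.read(0x10)                      # hit: still the old v1


def uncached_only_update():
    mem = Memory()
    cache = WriteBackCache(mem)
    uncached_write(mem, 0x10, "descriptor v1")
    uncached_write(mem, 0x10, "descriptor v2")
    return cache.read(0x10)                      # miss: fetches v2 from memory


print(mixed_update())
print(uncached_only_update())
```

In this model the mixed update reads back the older value, while the uncached-only update reads back the newer one. Whether the actual GPU misbehaves in exactly this way is not established in this report.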

GPU: Radeon HD 7950 (OpenGL renderer string: Gallium 0.4 on AMD TAHITI)
LLVM compiled from svn last week (r225584)
GCC: 4.8.4
Kernel: Linux 3.18.2-gentoo
Distro: Gentoo
Comment 1 Daniel Scharrer 2015-01-18 17:36:32 UTC
Created attachment 112429 [details]
Good frame
Comment 2 Marek Olšák 2015-02-04 11:56:18 UTC
Does the issue occur in any freely-available Steam game like Team Fortress 2 and DOTA 2? If not, could you please create an apitrace file reproducing the issue?
Comment 3 Daniel Scharrer 2015-02-05 07:26:21 UTC
Created attachment 113176 [details]
Vertex issues in TF2

I've re-tested Portal with updated mesa (commit 2335153ff2fae01d6294876a86d3eab59c6c4236) and kernel (3.18.5-gentoo) and the issue is still there.

There also seems to be a rendering regression with TF2 for me, but it looks different. Where in Portal objects appear in front of others that they should be behind, in TF2 they disappear entirely or appear to have garbage vertices. I haven't yet tested whether the TF2 issues were introduced by the same commit.

I'll create apitraces of both later today.
Comment 4 Daniel Scharrer 2015-02-05 15:33:26 UTC
Here is an apitrace from Portal that consistently triggers the bug for me (however, which frames are misrendered varies):

 http://constexpr.org/tmp/Portal-radeonsi.trace.xz (193 MiB)
Comment 5 Daniel Scharrer 2015-02-18 20:37:55 UTC
For some reason this happens a lot less frequently now than it used to. With

Mesa git-8a71fd8
LLVM r229671
Kernel 3.19.0-gentoo

I need to re-run the apitrace multiple times before triggering the bug, while before there were many bad frames per run.

There are also other rendering errors in Portal that look more like the ones I get in TF2 - I guess those are more likely to be related to bug 88978.

I'm not sure what changed - the bug still happens infrequently when reverting to either Linux 3.18.6-gentoo or to Mesa git-2335153. Either way, the bug is still there, just a lot harder to reproduce now.
Comment 6 Marek Olšák 2015-02-19 12:12:06 UTC
Created attachment 113659 [details] [review]
possible fix

Please test this patch. It seems to fix the bug for the apitrace.
Comment 7 Lorenzo Bona 2015-02-19 14:37:04 UTC
(In reply to Marek Olšák from comment #6)
> Created attachment 113659 [details] [review]
> possible fix
> Please test this patch. It seems to fix the bug for the apitrace.

Thank you Marek.

Your patch fixes this bug too: https://bugs.freedesktop.org/show_bug.cgi?id=88978
Comment 8 Daniel Scharrer 2015-02-19 19:25:27 UTC
Hi Marek,

the patch does improve things a lot: with it I can no longer reproduce any glitches in my Portal trace or in the Dota 2 trace from bug 88978 comment 2. However, in-game some glitches remain.

In Portal they are extremely infrequent with your patch; however, disappearing terrain patches and garbage vertices are still relatively easy to reproduce in Team Fortress 2. (Maybe this belongs in bug 88978?) I tried to record an apitrace of TF2, but it does not replay correctly. Just loading up a practice map (I tried Dustbowl) and walking around should be enough to reproduce it.
Comment 9 Daniel Scharrer 2015-05-08 23:09:47 UTC
Created attachment 115651 [details]
Junk rendered in The Talos Principle

This still happens in various Source engine games (and perhaps elsewhere), just not as frequently.

The trace of The Talos Principle from bug 87278 comment 29 has some garbage being rendered (screenshot attached) that looks similar to what I get in Source engine games, but it is much easier to reproduce:

 http://constexpr.org/tmp/Talos-radeonsi.3.trace.xz (147 MiB)

No idea if that is the same bug, or even if the current Source engine issue is related to the original Portal bug I bisected in this report.
Comment 10 Daniel Scharrer 2015-05-23 20:47:35 UTC
I think Marek's patch fixed the original Portal issue I bisected in radeonsi. The remaining glitches seem to be caused by an LLVM regression. I've added the relevant information to bug #88978.
