Bug 64323

Summary: Severe misrendering in Left 4 Dead 2
Product: Mesa Reporter: Bryan Cain <bryancain3+fdo>
Component: Drivers/DRI/nouveauAssignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium CC: alzeih, anakin.cs, bryancain3+fdo, chrisf
Version: git   
Hardware: x86 (IA32)   
OS: Linux (All)   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=64888
Whiteboard:
i915 platform: i915 features:
Attachments: Nuke all reffs of whole_discard introduced by original patch
hack/workaround by calim
a different workaround
a different workaround
a different workaround
yet another try (author imirkin)

Description Bryan Cain 2013-05-07 15:52:49 UTC
This should be filed under component "Drivers/Gallium/nouveau", but that doesn't exist for some reason.

In the nv50 driver, after loading a map in the Left 4 Dead 2 beta,  there are major rendering artifacts that make the game unplayable.  Portal shows a similar problem.

Noticing that stable versions before Mesa 9.1 don't have this problem, I bisected the regression to this commit:

http://cgit.freedesktop.org/mesa/mesa/commit/?h=48a45ec24ae74c0

commit 48a45ec24ae74c00d1487552e94d9f824a428f58
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date:   Tue Jan 8 16:13:11 2013 +0100

    nouveau: improve buffer transfers
    
    Save double memcpy on uploads to VRAM in most cases.
    Properly handle FLUSH_EXPLICIT.
    Reallocate on DISCARD_WHOLE_RESOURCE to avoid sync.

I don't understand the changes made by this commit well enough to fix the regression myself.  I have an apitrace that can be used to reproduce the issue, and will comment on this bug with a link to it as soon as I find a good way to share large (~512 MB) files online.
Comment 1 Bryan Cain 2013-05-07 16:55:43 UTC
Here is a much smaller (40 MB bz2'd, 74 MB uncompressed) trace that demonstrates the regression in the loading and title screen of Portal: http://plombofiles.pbworks.com/f/portal.trace.bz2

On the loading screen, the "Loading..." text in the bottom right corner is not shown after the regression:
http://plombofiles.pbworks.com/f/loading_bad.png
http://plombofiles.pbworks.com/f/loading_good.png

On the title screen, several frames have significant rendering artifacts after the regression:
http://plombofiles.pbworks.com/f/title_bad.png
http://plombofiles.pbworks.com/f/title_good.png
Comment 2 Emil Velikov 2013-06-20 00:29:49 UTC
A few notes

* Rendering is restored to normal by purging all references of DISCARD_WHOLE_RESOURCE from the original patch
* Performance drops by ~8% if dynamic buffers are allocated in vram[1]

[1] Tested using a minute long demo, of jumping through the first portal portal in the game
Comment 3 Emil Velikov 2013-06-27 00:26:38 UTC
Created attachment 81517 [details] [review]
Nuke all reffs of whole_discard introduced by original patch

Just to clarify what I've meant previously, I'm attaching the patch using which rendering appears correct - both during the loading screen and in-game.
Comment 4 Emil Velikov 2013-08-08 14:45:34 UTC
Created attachment 83839 [details] [review]
hack/workaround by calim

Attaching a patch/hack which calim shared a few weeks back. Restores rendering back to normal on my nv96 system
Comment 5 Emil Velikov 2013-08-08 14:48:04 UTC
Dota 2 exhibits the same issue (patches still seem to work)
Comment 6 Emil Velikov 2013-08-09 16:32:18 UTC
On 09/08/13 16:03, Christ-Jan Wijtmans wrote:
> how did you get dota2 to even start? it should not work with the nouveau
> drivers.
> 
Downloaded the game and pressed Start. I'm not sure what you're implying
there with "it should not work", but it works like a charm with the
patch mentioned. Only bugs I can noticed are in dota2 (i.e. they are
present with the blob, and ati/amd users :P).

Last tested with commit 6a2baf3a06125405aa648e208af782e53f1312c0~1,
because of [1].

Cheers
Emil

[1] https://bugs.freedesktop.org/show_bug.cgi?id=67887
> 
> Live long and prosper,
> 
> Christ-Jan Wijtmans
> http://facebook.com/cj.wijtmans
> http://twitter.com/cjwijtmans
> 
> 
> On Thu, Aug 8, 2013 at 4:48 PM, <bugzilla-daemon@freedesktop.org> wrote:
> 
>>   *Comment # 5 <https://bugs.freedesktop.org/show_bug.cgi?id=64323#c5> on bug
>> 64323 <https://bugs.freedesktop.org/show_bug.cgi?id=64323> from Emil
>> Velikov <emil.l.velikov@gmail.com> *
>>
>> Dota 2 exhibits the same issue (patches still seem to work)
>>
>>  ------------------------------
>> You are receiving this mail because:
>>
>>    - You are the assignee for the bug.
>>
>>
>> _______________________________________________
>> Nouveau mailing list
>> Nouveau@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/nouveau
>>
>>
> 
> 
> 
> _______________________________________________
> Nouveau mailing list
> Nouveau@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau
>
Comment 7 AnAkkk 2013-10-28 11:59:01 UTC
I'm getting similar rendering issues with a 8800 GT card in all Source Engine games (tested in Team Fortress 2, Counter-Strike Source and Dota 2), but I'm not sure if it's the same bug.
Mine is similar to the one reported in Valve's tracker:
https://github.com/ValveSoftware/Source-1-Games/issues/200
There's a youtube video with show the exact same problem I am experiencing:
https://www.youtube.com/watch?v=XLtl7VqnXoE

I'm using ArchLinux.
xf86-video-nouveau 1.0.9-1
nouveau-dri 9.2.2-1
libdrm 2.4.47-1
linux 3.11.6-1
mesa 9.2.2-1

Should I open another bug report, or is this the same problem?
Comment 8 Emil Velikov 2013-10-28 22:25:21 UTC
(In reply to comment #7)
> I'm getting similar rendering issues with a 8800 GT card in all Source
> Engine games (tested in Team Fortress 2, Counter-Strike Source and Dota 2),
> but I'm not sure if it's the same bug.
> Mine is similar to the one reported in Valve's tracker:
> https://github.com/ValveSoftware/Source-1-Games/issues/200
> There's a youtube video with show the exact same problem I am experiencing:
> https://www.youtube.com/watch?v=XLtl7VqnXoE
> 
You're experiencing exactly the same issue.

> Should I open another bug report, or is this the same problem?
> 
That should not be needed.

This issue is kind of a sore spot, as even calim (the developer who wrote the gallium driver) is unsure why it occurs.
I'm suspecting some sort of a race/missing fence, although that's wild guess. Note that nvc0+ cards seems to render fine with this commit - so either there is some subtle difference between the nv50/c0 codebase or hardware.

Any input(comments, ideas, patches) would be appreciated :)
Comment 9 AnAkkk 2013-10-28 23:22:09 UTC
Is there anything I can do to help ?
Comment 10 AnAkkk 2013-11-25 13:10:11 UTC
Still happening in 3.13.
Comment 11 Maarten Lankhorst 2013-11-26 11:46:25 UTC
A suspiciously similar bug for nvc0 was fixed with 72295c5f6715555082455072bfd29a1e5ae545de

nvc0: restore viewport after blit

I think you need to set viewport in nv50_blit_3d in the same way nvc0_blit_3d does, and then restore the old viewport afterwards in similar to the referenced commit. I didn't fix it at the time for nv50, because it wasn't setting the viewport. It looks like this was a bug.
Comment 12 Ilia Mirkin 2013-11-28 21:27:17 UTC
Created attachment 89965 [details] [review]
a different workaround

Not sure if this is the right thing to do. Would be interesting to see what impact this has on the framerate, but it does seem to fix the rendering issues, at least with the portal trace in comment 1. (Should be applied separately from any other workaround/patch.)
Comment 13 Ilia Mirkin 2013-11-28 23:25:25 UTC
Created attachment 89968 [details] [review]
a different workaround

actually looks like nouveau_fence_wait does more work than is absolutely necessary. check if it's signalled first.
Comment 14 Ilia Mirkin 2013-11-30 04:17:44 UTC
Created attachment 90011 [details] [review]
a different workaround

OK, I played around with a lot of different options. It actually turns out that my last patch leaks fences. And it waits on fence instead of fence_wr. The new version fixes all these issues.

Sticking in a semaphore acquire instead helped a bit, but the rendering was still not right. Even if I used screen->fence.current. So wait it must be :(
Comment 15 AnAkkk 2013-11-30 12:42:53 UTC
I've tested the last patch on TF2, it fixes the issue. I've done some benchmarking and there is some performance loss though:


Without the patch: avg 62 fps
With the patch: avg 45 fps
(with and without done thrice)


Mesa 9.2.4
Comment 16 Ilia Mirkin 2013-11-30 13:37:24 UTC
(In reply to comment #15)
> I've tested the last patch on TF2, it fixes the issue. I've done some
> benchmarking and there is some performance loss though:
> 
> 
> Without the patch: avg 62 fps
> With the patch: avg 45 fps
> (with and without done thrice)
> 
> 
> Mesa 9.2.4

Well that's a big drop... Of course it's not exactly a decrease since it doesn't work otherwise. Could you try the other suggested patches and see what their performance impact is?
Comment 17 AnAkkk 2013-11-30 15:04:28 UTC
workaround_calim.patch: 46 fps
0001-wait-for-the-buffer-to-be-done.patch: 34 fps
0001-nv50-wait-on-the-buf-s-fence-before-sticking-it-into.patch: 37 fps

What about Maarten Lankhorst's suggestion?
Comment 18 AnAkkk 2013-11-30 16:02:37 UTC
Two others benchmarks shows 44 and 48 fps with your latest patch (had 45 before, as written above).
I'm not sure, but *maybe* it was more laggy than calim patch (like the getting stuck from time to time, as if I had 1 fps). Could just be a placebo though.
Comment 19 Ilia Mirkin 2013-11-30 20:20:26 UTC
Perhaps I was unclear. I was hoping you would test each of the currently-listed patches under "attachments" individually. i.e. 

https://bugs.freedesktop.org/attachment.cgi?id=90011
https://bugs.freedesktop.org/attachment.cgi?id=83839
https://bugs.freedesktop.org/attachment.cgi?id=81517

After testing each one, unapply it (you can use "git checkout" or "patch -R" depending on your setup).

Thanks!
Comment 20 AnAkkk 2013-11-30 20:29:00 UTC
I did test these two individually:
https://bugs.freedesktop.org/attachment.cgi?id=90011
https://bugs.freedesktop.org/attachment.cgi?id=83839

Both gave around 46 fps.

There's just this last one I haven't tested yet:
https://bugs.freedesktop.org/attachment.cgi?id=81517

I'll test it later.
Comment 21 AnAkkk 2013-11-30 22:09:09 UTC
That one:
https://bugs.freedesktop.org/attachment.cgi?id=81517
is the best one, I have same performance with it as without any patch (62 fps).
Comment 22 Ilia Mirkin 2013-11-30 22:53:45 UTC
Just want to confirm a theory --

can you test

https://bugs.freedesktop.org/attachment.cgi?id=81517
https://bugs.freedesktop.org/attachment.cgi?id=90011

*together* (i.e. apply both patches). They shouldn't conflict. My theory is that this will have the same "higher" performance.
Comment 23 AnAkkk 2013-11-30 23:16:18 UTC
Indeed, I still have ~62 FPS with both patches.
Comment 24 Emil Velikov 2013-12-01 00:43:06 UTC
*** Bug 64888 has been marked as a duplicate of this bug. ***
Comment 25 Emil Velikov 2013-12-01 17:01:32 UTC
Created attachment 90056 [details] [review]
yet another try (author imirkin)

Here are some numbers while running dota2. Game is up-to date as of time of this post.

* recorded a timedemo using "record timedemo1"
* executed via "steam -applaunch 570 +con_logfile timedemo1.log +cl_showfps 2 +timedemoquit timedemo1"

mesa master (fb5f5b81883f360dcbbf407a0f6f5606bc0c0495)

* master
903 frames 45.906 seconds 19.67 fps (50.84 ms/f) 2.519 fps variability
Only in-game corruption, flickering textures. HUD is rendered fine

* master + attached patch
903 frames 56.210 seconds 16.06 fps (62.25 ms/f) 0.696 fps variability
No visible corruption or flicker at all.

* master + 81517 patch
903 frames 70.074 seconds 12.89 fps (77.60 ms/f) 0.974 fps variability
Only in-game corruption, flickering textures. Mostly plain colour or rainbow texture. HUD _is affected by the flicker_
Note: it used to work previously...

The rainbow texture is part of some courier special effects. Those effect were not used in this match. Some memory corruption maybe ?

* master + revert commit 48a45ec24ae74c00d1487552e94d9f824a428f58
903 frames 68.625 seconds 13.16 fps (76.00 ms/f) 0.319 fps variability
No visible corruption or flicker at all.
Comment 26 AnAkkk 2013-12-01 17:14:51 UTC
I'd suggest you launch the game normally and use timedemo_runcount 3 + timedemo xx instead. I've seen that the first time you play the demo with timedemo, it is usually less accurate (probably because the game hasn't loaded yet most resources in memory).
For example I'd sometimes get 55 FPS the first time, then a constant 62 every time after.
Comment 27 Emil Velikov 2013-12-01 18:29:01 UTC
(In reply to comment #26)
> I'd suggest you launch the game normally and use timedemo_runcount 3 +
> timedemo xx instead. I've seen that the first time you play the demo with
> timedemo, it is usually less accurate (probably because the game hasn't
> loaded yet most resources in memory).
> For example I'd sometimes get 55 FPS the first time, then a constant 62
> every time after.

Despite using the same engine dota2, is rather different from the other source titles - tf2, cs:s, l4d2. It pre-loads (almost?) every resource required.

Thanks for the timedemo_runcount tip, although dota2 is a bit short in that regard. Running a couple sequential runs, using "timedemoquit" provides rather consistent results.
Comment 28 AnAkkk 2013-12-04 21:36:10 UTC
There's now a commit with the older patch which does the wait and reduce FPS:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=0e5bf8565106833e1a678ebdae81fdf1657391c9

Shouldn't we have the latest one instead (mess around with buffer transfers only when needed) ?
Comment 29 Ilia Mirkin 2013-12-04 21:43:21 UTC
The wait is correct and always needed. I believe that the last patch is actually incorrect. I've added a whole bunch of comments to that function which reflect my understanding of how it works. It appears to be correct.

I think what's happening is that

(a) inline write to GART. no staging, just returns the buffer's memory
(b) something happens
(c) inline write to GART. at this point, the GPU is now reading the buffer. so it has to do the whole staging dance to schedule the write to be done by the GPU.
(d) render happens, and has to wait for that second inline write to complete.

I have yet to figure out what (b) is -- I did some printf-debugging and I'm fairly sure about parts (a) and (c). But by the time (c) happened, the buffer's status had the GPU_READING bit set, and it had a ->fence set on it.

I made yet-another attempt at improving this by forcing the second inline write to wait for the gpu to stop reading and then do the GART copy directly, but that didn't seem to help.
Comment 30 Emil Velikov 2014-01-13 02:28:30 UTC
The fix is present in the 9.2, 10 and master branches. Thanks for the great work Ilia
Comment 31 AnAkkk 2014-01-13 09:39:05 UTC
Are you sure this should be considered fixed? The misrendering issue is fixed, but there's still a problem whose Ilia was talking about in his last post (which is the cause for performance issues after the patch ?).
Comment 32 Ilia Mirkin 2014-01-13 09:44:40 UTC
The issue is resolved. There are various optimizations that one might attempt, but the code path in question apparently _has_ to wait for the GPU to stop writing to that buffer. Any optimization would avoid situations where the GPU is writing to that buffer. Something like valid ranges would be helpful -- I understand that r600 has them, perhaps we can copy them. But this would be strictly an optimization.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.