This should be filed under component "Drivers/Gallium/nouveau", but that doesn't exist for some reason.
In the nv50 driver, after loading a map in the Left 4 Dead 2 beta, there are major rendering artifacts that make the game unplayable. Portal shows a similar problem.
Noticing that stable versions before Mesa 9.1 don't have this problem, I bisected the regression to this commit:
Author: Christoph Bumiller <email@example.com>
Date: Tue Jan 8 16:13:11 2013 +0100
nouveau: improve buffer transfers
Save double memcpy on uploads to VRAM in most cases.
Properly handle FLUSH_EXPLICIT.
Reallocate on DISCARD_WHOLE_RESOURCE to avoid sync.
I don't understand the changes made by this commit well enough to fix the regression myself. I have an apitrace that can be used to reproduce the issue, and will comment on this bug with a link to it as soon as I find a good way to share large (~512 MB) files online.
Here is a much smaller (40 MB bz2'd, 74 MB uncompressed) trace that demonstrates the regression in the loading and title screen of Portal: http://plombofiles.pbworks.com/f/portal.trace.bz2
On the loading screen, the "Loading..." text in the bottom right corner is not shown after the regression:
On the title screen, several frames have significant rendering artifacts after the regression:
A few notes
* Rendering is restored to normal by purging all references of DISCARD_WHOLE_RESOURCE from the original patch
* Performance drops by ~8% if dynamic buffers are allocated in vram
 Tested using a minute long demo, of jumping through the first portal portal in the game
Created attachment 81517 [details] [review]
Nuke all reffs of whole_discard introduced by original patch
Just to clarify what I've meant previously, I'm attaching the patch using which rendering appears correct - both during the loading screen and in-game.
Created attachment 83839 [details] [review]
hack/workaround by calim
Attaching a patch/hack which calim shared a few weeks back. Restores rendering back to normal on my nv96 system
Dota 2 exhibits the same issue (patches still seem to work)
On 09/08/13 16:03, Christ-Jan Wijtmans wrote:
> how did you get dota2 to even start? it should not work with the nouveau
Downloaded the game and pressed Start. I'm not sure what you're implying
there with "it should not work", but it works like a charm with the
patch mentioned. Only bugs I can noticed are in dota2 (i.e. they are
present with the blob, and ati/amd users :P).
Last tested with commit 6a2baf3a06125405aa648e208af782e53f1312c0~1,
because of .
> Live long and prosper,
> Christ-Jan Wijtmans
> On Thu, Aug 8, 2013 at 4:48 PM, <firstname.lastname@example.org> wrote:
>> *Comment # 5 <https://bugs.freedesktop.org/show_bug.cgi?id=64323#c5> on bug
>> 64323 <https://bugs.freedesktop.org/show_bug.cgi?id=64323> from Emil
>> Velikov <email@example.com> *
>> Dota 2 exhibits the same issue (patches still seem to work)
>> You are receiving this mail because:
>> - You are the assignee for the bug.
>> Nouveau mailing list
> Nouveau mailing list
I'm getting similar rendering issues with a 8800 GT card in all Source Engine games (tested in Team Fortress 2, Counter-Strike Source and Dota 2), but I'm not sure if it's the same bug.
Mine is similar to the one reported in Valve's tracker:
There's a youtube video with show the exact same problem I am experiencing:
I'm using ArchLinux.
Should I open another bug report, or is this the same problem?
(In reply to comment #7)
> I'm getting similar rendering issues with a 8800 GT card in all Source
> Engine games (tested in Team Fortress 2, Counter-Strike Source and Dota 2),
> but I'm not sure if it's the same bug.
> Mine is similar to the one reported in Valve's tracker:
> There's a youtube video with show the exact same problem I am experiencing:
You're experiencing exactly the same issue.
> Should I open another bug report, or is this the same problem?
That should not be needed.
This issue is kind of a sore spot, as even calim (the developer who wrote the gallium driver) is unsure why it occurs.
I'm suspecting some sort of a race/missing fence, although that's wild guess. Note that nvc0+ cards seems to render fine with this commit - so either there is some subtle difference between the nv50/c0 codebase or hardware.
Any input(comments, ideas, patches) would be appreciated :)
Is there anything I can do to help ?
Still happening in 3.13.
A suspiciously similar bug for nvc0 was fixed with 72295c5f6715555082455072bfd29a1e5ae545de
nvc0: restore viewport after blit
I think you need to set viewport in nv50_blit_3d in the same way nvc0_blit_3d does, and then restore the old viewport afterwards in similar to the referenced commit. I didn't fix it at the time for nv50, because it wasn't setting the viewport. It looks like this was a bug.
Created attachment 89965 [details] [review]
a different workaround
Not sure if this is the right thing to do. Would be interesting to see what impact this has on the framerate, but it does seem to fix the rendering issues, at least with the portal trace in comment 1. (Should be applied separately from any other workaround/patch.)
Created attachment 89968 [details] [review]
a different workaround
actually looks like nouveau_fence_wait does more work than is absolutely necessary. check if it's signalled first.
Created attachment 90011 [details] [review]
a different workaround
OK, I played around with a lot of different options. It actually turns out that my last patch leaks fences. And it waits on fence instead of fence_wr. The new version fixes all these issues.
Sticking in a semaphore acquire instead helped a bit, but the rendering was still not right. Even if I used screen->fence.current. So wait it must be :(
I've tested the last patch on TF2, it fixes the issue. I've done some benchmarking and there is some performance loss though:
Without the patch: avg 62 fps
With the patch: avg 45 fps
(with and without done thrice)
(In reply to comment #15)
> I've tested the last patch on TF2, it fixes the issue. I've done some
> benchmarking and there is some performance loss though:
> Without the patch: avg 62 fps
> With the patch: avg 45 fps
> (with and without done thrice)
> Mesa 9.2.4
Well that's a big drop... Of course it's not exactly a decrease since it doesn't work otherwise. Could you try the other suggested patches and see what their performance impact is?
workaround_calim.patch: 46 fps
0001-wait-for-the-buffer-to-be-done.patch: 34 fps
0001-nv50-wait-on-the-buf-s-fence-before-sticking-it-into.patch: 37 fps
What about Maarten Lankhorst's suggestion?
Two others benchmarks shows 44 and 48 fps with your latest patch (had 45 before, as written above).
I'm not sure, but *maybe* it was more laggy than calim patch (like the getting stuck from time to time, as if I had 1 fps). Could just be a placebo though.
Perhaps I was unclear. I was hoping you would test each of the currently-listed patches under "attachments" individually. i.e.
After testing each one, unapply it (you can use "git checkout" or "patch -R" depending on your setup).
I did test these two individually:
Both gave around 46 fps.
There's just this last one I haven't tested yet:
I'll test it later.
is the best one, I have same performance with it as without any patch (62 fps).
Just want to confirm a theory --
can you test
*together* (i.e. apply both patches). They shouldn't conflict. My theory is that this will have the same "higher" performance.
Indeed, I still have ~62 FPS with both patches.
*** Bug 64888 has been marked as a duplicate of this bug. ***
Created attachment 90056 [details] [review]
yet another try (author imirkin)
Here are some numbers while running dota2. Game is up-to date as of time of this post.
* recorded a timedemo using "record timedemo1"
* executed via "steam -applaunch 570 +con_logfile timedemo1.log +cl_showfps 2 +timedemoquit timedemo1"
mesa master (fb5f5b81883f360dcbbf407a0f6f5606bc0c0495)
903 frames 45.906 seconds 19.67 fps (50.84 ms/f) 2.519 fps variability
Only in-game corruption, flickering textures. HUD is rendered fine
* master + attached patch
903 frames 56.210 seconds 16.06 fps (62.25 ms/f) 0.696 fps variability
No visible corruption or flicker at all.
* master + 81517 patch
903 frames 70.074 seconds 12.89 fps (77.60 ms/f) 0.974 fps variability
Only in-game corruption, flickering textures. Mostly plain colour or rainbow texture. HUD _is affected by the flicker_
Note: it used to work previously...
The rainbow texture is part of some courier special effects. Those effect were not used in this match. Some memory corruption maybe ?
* master + revert commit 48a45ec24ae74c00d1487552e94d9f824a428f58
903 frames 68.625 seconds 13.16 fps (76.00 ms/f) 0.319 fps variability
No visible corruption or flicker at all.
I'd suggest you launch the game normally and use timedemo_runcount 3 + timedemo xx instead. I've seen that the first time you play the demo with timedemo, it is usually less accurate (probably because the game hasn't loaded yet most resources in memory).
For example I'd sometimes get 55 FPS the first time, then a constant 62 every time after.
(In reply to comment #26)
> I'd suggest you launch the game normally and use timedemo_runcount 3 +
> timedemo xx instead. I've seen that the first time you play the demo with
> timedemo, it is usually less accurate (probably because the game hasn't
> loaded yet most resources in memory).
> For example I'd sometimes get 55 FPS the first time, then a constant 62
> every time after.
Despite using the same engine dota2, is rather different from the other source titles - tf2, cs:s, l4d2. It pre-loads (almost?) every resource required.
Thanks for the timedemo_runcount tip, although dota2 is a bit short in that regard. Running a couple sequential runs, using "timedemoquit" provides rather consistent results.
There's now a commit with the older patch which does the wait and reduce FPS:
Shouldn't we have the latest one instead (mess around with buffer transfers only when needed) ?
The wait is correct and always needed. I believe that the last patch is actually incorrect. I've added a whole bunch of comments to that function which reflect my understanding of how it works. It appears to be correct.
I think what's happening is that
(a) inline write to GART. no staging, just returns the buffer's memory
(b) something happens
(c) inline write to GART. at this point, the GPU is now reading the buffer. so it has to do the whole staging dance to schedule the write to be done by the GPU.
(d) render happens, and has to wait for that second inline write to complete.
I have yet to figure out what (b) is -- I did some printf-debugging and I'm fairly sure about parts (a) and (c). But by the time (c) happened, the buffer's status had the GPU_READING bit set, and it had a ->fence set on it.
I made yet-another attempt at improving this by forcing the second inline write to wait for the gpu to stop reading and then do the GART copy directly, but that didn't seem to help.
The fix is present in the 9.2, 10 and master branches. Thanks for the great work Ilia
Are you sure this should be considered fixed? The misrendering issue is fixed, but there's still a problem whose Ilia was talking about in his last post (which is the cause for performance issues after the patch ?).
The issue is resolved. There are various optimizations that one might attempt, but the code path in question apparently _has_ to wait for the GPU to stop writing to that buffer. Any optimization would avoid situations where the GPU is writing to that buffer. Something like valid ranges would be helpful -- I understand that r600 has them, perhaps we can copy them. But this would be strictly an optimization.