Summary: | Amnesia/HPL2 Demo: Strange graphical bugs on r600g | ||
---|---|---|---|
Product: | Mesa | Reporter: | Maggioni Marcello <hayarms> |
Component: | Drivers/Gallium/r600 | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | jlp.bugs, sa |
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Screenshot
game log Screenshot No problem Problem proposed patch simple test app Further tests Correct test 2nd patch |
Created attachment 46493 [details]
game log
I also tried to lower every graphical setting to the minimum, but the bug remains. Created attachment 46494 [details]
Screenshot
I uploaded the wrong screenshot the first time
Yeah, I'm having the same problem. The Amnesia devs posted a nifty compatibility test for the game which can be used to to quickly recreate these bugs (the black bar seems to start at test 5). It isn't available from their site any longer so I uploaded it here: http://dl.dropbox.com/u/28577999/RendererFeatTestRound2-Linux32.zip The game (or at least the RenderFeatTest tool) is rendering correctly with llmvpipe. I tried the RenderTest and it seemed to work fine (the SSAO test was completely white though) . I tried the latest git today and then tested Amnesia again. The situation is improved. The game is not bug free, but now the black area in the bottom of the screen has been substituted by a white area that is present only under certain lighting conditions (I've attached two screenshots , one in which the problem isn't present and one in which it is). Also, when present the "disturbance" flickers at a very high frequency, so it's quite difficult to capture it in a screenshot (you catch it while it is appearing or disappearing on the screen very often), so it is bigger than the one that can be seen in the screenshot. Created attachment 47127 [details]
No problem
This is the screenshot where the problem is not present.
Created attachment 47128 [details]
Problem
This is the screenshot where the problem is present
An update. I noticed that in fullscreen the problem is not present with the new drivers (no corruption at the bottom of the screen) , but the problem with the white dots persists. I also noticed that these dots are in random locations, but they surround point light sources like candles. I thought they were random, because you can see them trough walls , so the candles generating the dots were not directly visible, but behind the wall and I haven't immediatly connected this problem with the lights. Created attachment 47621 [details] [review] proposed patch Could you please try this patch ? It should fix the issue. Hi, after applying the patch the problem seems to be still there ... my card is a Radeon 5850. The lights have a white halo surrounding them flickering. Also , the white area at the bottom of the screen problem is still present in fullscreen mode. It was just scene dependent and I thought it was gone because I checked in a scene in which it wasn't present. Created attachment 47657 [details]
simple test app
Weird, here the patch solves the white-dot-around-lights issue. I'm using a HD4850, maybe that's why...
I've attached a small test app, which exhibits the problem fixed by this patch. Without the patch applied it fails almost every time (the failure is immediate).
Could you please compile it and launch it a few times and tell me :
1) if it fails without the patch
2) if so, does the patch brings some improvement ?
Created attachment 47666 [details]
Further tests
Hi, your test application crashes on my system with and without the patch.
I determined that the result that the query should return is 68698 samples each execution.
So I modified your test application to see how many times that value is returned instead of some other random values (the modded app is attached) and the output is this:
[hades@artemis ~]$ ./query
Correct values: 17
Not correct values: 33
Correct values: 50
Not correct values: 0
Correct values: 50
Not correct values: 0
Correct values: 50
Not correct values: 0
Correct values: 50
Not correct values: 0
Correct values: 50
.....
As you can see at the beginning the probability of wrong queries is high and then it corrects itself out. All the subsequent queries seem to be correct. This happens both with your patch and without it.
I tried to step inside the gallium code (r600_query_result) to see what values are returned.
When the value is correct I get values like:
(gdb) n
1641 query->result += end - start;
2: /x end = 0x80000001824291f0
1: /x start = 0x8000000182424e0c
...
when I get an error I get values for start and end like :
(gdb) n
1641 query->result += end - start;
2: /x end = 0xffffffffffff0000
1: /x start = 0xffff0000ffffffff
(gdb) n
1636 for (i = 0; i < size; i += 4) {
2: /x end = 0xffffffffffff0000
1: /x start = 0xffff0000ffffffff
(gdb)
That are obviously wrong.
The very first r600_query_result execution gives correct results, then from the second r600_query_result execution on the results are incorrect and then , after some runs, the values become correct again.
I don't know how to interpret this. Maybe a drm bug? I use kernel 2.6.39
Created attachment 47669 [details]
Correct test
Ups sorry, I completely botched the test application :D
I placed a "!=" instead of a "==" and so all the results in the previous post are inverted.
The query are correct for a little in the beginning and then they get completely wrong afterwards :
[hades@artemis ~]$ ./query
Correct values: 25
Not correct values: 25
Correct values: 0
Not correct values: 50
Correct values: 0
Not correct values: 50
Correct values: 0
Not correct values: 50
Correct values: 0
Not correct values: 50
Correct values: 0
Not correct values: 50
Correct values: 0
Not correct values: 50
Correct values: 0
...
Sorry for the mess, I'm quite tired in this period :p
Remote debugging is hard :-/ Anyway, your modified app is wrong I think (not only the != / == inversion) : The app is supposed to : - draw a cube to a random position - use query to count how many pixels were drawn (so this number change every time the cube position changes) - then it compares the result of all queries : if not equal it fails Now I have no idea why it crashes on your system, did you try to debug it ? maybe there's an obvious mistake in it ? or maybe you could try to reduce the num_queries param ? The issue you've seen with values obviously wrong in 'r600_query_result' is the problem the patch is supposed to address. Without the patch, the 'size' variable is wrong (too big) and the 'results' buffer is read too far. Last, if you have piglit installed, I found a test case in it probably more robust than my simple app : occlusion_query. This test sometimes fails when using non-patched git mesa like this : $ ./bin/occlusion_query -auto samples passed = 400, expected = 400 samples passed = 0, expected = 0 samples passed = 2600, expected = 2600 samples passed = 2147483647, expected = 0 [<- ERROR] samples passed = 400, expected = 400 [...] PIGLIT: {'result': 'fail' } Yes, you are perfectly right, I looked better to the application and you are right, but considering that you don't set the "random seed" value with srand the random sequence is always the same, so the first execution of the of the "while" cycle should always have the same 68698 samples drawn. So at least in the first cycle my test app should report Correct: 50 Non correct: 0 right? This is an example of a sequence of values reported in the first cycle: Param is: 68698 Param is: 2147483647 Param is: 2147483647 Param is: 2147483647 Param is: 2147483647 Param is: 68698 Param is: 2147483647 Param is: 68698 Param is: 68698 Param is: 2147483647 Param is: 68698 Param is: 68698 Param is: 68698 Param is: 2147483647 Param is: 2147483647 Param is: 2147483647 Param is: 2147483647 Param is: 68698 Param is: 68698 Param is: 68698 Param is: 68698 The patch is applied, of this I'm sure. I tried reducing the value of the "size" variable (as you suggested) to 16 (instead of the value that usually is set by num_results that is 32), and the result is : Param is: 34594 Param is: 2147483647 Param is: 34594 Param is: 34594 Param is: 34594 Param is: 2147483647 Param is: 34594 Param is: 34594 Param is: 34594 Param is: 34594 Param is: 2147483647 Param is: 34594 Param is: 34594 Param is: 2147483647 Param is: 34594 The correct values are almost half of the ones with "size = query->num_results" and the non correct ones are the same ... It seems like the memory gets corrupted or not correctly initialized sometimes. Any idea? I'm using a 5670 (Redwood) and the patch seems to solve the problems here. The test application works with the patch and asserts without it: query error : 2147483647 != 68698 query: query.cpp:104: int main(int, char**): Assertion `false' failed. I have only given Amnesia a quick try, but there are no longer any obvious rendering errors, and the piglit test general/occlusion_query now passes. Great job! \(^ ^)/ What kernel version do you use Sven? Created attachment 47680 [details] [review] 2nd patch Here's a second patch to add. It removes some memory initialization logic which seems to be wrong (thanks agd5f for the help). Marcello could you please test it ? (in addition to the 1st patch) If it's still wrong, could you tell me which value the 'num_backends' and 'ctx->max_db' variables have (both in r600_query_begin for instance) ? (In reply to comment #17) > What kernel version do you use Sven? I'm using 2.6.39. The second patch doesn't solve for ... I don't know if it is a mine specific problem or if it is universal for all 5850 owners. this is the output of GDB for the variables you requested in the begin_query function: r600_query_begin (ctx=0x635918, query=0x7fe690) at r600_hw_context.c:1651 1651 { (gdb) 1653 int num_backends = r600_get_num_backends(ctx->radeon); (gdb) 1657 required_space = 16; (gdb) disp ctx->max_db 3: ctx->max_db = 8 (gdb) disp num_backends 4: num_backends = 8 (gdb) Also, I have two cards (using only one of them on linux), I don't know if this actually may create problems. Could you please update your git repo ? agd5f pushed a patch (r600g: always clear query memory) which could correct the problem (you still need to apply the first one too) IT WORKS! Yay! :D I kind of imagined that the problem was in the initialization of the memory somehow and this afternoon I was looking at that piece of cone in begin_query, but I completely missing the fact that it wasn't executed! Great guys, you are awesome :) I still have that strange "corruption" at the bottom of the screen (depending on the scene) , but this is a very minor problem when compared with the one you just solved. I'll let you know if I find out other major problems with the game :) Fixes pushed: 7c1d4781920e34ad5419b4c5cf9d54ae14938844 bdf2e112856659816d000699fce606adc4ee9926 |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
Created attachment 46492 [details] Screenshot Hi, this game (amnesia dark descent) seems to run on the r600g driver, but with some graphical bugs that make the experience "not enjoyable". There's a big black corruption in the bottom part of the screen that covers 1/5 (almost) of the screen and strange white dots appear in the air. Water also flickers. The most serious glitch is the first one, the black corruption. I've attached a screenshot of this. I also attached the screenshot and the log from the game. The screenshot and the log is taken from the demo version of the game.