Created attachment 110028 [details] 134220 line strace from loading game to freeze to force quit. Europa Universalis 4 stops responding after loading from pressing "Play" in the nation select screen. Not present in Mesa 10.3.2, the problem was introduced in 10.3.3. Related to the radeon driver, as the Intel driver works without issue. OS: Arch Linux GPU: Radeon HD 6950 GPU Driver: xf86-video-ati 7.5.0 CPU: Intel i5-2500K
Can you bisect Mesa?
After a couple of hours of compiling... Commit e8c7affa66407932519fc6d82a449b453343d9fc works fine. Commit d26258166ca056da62536bebdf107e21d9ce92fb introduces the issue.
It looks like the R600 driver cannot handle some loops and hangs (in an infinite loop probably). There are some new fixes for Cayman in the master branch. Can you apply them and see if they help? http://cgit.freedesktop.org/mesa/mesa/log?qt=grep&q=r600
Commit 133280120b4bc714bbb7665e383f36ab262c280a, after the fixes for Cayman in master, still has the issue. Or did you mean for me to cherry-pick specific fixes?
(In reply to glwhieuhghfbjds from comment #4) > Commit 133280120b4bc714bbb7665e383f36ab262c280a, after the fixes for Cayman > in master, still has the issue. > > Or did you mean of me to cherry-pick specific fixes? No. It looks like Cayman still has some bugs.
Actually, I'm having this issue as well with an HD5850, so it's probably not Cayman-specific. OS: Arch Linux GPU: Radeon HD5850 Mesa: 10.3.5 and 10.5git (last tested about a week ago) CPU: Phenom 2 X3 720
Same issue with HD5730 and mesa git: Mesa 10.5.0-devel (git-8d2542f). Tried turning off every available graphics option in the game and using "notiling" (which helped with a similar GPU lockup in Tropico 5) but nothing.
Created attachment 111977 [details] Part of the log during the lockup Messages in the attached part of the log are repeating for a while, with monitors going off and on. Eventually the system is usable for a while and then just locks up for good, requiring a hard reset.
Having the same issue with HD5650. When using mesa 10.2.8 there was no problem, but when I upgraded to mesa 10.4.2 the entire system froze after loading a game, either new or saved. OS: Gentoo amd64 GPU: Radeon HD5650 GPU driver: xf86-video-ati 7.5.0 Mesa: 10.4.2 CPU: intel i3 M370
I can confirm that the issue was introduced (on master) with this commit:

commit 6fcb5520b78cdf1e5013c125501932315a069955
Author: Marek Olšák <marek.olsak@amd.com>
Date: Tue Oct 28 19:49:44 2014 +0100

    Revert "st/mesa: set MaxUnrollIterations = 255"

    This reverts commit 20836c81851e0df29a8ee9c86e5e5388738c840b.

    255 is a huge number. If you have a loop with 255 iterations, unrolling it will exceed the SM3 instruction limit. Let's use the default again.

    The comment about a SM3 limit doesn't make sense.

    For SM3, we generally want 32 (default) or a lower number due to the SM3 instruction limit, which is 512 instructions.
    For SM4, we can try higher numbers if needed, but some shaders can end up being pretty huge and shader compilation can take more time.

    This fixes a shader compile failure on R500/SM3. Reported on IRC.

    Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>

Marek: You mentioned that the r600 driver might be unable to handle certain loops. Is there anything the community can do to get this fixed? apitrace? Checking for piglit regressions related to the mentioned commit? I assume that you would be able to fix this much better if you could reproduce the problem easily.
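The instruction-limit argument in the quoted commit message comes down to simple arithmetic. The following is a minimal sketch in C, assuming every loop iteration costs the same number of instructions. The helper names `unrolled_size` and `fits_sm3` are hypothetical and are not Mesa functions. Only the 512-instruction limit and the iteration counts 255 and 32 come from the commit message.

```c
#include <stdbool.h>

/* SM3 instruction limit quoted in the commit message. */
#define SM3_INSTRUCTION_LIMIT 512u

/* Hypothetical helper: rough instruction count after fully unrolling a loop,
 * assuming each iteration costs body_instructions instructions. */
static unsigned unrolled_size(unsigned iterations, unsigned body_instructions)
{
    return iterations * body_instructions;
}

/* Hypothetical helper: does the unrolled loop still fit under the SM3 limit? */
static bool fits_sm3(unsigned iterations, unsigned body_instructions)
{
    return unrolled_size(iterations, body_instructions) <= SM3_INSTRUCTION_LIMIT;
}
```

Under these assumptions, a 3-instruction loop body unrolled 255 times needs 765 instructions and no longer fits, while the same body unrolled 32 times needs only 96.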
I'd add a PIPE_CAP and allow the driver to set MaxUnrollIterations with it. For r600g, the old value should be used. For other drivers, the new value should be used.
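The proposal amounts to letting each driver report its own unroll limit instead of hard-coding one value for all of them. Here is a rough sketch of that idea in C. It is not the Mesa interface: the function `max_unroll_iterations` and the string-based driver lookup are hypothetical stand-ins for a driver capability query. The values 255 (old) and 32 (default) are the ones mentioned in this thread.

```c
#include <string.h>

/* Values taken from the thread: 255 is the old setting, 32 the default. */
enum { UNROLL_OLD_VALUE = 255, UNROLL_DEFAULT = 32 };

/* Hypothetical stand-in for a per-driver capability query:
 * r600 keeps the old limit, every other driver gets the default. */
static int max_unroll_iterations(const char *driver_name)
{
    if (strcmp(driver_name, "r600") == 0)
        return UNROLL_OLD_VALUE;
    return UNROLL_DEFAULT;
}
```

With this shape, the state tracker would ask the driver for the limit once and pass it on to the GLSL compiler options, so only r600 is affected by the workaround.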
(In reply to Marek Olšák from comment #11) > I'd add a PIPE_CAP and allow the driver to set MaxUnrollIterations with it. "I'd add" as in "I plan to add" or as in "someone else should fix that by ..."?
For what it is worth, a temporary workaround would be setting R600_DEBUG=nosb. I saw that in a comment about bug #88263 and it turns out to work.
Created attachment 113246 [details] [review] patch Hi guys, could you please test this patch?
I applied your patch on top of mesa 7ea1e3749738c63388d3bcca327e4e4dd28f17b8 with llvm 3.5 and linux 3.18.5 and I can confirm that this patch fixes the original problem as expected. One thing that bothers me personally is that no piglit test failed when the initial change was committed. Marek, do you think it would be a good idea as a newbie project to come up with a minimal API trace and try to build a piglit test that prevents this kind of problem, or is that likely to be a fruitless effort?
Having a piglit test reproducing the issue would be very useful.
I can confirm the patch fixes the issue here as well
I created an API trace which reproduces the problem for me (https://dl.dropboxusercontent.com/s/hc6v7gdcshj4ljd/eu4.trace-lockup.xz?dl=0 , 120 MB; unfortunately, trimming produces only invalid states). The shaders themselves don't look too bad to me, but I don't have any experience deducing a minimal test case from an apitrace (pointers welcome).
(removing myself from the Cc list, already getting emails from dri-devel)
Hey everyone, I am a bit new to working with Mesa and haven't needed to patch anything in it before. However, I am experiencing the GPU lockup described in the OP's post (hitting Play at Nation Select causes the game to attempt to load, but it crashes when displaying the units and requires a system reboot). I'd like to apply Marek Olšák's patch file, but I am unfamiliar with the commands I need to use to apply it on Fedora 21. Any ideas? OS: Fedora 21 x86_64 [Gnome3.14.2] GPU: Radeon HD 6520G GPU Driver: Gallium 0.4 on AMD SUMO CPU: AMD A6-3420M APU with Radeon HD Graphics x4
@Joe Glaser As said above, an easy workaround is to disable the r600 shader optimizer. Just launch Steam like this: $ R600_DEBUG=nosb steam But performance will not be good, and you already have a weak GPU. If it's too slow, then you can try to patch and build Mesa. You can follow this tutorial: http://forums.fedora-fr.org/viewtopic.php?pid=532589 If you don't read French, you can translate the page with Google Chrome. The translation is still... understandable.
Has the patch been committed? I have just tried with Mesa 10.5.1 and the freeze is still there. Thanks!
(In reply to Médéric Boquien from comment #22) > Has the patch been committed? I have just tried with Mesa 10.5.1 and the > freeze is still there. Thanks! No, the patch hasn't been committed. Some developers said the patch is not a proper fix. See this for more information: http://lists.freedesktop.org/archives/mesa-dev/2015-February/076633.html
Still present in 10.6.1 and game is still unplayable on Radeon
For Fedora I have a COPR with Fedora's mesa package plus the workaround proposed by Marek on top. Other than that, I think what is really needed as a first step is to extract a minimal (piglit) test case.
I tried to run it with the nosb parameter as mentioned above, "R600_DEBUG=nosb force_s3tc_enable=true /usr/bin/steam %U", and the game works, but the textures look very bad.
(In reply to noga.dany from comment #26) > I tried run it with nosb parameters like mentioned above "R600_DEBUG=nosb > force_s3tc_enable=true /usr/bin/steam %U" and game works, but textures looks > very bad. Remove the "force_s3tc_enable=true" in your command. And be sure to have the s3tc lib installed (libtxc_dxtn). You can check that with this command: $ glxinfo |grep s3tc
(In reply to Benjamin Bellec from comment #27) > (In reply to noga.dany from comment #26) > > I tried run it with nosb parameters like mentioned above "R600_DEBUG=nosb > > force_s3tc_enable=true /usr/bin/steam %U" and game works, but textures looks > > very bad. > > Remove the "force_s3tc_enable=true" in your command. > > And be sure to have the s3tc lib installed (libtxc_dxtn). You can check that > with this command: > $ glxinfo |grep s3tc OK, I have installed libtxc_dxtn (64-bit and 32-bit) and tried it with "R600_DEBUG=nosb /usr/bin/steam %U". Unfortunately it doesn't look good either. Screenshot attached. Mesa 10.6.4 Kernel 4.1.1 AMD HD 6870 So with nosb it works but looks bad, and without nosb it looks good but the system freezes.
Created attachment 117697 [details] screenshot with R600_DEBUG=nosb on Mesa 10.6.4 and AMD HD 6870
(In reply to noga.dany from comment #29) > Created attachment 117697 [details] > screenshot with R600_DEBUG=nosb on Mesa 10.6.4 and AMD HD 6870 I think the graphics glitches that appear with nosb can be fixed if you disable some of the effects. I can't remember which at the moment. I was able to run with no apparent issues on my HD 6850 after turning on nosb.
Tested on 11.0.2 and it still freezes.
Marek pushed the fix. So it's likely to be fixed in Mesa 11.0.4 which should be released in less than a week.
I have found a possibly related bug / regression on current Mesa git: https://bugs.freedesktop.org/show_bug.cgi?id=93706
Created attachment 125183 [details] EU4 shader #175 in TGSI, unoptimized disassembly, sbdump of all stages and optimized disassembly

While the committed workaround does work for this case, the bug in the R600 shader backend is not fixed, and it is triggered by other, more complicated shaders. For example: https://bugs.freedesktop.org/show_bug.cgi?id=94900

I had locally reverted the unroll workaround in order to obtain the form that triggers this bug. If you need to test the bug with this shader, then in `r600_pipe.c:559` you have to set `PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT` to 32, instead of 255.

The buggy shader is the vertex shader of call 1024042 in the trace. When using `R600_DEBUG=ps,vs`, the shader is listed as #175. As in the other bug report, this shader causes an assertion failure in the sb_checker (if Mesa is compiled with debugging), and the bug can also be worked around with `R600_DEBUG=sbsafemath`. This works because it disables the call to `fold_assoc()` in `expr_handler::fold_alu_op2()`, somewhere around `sb_expr.cpp:740`.

In order to locate the bug, I've enabled the sbdump for all SB stages. I'm also uploading a second log, with `fold_assoc()` disabled, so a side-by-side comparison of both logs could indicate how the function affects the result through the different stages. (I recommend the `diffuse` program.)

This shader is easier to analyze because it contains just one loop with 4 iterations and no other conditional branches or jumps. The loop counter register is used as an index for memory access. The memory address calculations might be involved in triggering the bug, as `fold_assoc()` works on them. The `sb_checker` complains about instructions that list the counter register, so it is possible that the instruction that increments it is somehow "optimized" out.
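For readers unfamiliar with the term, folding associative operations generally means regrouping a chain such as (x + a) + b into x + (a + b) so that the constants can be combined ahead of time. The sketch below illustrates the general technique in C and is not the code in `sb_expr.cpp`. The function names are hypothetical. It shows one commonly cited reason such a rewrite sits behind a "safe math" switch: for floating-point values the regrouped form can give a different result.

```c
/* Evaluate (x + a) + b in the order it is written. */
static float add_as_written(float x, float a, float b)
{
    volatile float partial = x + a;   /* volatile keeps the intermediate a float */
    return partial + b;
}

/* Evaluate x + (a + b): the two trailing operands are combined first,
 * which is what an associative fold would do with two constants. */
static float add_folded(float x, float a, float b)
{
    volatile float folded = a + b;
    return x + folded;
}
```

With x = 1e20f, a = -1e20f and b = 1.0f, the first form yields 1.0f, while the folded form yields 0.0f because the 1.0f is lost when it is added to -1e20f. For small values such as 1.0f, 2.0f and 3.0f both forms agree.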
Created attachment 125184 [details] EU4 shader #175 in TGSI, unoptimized disassembly, sbdump of all stages with fold_assoc() disabled, and optimized disassembly

This second log is with `fold_assoc()` disabled, so a side-by-side comparison of both logs could indicate how the function affects the result through the different stages. (I recommend the `diffuse` program.)

At first glance, `fold_assoc()` seems to affect the code after the sb "gvn" stage. From what I can see, the results there seem OK-ish. The first drastic changes happen after the "gcm" stage.
Created attachment 125290 [details] EU4 shader #175 sbdump of GCM_DEBUG

I think I have located the cause of the hang. Here are two highly reduced excerpts from eu4.hang*.log

----------------------------------------------------------------------
###### after ra_split
###### after def_use
(copy) MOV t84, 0|00000000
region #0 live_before: [...] {
  * phi t65, t84, t98
}
repeat region #0 after {
  (copy) MOV R21.x.5, t65
  PRED_SETGE_INT __, __, EM.2, R21.x.5, 5.60519e-45|00000004
  ADD_INT R21.x.6, R21.x.5, 1.4013e-45|00000001
  (copy) MOV t98, R21.x.6
}
end_repeat

###### after gcm
(copy) MOV R21.x.5, t65
(copy) MOV t84, 0|00000000
region #0 live_before: [...] {
  * phi t65, t84, t98
}
repeat region #0 after {
  PRED_SETGE_INT __, __, EM.2, R21.x.5, 5.60519e-45|00000004
  ADD_INT R21.x.6, R21.x.5, 1.4013e-45|00000001
  (copy) MOV t98, R21.x.6
}
end_repeat
----------------------------------------------------------------------

In C, the above looks like:

-----------------------------------
//after def_use
t84 = 0;
t65 = t84;
goto loop_body;
loop_repeat:
t65 = t98;
loop_body:
r21x5 = t65; //<----
if( r21x5 >= 4 ) break;
r21x6 = r21x5 + 1;
t98 = r21x6;
goto loop_repeat;
-----------------------------------
//after gcm
r21x5 = t65; //<----
t84 = 0;
t65 = t84;
goto loop_body;
loop_repeat:
t65 = t98;
loop_body:
if( r21x5 >= 4 ) break;
r21x6 = r21x5 + 1;
t98 = r21x6;
goto loop_repeat;
-----------------------------------

The copy that updates the count variable (R21.x.5) is moved outside the loop, so the value being compared is never incremented and never reaches its final value. There are comments about something similar in sb_gvn.cpp:57 and sb_gcm:583 . In the nested loop, the inner loop counter initialization is moved outside the outer loop.

I've enabled GCM_DEBUG and I'm attaching the output it has generated. (The rest of the output is identical to eu4.hang.*.log)

As for the question: why does "sb safe math" work around the problem?
The "fold_assoc" replaces a register with a constant value inside the *phi operator, but *phi is not a real instruction. Thus, in later passes the collapsed instructions have to be recreated, this time using a temporary variable (t65). I suspect that t65 is mistaken for a constant, which allows the copy to be moved outside the loop...
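The effect of the hoisted copy can be reproduced in ordinary C. The following is a minimal sketch and not shader backend code: the variable names follow the dump above, the comparison is written as `>=` to mirror PRED_SETGE_INT, and `MAX_STEPS` is an added safety cap (an assumption of this sketch) so that the broken variant terminates instead of hanging. In the hoisted variant, `t65` is initialized before the copy to avoid reading an uninitialized variable.

```c
/* Safety cap so the broken variant terminates; not part of the shader. */
#define MAX_STEPS 1000

/* Loop shape after def_use: the counter copy sits inside the loop body.
 * Returns the number of completed iterations. */
static int loop_copy_inside(void)
{
    int t65 = 0, t98 = 0, steps = 0;
    for (;;) {
        int r21x5 = t65;              /* copy is re-read on every iteration */
        if (r21x5 >= 4 || steps >= MAX_STEPS)
            break;
        t98 = r21x5 + 1;
        t65 = t98;                    /* phi: next iteration sees the increment */
        steps++;
    }
    return steps;
}

/* Loop shape after gcm: the copy was hoisted, so the compared value
 * never changes and only the safety cap ends the loop. */
static int loop_copy_hoisted(void)
{
    int t65 = 0, t98 = 0, steps = 0;
    int r21x5 = t65;                  /* hoisted copy, read only once */
    for (;;) {
        if (r21x5 >= 4 || steps >= MAX_STEPS)
            break;
        t98 = r21x5 + 1;
        t65 = t98;                    /* updated, but never copied back */
        steps++;
    }
    (void)t65;
    return steps;
}
```

The first variant finishes after 4 iterations, which matches the loop count mentioned earlier in the thread. The second only stops because of the cap, which corresponds to the GPU hang.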
How about we make R600_DEBUG=sbsafemath be the default?
Does the patch (fix to sb itself) from https://bugs.freedesktop.org/show_bug.cgi?id=94900 help this issue as well?
Yes, this looks like it would very likely have been fixed by:

commit e933246013eef376804662f3fcf4646c143c6c88
Author: Heiko Przybyl <lil_tux@web.de>
Date: Sun Nov 20 14:42:28 2016 +0100

    r600/sb: Fix loop optimization related hangs on eg

Let's close it for now. Feel free to reopen if this is not the case.