86720 – [radeon] Europa Universalis 4 freezing during game start (10.3.3+, still broken on 11.0.2)


Status:	RESOLVED FIXED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/Gallium/r600
Version:	11.0
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium critical
Assignee:	i.kalvachev
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	77449

Reported:	2014-11-26 02:37 UTC by glwhieuhghfbjds
Modified:	2019-07-04 01:00 UTC
CC List:	14 users

See Also:	93706

Attachments
134220 line strace from loading game to freeze to force quit. (2.85 MB, text/plain) 2014-11-26 02:37 UTC, glwhieuhghfbjds
Part of the log during the lockup (2.92 KB, text/plain) 2015-01-08 22:00 UTC, Artem Hluvchynskyi
patch (11.70 KB, patch) 2015-02-07 12:52 UTC, Marek Olšák
screenshot with R600_DEBUG=nosb on Mesa 10.6.4 and AMD HD 6870 (957.52 KB, image/png) 2015-08-14 23:47 UTC, noga.dany
EU4 shader #175 in TGSI, unoptimized disassembly, sbdump of all stages and optimized disassembly (564.57 KB, text/plain) 2016-07-20 20:24 UTC, i.kalvachev
EU4 shader #175 in TGSI, unoptimized disassembly, sbdump of all stages with fold_assoc() disabled, and optimized disassembly (567.58 KB, text/plain) 2016-07-20 20:42 UTC, i.kalvachev
EU4 shader #175 sbdump of GCM_DEBUG (302.04 KB, text/plain) 2016-07-24 15:19 UTC, i.kalvachev

Description glwhieuhghfbjds 2014-11-26 02:37:11 UTC

Created attachment 110028 [details]
134220 line strace from loading game to freeze to force quit.

Europa Universalis 4 stops responding after loading from pressing "Play" in the nation select screen.

Not present in Mesa 10.3.2, the problem was introduced in 10.3.3.
Related to the radeon driver, as the Intel driver works without issue.

OS: Arch Linux
GPU: Radeon HD 6950
GPU Driver: xf86-video-ati 7.5.0
CPU: Intel i5-2500K

Comment 1 Michel Dänzer 2014-11-26 03:26:30 UTC

Can you bisect Mesa?

Comment 2 glwhieuhghfbjds 2014-11-26 18:00:52 UTC

After a couple of hours of compiling...

Commit e8c7affa66407932519fc6d82a449b453343d9fc works fine.
Commit d26258166ca056da62536bebdf107e21d9ce92fb introduces the issue.

Comment 3 Marek Olšák 2014-11-27 11:14:59 UTC

It looks like the R600 driver cannot handle some loops and hangs (in an infinite loop probably).

There are some new fixes for Cayman in the master branch. Can you apply them and see if they help?

http://cgit.freedesktop.org/mesa/mesa/log?qt=grep&q=r600

Comment 4 glwhieuhghfbjds 2014-11-28 22:27:29 UTC

Commit 133280120b4bc714bbb7665e383f36ab262c280a, after the fixes for Cayman in master, still has the issue.

Or did you mean for me to cherry-pick specific fixes?

Comment 5 Marek Olšák 2014-12-06 17:24:36 UTC

(In reply to glwhieuhghfbjds from comment #4)
> Commit 133280120b4bc714bbb7665e383f36ab262c280a, after the fixes for Cayman
> in master, still has the issue.
> 
> Or did you mean for me to cherry-pick specific fixes?

No. It looks like Cayman still has some bugs.

Comment 6 Joti Papadopoulos 2014-12-14 19:29:53 UTC

Actually, I'm having this issue as well with an HD5850, so it's probably not Cayman-specific.

OS: Arch Linux
GPU: Radeon HD5850
Mesa: 10.3.5 and 10.5git (last tested about a week ago)
CPU: Phenom 2 X3 720

Comment 7 Artem Hluvchynskyi 2015-01-08 21:57:29 UTC

Same issue with HD5730 and mesa git: Mesa 10.5.0-devel (git-8d2542f).
Tried turning off every available graphics option in the game and using "notiling" (which helped with a similar GPU lockup in Tropico 5), but nothing helped.

Comment 8 Artem Hluvchynskyi 2015-01-08 22:00:49 UTC

Created attachment 111977 [details]
Part of the log during the lockup

Messages in the attached part of the log are repeating for a while, with monitors going off and on. Eventually the system is usable for a while and then just locks up for good, requiring a hard reset.

Comment 9 João Grego 2015-01-14 17:46:58 UTC

Having the same issue with HD5650. When using mesa 10.2.8 there was no problem, but when I upgraded to mesa 10.4.2 the entire system froze after loading a game, either new or saved.

OS: Gentoo amd64
GPU: Radeon HD5650
GPU driver: xf86-video-ati 7.5.0
Mesa: 10.4.2
CPU: Intel i3 M370

Comment 10 Felix Schwarz 2015-01-15 20:00:19 UTC

I can confirm that the issue was introduced (on master) with this commit:

commit 6fcb5520b78cdf1e5013c125501932315a069955
Author: Marek Olšák <marek.olsak@amd.com>
Date:   Tue Oct 28 19:49:44 2014 +0100

    Revert "st/mesa: set MaxUnrollIterations = 255"
    
    This reverts commit 20836c81851e0df29a8ee9c86e5e5388738c840b.
    
    255 is a huge number. If you have a loop with 255 iterations, unrolling it
    will exceed the SM3 instruction limit. Let's use the default again.
    
    The comment about a SM3 limit doesn't make sense. For SM3, we generally
    want 32 (default) or a lower number due to the SM3 instruction limit, which
    is 512 instructions. For SM4, we can try higher numbers if needed, but
    some shaders can end up being pretty huge and shader compilation can take
    more time.
    
    This fixes a shader compile failure on R500/SM3. Reported on IRC.
    
    Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>


Marek: You mentioned that the r600 driver might be unable to handle certain loops. Is there anything the community can do to get this fixed? apitrace? Checking for piglit regressions related to the mentioned commit? I assume that you would be able to fix this much better if you could reproduce the problem easily.

Comment 11 Marek Olšák 2015-01-16 05:25:17 UTC

I'd add a PIPE_CAP and allow the driver to set MaxUnrollIterations with it. For r600g, the old value should be used. For other drivers, the new value should be used.
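
As a rough illustration of this proposal (a sketch only, not the patch that was later attached): every name below is hypothetical and none of it is real Mesa code. It assumes two values taken from this thread, 255 (the limit before the revert quoted in comment 10) and 32 (the default after it), and shows a per-driver query feeding the state tracker's MaxUnrollIterations.

```c
/* Hypothetical stand-ins for illustration; these are NOT Mesa definitions. */
enum sketch_driver { SKETCH_DRIVER_R600G, SKETCH_DRIVER_OTHER };

#define SKETCH_UNROLL_OLD     255 /* value before the revert in comment 10 */
#define SKETCH_UNROLL_DEFAULT 32  /* default named in the revert message */

/* Driver side: answer a hypothetical "max unroll iterations" cap query. */
static int sketch_max_unroll_iterations(enum sketch_driver drv)
{
    switch (drv) {
    case SKETCH_DRIVER_R600G:
        return SKETCH_UNROLL_OLD;     /* r600g keeps the old value */
    default:
        return SKETCH_UNROLL_DEFAULT; /* everyone else uses the new value */
    }
}

/* State tracker side: copy the driver's answer into the compiler options. */
struct sketch_compiler_options {
    int MaxUnrollIterations;
};

static void sketch_init_options(struct sketch_compiler_options *opts,
                                enum sketch_driver drv)
{
    opts->MaxUnrollIterations = sketch_max_unroll_iterations(drv);
}
```

With this shape, r600g keeps 255 while every other driver gets the default of 32, which is the split described above.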

Comment 12 Felix Schwarz 2015-01-16 14:56:37 UTC

(In reply to Marek Olšák from comment #11)
> I'd add a PIPE_CAP and allow the driver to set MaxUnrollIterations with it.

"I'd add" as in "I plan to add" or as in "someone else should fix that by ..."?

Comment 13 Médéric Boquien 2015-02-07 12:22:13 UTC

For what it is worth, a temporary workaround would be setting R600_DEBUG=nosb. I saw that in a comment about bug #88263 and it turns out to work.

Comment 14 Marek Olšák 2015-02-07 12:52:08 UTC

Created attachment 113246 [details] [review]
patch

Hi guys, could you please test this patch?

Comment 15 Felix Schwarz 2015-02-07 18:31:57 UTC

I applied your patch on top of mesa 7ea1e3749738c63388d3bcca327e4e4dd28f17b8 with llvm 3.5 and linux 3.18.5 and I can confirm that this patch fixes the original problem as expected.

One thing that bothers me personally is that no piglit test went bad when the initial change was committed. Marek, do you think it would be a good idea, as a newbie project, to come up with a minimal API trace and try to build a piglit test preventing that kind of problem, or is that likely to be a fruitless effort?

Comment 16 Marek Olšák 2015-02-08 12:04:01 UTC

Having a piglit test reproducing the issue would be very useful.

Comment 17 Joti Papadopoulos 2015-02-09 07:48:39 UTC

I can confirm the patch fixes the issue here as well

Comment 18 Felix Schwarz 2015-02-20 20:16:32 UTC

I created an API trace which reproduces the problem for me (https://dl.dropboxusercontent.com/s/hc6v7gdcshj4ljd/eu4.trace-lockup.xz?dl=0 , 120 MB; unfortunately trimming produces only invalid states). The shaders themselves don't look too bad to me, but I don't have any experience deducing a minimal test case from an apitrace (pointers welcome).

Comment 19 Marek Olšák 2015-02-20 21:00:06 UTC

(removing myself from the Cc list, already getting emails from dri-devel)

Comment 20 Joe Glaser 2015-02-21 22:36:44 UTC

Hey everyone,

I am a bit new to working with Mesa and haven't needed to patch anything in it before. However, I am experiencing the GPU overload mentioned in the original post (hitting Play at nation select causes the game to attempt to load, but it crashes when displaying the units and requires a system reboot). I'd like to apply Marek Olšák's patch file, but I am unfamiliar with the commands I need to use to apply it on Fedora 21.

Any ideas?

OS: Fedora 21 x86_64 [Gnome3.14.2]
GPU: Radeon HD 6520G
GPU Driver: Gallium 0.4 on AMD SUMO
CPU: AMD A6-3420M APU with Radeon HD Graphics x4

Comment 21 Benjamin Bellec 2015-02-21 23:45:25 UTC

@Joe Glaser
As said above, an easy workaround is to disable the r600 shader optimizer. Just launch Steam like this:
$ R600_DEBUG=nosb steam

But performance will not be good, and you already have a weak GPU. If it's too slow, then you can try to patch and build Mesa. You can follow this tutorial:
http://forums.fedora-fr.org/viewtopic.php?pid=532589
If you don't read French, you can translate the page with Google Chrome. The translation is still... understandable.

Comment 22 Médéric Boquien 2015-03-17 08:30:10 UTC

Has the patch been committed? I have just tried with Mesa 10.5.1 and the freeze is still there. Thanks!

Comment 23 Benjamin Bellec 2015-03-17 12:21:50 UTC

(In reply to Médéric Boquien from comment #22)
> Has the patch been committed? I have just tried with Mesa 10.5.1 and the
> freeze is still there. Thanks!

No, the patch hasn't been committed. A developer said the patch is not a proper fix. See this for more information: http://lists.freedesktop.org/archives/mesa-dev/2015-February/076633.html

Comment 24 noga.dany 2015-07-06 18:40:47 UTC

Still present in 10.6.1, and the game is still unplayable on Radeon.

Comment 25 Felix Schwarz 2015-07-06 19:37:39 UTC

For Fedora I have a COPR with Fedora's mesa package plus the workaround proposed by Marek on top. Other than that, I think what is really needed as a first step is to extract a minimal (piglit) test case.

Comment 26 noga.dany 2015-08-14 20:47:24 UTC

I tried running it with the nosb parameter as mentioned above ("R600_DEBUG=nosb force_s3tc_enable=true /usr/bin/steam %U"). The game works, but the textures look very bad.

Comment 27 Benjamin Bellec 2015-08-14 22:32:00 UTC

(In reply to noga.dany from comment #26)
> I tried run it with nosb parameters like mentioned above "R600_DEBUG=nosb
> force_s3tc_enable=true /usr/bin/steam %U" and game works, but textures looks
> very bad.

Remove the "force_s3tc_enable=true" in your command.

And be sure to have the s3tc lib installed (libtxc_dxtn). You can check that with this command:
$ glxinfo |grep s3tc

Comment 28 noga.dany 2015-08-14 23:44:54 UTC

(In reply to Benjamin Bellec from comment #27)
> (In reply to noga.dany from comment #26)
> > I tried run it with nosb parameters like mentioned above "R600_DEBUG=nosb
> > force_s3tc_enable=true /usr/bin/steam %U" and game works, but textures looks
> > very bad.
> 
> Remove the "force_s3tc_enable=true" in your command.
> 
> And be sure to have the s3tc lib installed (libtxc_dxtn). You can check that
> with this command:
> $ glxinfo |grep s3tc

Ok, I have installed libtxc_dxtn 64bit and 32bit and tried it with "R600_DEBUG=nosb /usr/bin/steam %U". Unfortunately it doesn't look good either. Screenshot attached.
Mesa 10.6.4
Kernel 4.1.1
AMD HD 6870

So with nosb it works but looks bad, and without nosb it looks good but the system freezes.

Comment 29 noga.dany 2015-08-14 23:47:28 UTC

Created attachment 117697 [details]
screenshot with R600_DEBUG=nosb on Mesa 10.6.4 and AMD HD 6870

Comment 30 a.t.martens 2015-09-29 03:12:49 UTC

(In reply to noga.dany from comment #29)
> Created attachment 117697 [details]
> screenshot with R600_DEBUG=nosb on Mesa 10.6.4 and AMD HD 6870

I think the graphics glitches that appear with nosb can be fixed if you disable some of the effects. I can't remember which at the moment. I was able to run with no apparent issues on my HD 6850 after turning on nosb.

Comment 31 noga.dany 2015-10-13 06:40:19 UTC

Tested on 11.0.2 and it still freezes.

Comment 32 Benjamin Bellec 2015-10-20 21:00:57 UTC

Marek pushed the fix.
So it's likely to be fixed in Mesa 11.0.4 which should be released in less than a week.

Comment 33 Lukáš Krejza 2016-01-14 11:24:25 UTC

I have found a possibly related bug / regression on current Mesa git: https://bugs.freedesktop.org/show_bug.cgi?id=93706

Comment 34 i.kalvachev 2016-07-20 20:24:03 UTC

Created attachment 125183 [details]
EU4 shader #175 in TGSI, unoptimized disassembly, sbdump of all stages and optimized disassembly

While the committed workaround does work for this case, the bug in R600 Shader Backend is not fixed and it is triggered by other more complicated shaders. For example:
https://bugs.freedesktop.org/show_bug.cgi?id=94900

I had locally reverted the unroll workaround in order to obtain the form that triggers this bug. If you need to test the bug with this shader, then in `r600_pipe.c:559` you have to set `PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT` to 32, instead of 255.

The buggy shader is the vertex shader of call 1024042 in the trace.
When using `R600_DEBUG=ps,vs`, the shader is listed as #175.


As in the other bug report, this shader causes an assertion failure in the sb_checker (if Mesa is compiled with debugging), and the bug can also be worked around with `R600_DEBUG=sbsafemath`.

This works because it disables the call to `fold_assoc()` in `expr_handler::fold_alu_op2()` somewhere around `sb_expr.cpp:740`.

In order to locate the bug, I've enabled the sbdump for all SB stages.

I'm also uploading a second log, with the "fold_assoc()" disabled, so a side-by-side comparison of both logs could indicate how the function affects the result through the different stages. (I recommend `diffuse` program.)

This shader is easier to analyze, because it contains just one loop with 4 iterations and no other conditional branches and jumps. The loop counter register is used as index for memory access. The memory address calculations might be involved in triggering the bug as `fold_assoc()` works on them.
The `sb_checker` complains about instructions that list the counter register, so it is possible that the instruction that increments it is somehow "optimized" out.
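
To make the role of `fold_assoc()` and the name `sbsafemath` a little more concrete, here is a small C sketch. It assumes, without this thread showing the actual implementation, that `fold_assoc()` reassociates a chain of an associative operation so that constants can be combined, e.g. `(x + c1) + c2` into `x + (c1 + c2)`. The function names are hypothetical. For integers such a rewrite is exact; for floats it can change the result, which is presumably why disabling it is filed under "safe math".

```c
/* Unfolded form: two dependent additions, evaluated left to right.
 * volatile forces each intermediate to be rounded to float. */
static float sketch_unfolded(float x, float c1, float c2)
{
    volatile float t = x + c1;
    volatile float r = t + c2;
    return r;
}

/* Folded form (assumed effect of the optimization): the two constants are
 * combined first, leaving a single addition that involves x. */
static float sketch_folded(float x, float c1, float c2)
{
    volatile float c = c1 + c2;
    volatile float r = x + c;
    return r;
}
```

With small values both forms agree, but for `x = 1.0f, c1 = 1e20f, c2 = -1e20f` the unfolded form loses `x` in the first addition while the folded form keeps it. The hang discussed here is a different consequence of the same pass: the folded expressions feed the later gvn/gcm stages, as the attached dumps show.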

Comment 35 i.kalvachev 2016-07-20 20:42:45 UTC

Created attachment 125184 [details]
EU4 shader #175 in TGSI , unoptmized disassembly, sbdump of all stages with fold_assoc() disabled, and optimized disassembly

This second log is with the "fold_assoc()" disabled, so a side-by-side comparison of both logs could indicate how the function affects the result through the different stages. (I recommend `diffuse` program.)

At first glance, fold_assoc() seems to affect the code after the sb "gvn" stage. From what I see, the results there seem ok-ish. The first drastic changes happen after the "gcm" stage.

Comment 36 i.kalvachev 2016-07-24 15:19:14 UTC

Created attachment 125290 [details]
EU4 shader #175 sbdump of GCM_DEBUG

I think I have located the cause of the hang.
Here are two highly reduced excerpts from eu4.hang*.log
----------------------------------------------------------------------
###### after ra_split
###### after def_use
    (copy) MOV     t84,    0|00000000
    region #0   live_before: [...]
        {
            * phi     t65,       t84, t98
        }
        repeat region #0 after {
            (copy) MOV     R21.x.5,    t65
            PRED_SETGE_INT     __, __, EM.2,    R21.x.5, 5.60519e-45|00000004
            ADD_INT     R21.x.6,    R21.x.5, 1.4013e-45|00000001
            (copy) MOV     t98,    R21.x.6
        } end_repeat   

###### after gcm
    (copy) MOV     R21.x.5,    t65
    (copy) MOV     t84,    0|00000000
    region #0   live_before: [...]
	{
            * phi     t65,       t84, t98
	}
        repeat region #0 after {
            PRED_SETGE_INT     __, __, EM.2,    R21.x.5, 5.60519e-45|00000004
                    ADD_INT     R21.x.6,    R21.x.5, 1.4013e-45|00000001
                    (copy) MOV     t98,    R21.x.6
        } end_repeat   
----------------------------------------------------------------------

in C the above looks like:
-----------------------------------
//after def_use
	t84 = 0;
	t65 = t84; goto loop_body;
loop_repeat:
        t65 = t98;
loop_body:
        r21x5 = t65; //<----
        if( r21x5 >= 4 ) break;
        r21x6 = r21x5 + 1;
        t98   = r21x6;
        goto loop_repeat;
-----------------------------------
//after gcm
        r21x5 = t65; //<----
	t84 = 0;
	t65 = t84; goto loop_body;
loop_repeat:
        t65 = t98;
loop_body:
        if( r21x5 >= 4 ) break;
        r21x6 = r21x5 + 1;
        t98   = r21x6;
        goto loop_repeat;
-----------------------------------
The copy that updates the count variable (R21.x.5) is moved outside the loop,
thus it is never incremented and never reaches its final value.

There are comments about something similar in sb_gvn.cpp:57 and sb_gcm:583.
In the nested loop the inner loop counter initialization is moved outside the outer loop.

I've enabled GCM_DEBUG and I'm attaching the output it generated.
(The rest of the output is identical to eu4.hang.*.log)


As for the question of why "sb safe math" works around the problem:
The "fold_assoc" replaces a register with a constant value inside the *phi operator, but *phi is not a real instruction. Thus in later passes the collapsed instructions have to be recreated, this time using a temporary variable (t65).
I suspect that t65 is mistaken for a constant, which allows it to be moved outside the loop...
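
The effect of the misplaced copy can be reproduced in plain C. The sketch below mirrors the two pseudo-C listings above, with hypothetical function names and an iteration cap (`limit`) added so that the broken variant returns instead of spinning forever. It uses `>=` for the exit test to follow `PRED_SETGE_INT` in the dump. One simplification: in the dump the hoisted copy even precedes the initialization of t65, whereas the sketch reads the initial value 0 to stay well defined.

```c
/* Returns how many loop bodies completed before the exit test fired,
 * or -1 if `limit` iterations passed without exiting (the "hang"). */
static int sketch_loop_after_def_use(int limit)
{
    int t84 = 0, t65 = t84, t98, r21x5, r21x6;
    for (int n = 0; n < limit; n++) {
        r21x5 = t65;            /* copy inside the loop: sees each new value */
        if (r21x5 >= 4)
            return n;
        r21x6 = r21x5 + 1;
        t98 = r21x6;
        t65 = t98;              /* phi: value carried into the next iteration */
    }
    return -1;
}

static int sketch_loop_after_gcm(int limit)
{
    int t84 = 0, t65 = t84, t98, r21x5, r21x6;
    r21x5 = t65;                /* copy hoisted out of the loop: read only once */
    for (int n = 0; n < limit; n++) {
        if (r21x5 >= 4)         /* always tests the stale value 0 */
            return n;
        r21x6 = r21x5 + 1;
        t98 = r21x6;
        t65 = t98;              /* updated, but nothing reads it any more */
        (void)t65;
    }
    return -1;
}
```

The first variant exits after four iterations, matching the 4-iteration loop described in comment 34. The second never sees the counter change, so it only stops when the artificial cap is reached.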

Comment 37 Marek Olšák 2016-09-17 20:23:39 UTC

How about we make R600_DEBUG=sbsafemath be the default?

Comment 38 Heiko 2016-12-13 08:01:51 UTC

Does the patch (fix to sb itself) from https://bugs.freedesktop.org/show_bug.cgi?id=94900 help this issue as well?

Comment 39 Timothy Arceri 2019-07-04 01:00:57 UTC

Yes, this looks like it would very likely have been fixed by:

commit e933246013eef376804662f3fcf4646c143c6c88
Author: Heiko Przybyl <lil_tux@web.de>
Date:   Sun Nov 20 14:42:28 2016 +0100

    r600/sb: Fix loop optimization related hangs on eg

Let's close it for now. Feel free to reopen if this is not the case.
