Bug 93546

Summary: Civilization 5 - Leaders in the diplomatic interactions screen appear completely black
Product: Mesa
Reporter: Hadrien Nilsson <freedesktop>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact: Default DRI bug account <dri-devel>
Severity: normal    
Priority: medium    
Version: 11.0   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Attachments: Dump of the shader leading to a "translation from TGSI failed" error

Description Hadrien Nilsson 2015-12-31 10:48:49 UTC
Hi,

I am playing Civilization 5 (the native port) on Ubuntu 15.10 with a Radeon HD4870 and the default drivers shipped with Ubuntu (Mesa 11.0.2).

Everything works fine with the exception of a screen used for diplomatic interactions which shows a nation leader. That nation leader appears completely black and at the same time, the FPS drops a lot. Here is an example: http://i.imgur.com/ma66qyT.jpg

I ran the game with LIBGL_ALWAYS_SOFTWARE set and tested with a random nation leader. He was rendered correctly.
Comment 1 Laurent carlier 2015-12-31 11:12:06 UTC
Can you provide an apitrace?
Comment 2 Hadrien Nilsson 2016-01-01 19:06:13 UTC
I am having a hard time using apitrace on my x64 system with a 32-bit game. I will keep digging into this, but meanwhile I can give some more information:

When I run the game from a terminal, I get these two errors, repeated several times:

EE ../../../../../../src/gallium/drivers/r600/r600_shader.c:158 r600_pipe_shader_create - translation from TGSI failed !
EE ../../../../../../src/gallium/drivers/r600/r600_state_common.c:809 r600_shader_select - Failed to build shader variant (type=1) -1

I saw in the source code that the shader can be dumped, so I will attach the dump output to this bug ticket.
Comment 3 Hadrien Nilsson 2016-01-01 19:07:17 UTC
Created attachment 120756
Dump of the shader leading to a "translation from TGSI failed" error

Just after that shader output, I get these errors in the console:

EE ../../../../../../src/gallium/drivers/r600/r600_shader.c:158 r600_pipe_shader_create - translation from TGSI failed !
EE ../../../../../../src/gallium/drivers/r600/r600_state_common.c:809 r600_shader_select - Failed to build shader variant (type=1) -1
Comment 4 Nicolai Hähnle 2016-01-04 15:38:12 UTC
You need to produce a 32-bit build of apitrace. You'll need multiarch/multilib support in your distribution and 32-bit variants of certain packages (e.g. on Debian/Ubuntu, install the g++-multilib package, and libwhatever-dev:i386 if you get error messages for libwhatever). When running cmake, add the following parameters:

-DCMAKE_C_FLAGS=-m32 -DCMAKE_CXX_FLAGS=-m32 -DCMAKE_EXE_LINKER_FLAGS=-m32 -DENABLE_GUI=FALSE

Traces recorded using the 32-bit apitrace can be played back on apitrace of any bitness.
Comment 5 Hadrien Nilsson 2016-01-25 18:48:39 UTC
Thanks Nicolai. I've been able to produce a trace file, but it is huge and the replay renders some weird geometry. However, I was able to get to the point where the character is drawn. The geometry is still incorrect, but at least the pixels are black, as they are during the live run of the game. This allowed me to step through the r600 Mesa code in a debugger after I rebuilt it from source with debugging options.

The TGSI program looks a bit different from the one I dumped using the environment variable, but I get the same error as with the original Mesa libraries shipped with Ubuntu.

It looks like the temporary variables of the TGSI program are mapped nearly directly to registers, but the RV770 "only" has 128 registers. The TGSI program I dumped uses nearly 400 temporary variables. This might explain the error.
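
As a rough way to check a shader dump against that limit, here is a minimal sketch that counts the temporaries referenced in TGSI text. The helper names (count_temporaries, fits_direct_mapping) and the GPR_LIMIT constant are hypothetical and only encode the 128-register figure quoted above. It assumes temporaries are written as TEMP[n] or TEMP[a..b] and ignores indirect addressing, so treat the result as an estimate rather than what the driver actually computes.

```python
import re

# Assumed register budget, taken from the 128-register figure quoted above.
GPR_LIMIT = 128

# Matches single references such as TEMP[5] and ranges such as TEMP[0..399].
_TEMP_RE = re.compile(r"TEMP\[(\d+)(?:\.\.(\d+))?\]")


def count_temporaries(tgsi_text):
    """Return the number of TEMP slots referenced (highest index + 1)."""
    highest = -1
    for match in _TEMP_RE.finditer(tgsi_text):
        first, last = match.group(1), match.group(2)
        highest = max(highest, int(last if last is not None else first))
    return highest + 1


def fits_direct_mapping(tgsi_text, limit=GPR_LIMIT):
    """True if one register per temporary would stay within the limit."""
    return count_temporaries(tgsi_text) <= limit
```

A real shader also needs registers for inputs and outputs, so a program can fail even when this check passes.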

Maybe some registers could be reused to stay below the limit? I've read a bit about this kind of problem and it seems to be a classic compiler problem, namely register allocation. A recurring solution I've read about is a graph coloring algorithm. However, I do not know where the register allocation optimization should occur: in the r600 code, or when the TGSI program is generated, so that all GPU-specific code could benefit from it? In any case it looks like a tough problem.
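
To illustrate the register reuse idea, here is a minimal sketch of a greedy allocation over live ranges, a simpler relative of graph coloring. It is not how Mesa or the r600 driver allocates registers. The instruction format, a list of (destination, sources) pairs of temporary indices, and the function names are hypothetical, and the sketch only handles straight-line code without loops or branches.

```python
import heapq


def live_ranges(instructions):
    """Map each temporary to (first, last) instruction index where it appears."""
    ranges = {}
    for idx, (dst, srcs) in enumerate(instructions):
        for temp in [dst, *srcs]:
            if temp is None:
                continue
            start, _ = ranges.get(temp, (idx, idx))
            ranges[temp] = (start, idx)
    return ranges


def allocate(instructions):
    """Assign a register to each temporary, reusing registers of dead values."""
    ranges = live_ranges(instructions)
    assignment = {}
    free = []    # min-heap of registers available for reuse
    active = []  # min-heap of (end index, register) for live values
    next_reg = 0
    for temp, (start, end) in sorted(ranges.items(), key=lambda kv: kv[1][0]):
        # Release every register whose value died strictly before this start.
        while active and active[0][0] < start:
            _, reg = heapq.heappop(active)
            heapq.heappush(free, reg)
        if free:
            reg = heapq.heappop(free)
        else:
            reg = next_reg
            next_reg += 1
        assignment[temp] = reg
        heapq.heappush(active, (end, reg))
    return assignment


# A chain of 400 temporaries where each one is only read by the next instruction.
chain = [(0, [])] + [(i, [i - 1]) for i in range(1, 400)]
registers_used = len(set(allocate(chain).values()))
```

In this artificial chain, 400 temporaries fit in two registers because each value dies right after its single use. A real shader with long-lived values, loops, or indirect addressing would need proper liveness analysis and would not compress nearly as well.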
Comment 6 MWATTT 2018-08-15 19:07:29 UTC
This issue should be fixed on master.
Comment 7 Jim Spencer 2019-05-15 08:51:49 UTC
I can produce a trace file without any issue. 
https://bit.ly/2CVHGCM
Comment 8 Hadrien Nilsson 2019-05-15 18:36:48 UTC
Unfortunately I do not have a Radeon HD 4870 anymore so I have no way to retest this bug.
Comment 9 GitLab Migration User 2019-09-18 19:20:30 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/567.
