Civilization VI consistently segfaults during register allocation when loading a scenario, whether in the actual game or in benchmark mode.

I am attaching an apitrace [1] leading to the aforementioned segfault. I am also attaching a stack trace, a line listing of the crashing point and a register dump [2], all obtained by running the game through gdb.

I should also mention that the game sometimes randomly crashes in a peculiar way before even reaching the main menu. This report is not for that bug, which has not yet been properly reported. I am mentioning it because replaying the trace sometimes triggers that bug, too. Therefore, be advised that unless the replay gets past the main menu, you will need to replay the trace again to reach the point of this bug.

[1] https://seriouss.am/etc/civ6-nouveau.trace.xz (apitrace file; xz'ed; 653MiB; 1050MiB uncompressed)
[2] https://seriouss.am/etc/civ6-gdb (plaintext; 5.6KiB)
A dump of Nouveau's compiler's[?] debug output, complete with shaders and whatnot: https://seriouss.am/etc/civ6-shaderdump (plaintext; 635KiB; contains ANSI color escape sequences all over)
OK, so this is a previously known issue. There's another bug filed about it somewhere... Crysis, maybe?

Anyway, it comes down to a problem with the delete_Instruction() call in the spill code. When the instruction is deleted (Instruction::~Instruction), it clears out its own ValueDefs (ValueDef::set), which should in turn update the relevant Value's defs list. However, this happens in the middle of RA, which means that various instructions have been joined into nodes, and value A's defs end up in value B's defs list.

Now this is where I get confused: when I change the logic to also remove the ValueDef from val->join, this does not help. Further vexing is the fact that this particular spill shouldn't even be happening in the first place. It's a move between 2 LValues which I'm pretty sure are joined to each other.

Valgrind catches the first badness where this happens, which is when building live sets after spilling. I need to add more breakpoints and poke around more.
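To make the failure mode described above easier to follow, here is a minimal toy model of it. This is a sketch only: ToyValue, ToyDef, coalesce() and the other names are hypothetical stand-ins, not Nouveau's actual classes, and the way coalescing copies defs into the representative's list is an assumption made for illustration. It only shows how a def can stay listed under a joined value after the instruction owning it has cleared it; it does not reproduce the crash itself.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

struct ToyValue;

// Hypothetical stand-in for a definition slot owned by an instruction.
struct ToyDef {
    ToyValue *value = nullptr;
    void set(ToyValue *v);
};

// Hypothetical stand-in for an SSA value with its list of definitions.
struct ToyValue {
    std::list<ToyDef *> defs;
    ToyValue *join = this; // representative after coalescing (self if not joined)
};

// Rebind the def: unlink it from the old value's list, link it to the new one.
void ToyDef::set(ToyValue *v) {
    if (value == v)
        return;
    if (value)
        value->defs.remove(this); // only touches the def's own value
    if (v)
        v->defs.push_back(this);
    value = v;
}

// Assumed coalescing step: the representative also lists the joined value's defs.
void coalesce(ToyValue &rep, ToyValue &src) {
    assert(&rep != &src);
    rep.defs.insert(rep.defs.end(), src.defs.begin(), src.defs.end());
    src.join = &rep;
}

bool listsDef(const ToyValue &v, const ToyDef *d) {
    return std::find(v.defs.begin(), v.defs.end(), d) != v.defs.end();
}

// Returns true if B still lists A's def after that def has been cleared.
bool staleDefSurvivesDelete() {
    ToyValue a, b;
    ToyDef defA, defB;
    defA.set(&a);
    defB.set(&b);
    coalesce(b, a);    // RA joins A into B
    defA.set(nullptr); // what deleting the defining instruction would do
    return listsDef(b, &defA);
}
```

In this toy, staleDefSurvivesDelete() returns true: clearing the def only unlinks it from A, so B keeps a pointer to a def that, in the real compiler, would belong to an already destroyed instruction. Walking B's defs later (for example while rebuilding live sets) would then touch freed memory.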
Created attachment 129659
patch that works around the issue

The attached patch should work around the issue of spilling a value into itself without invoking the wrath of the underlying bug that caused the whole thing to go south in the first place. I'm not entirely convinced of this patch's correctness, so it will need some careful testing.
It now consistently segfaults at a different point instead.
Backtrace and whatnot: https://seriouss.am/etc/civ-gdb-2017-02-16

I should note that NV50_PROG_OPTIMIZE=0 prevents the segfault, same as without the patch.
The patch at [1] lets you run Civ6 without disabling optimizations. Please note that this patch will not be upstreamed, though!

[1] https://patchwork.freedesktop.org/patch/169870/
As an update, the latest Civ6 (v1.0.0.167) doesn't experience a crash in game with either of the two following graphics stack combinations on GP107M for me:

Mesa 17.2.2 / libdrm 2.4.83 / Kernel 4.15-rc1
Mesa 17.4.0-devel (git-546633dce2) / libdrm 2.4.83 / Kernel 4.15-rc1

I ran both the benchmark mode and also the 'Play Now' option.

Note: Visual corruptions remain - a number of blocks of blue colour and incorrectly clipped visual elements.
(In reply to Rhys Kidd from comment #6)
> As an update, the latest Civ6 (v1.0.0.167) doesn't experience a crash in
> game with either of the two following graphics stack combinations on GP107M
> for me:
>
> Mesa 17.2.2 / libdrm 2.4.83 / Kernel 4.15-rc1
> Mesa 17.4.0-devel (git-546633dce2) / libdrm 2.4.83 / Kernel 4.15-rc1
>
> I ran both the benchmark mode and also the 'Play Now' option.
>
> Note: Visual corruptions remain - a number of blocks of blue colour and
> incorrectly clipped visual elements.

Well, on Pascal we have 255 registers, on Kepler1 just 63.

(In reply to Gediminas Jakutis from comment #4)
> It now consistently segfaults in another point instead.
> Backtrace and whatnot: https://seriouss.am/etc/civ-gdb-2017-02-16
>
> I should note that NV50_PROG_OPTIMIZE=0 prevents a segfault, same as without
> the patch.

Yeah, I know what the problem here is. With higher optimization levels we get wide values, each being a block of more than just one register. Currently we can't spill those values. I tried to fix it, but also kind of failed...
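As a rough illustration of why the same shader can be fine on Pascal and fail on Kepler1, here is a deliberately crude sketch of a register budget check. Everything in it is hypothetical (LiveValue, Outcome, allocate() and the rule that only single-register values can be spilled are assumptions for this toy, not Nouveau's allocator); only the 255 and 63 register limits come from the comment above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical live value; regs is its width in consecutive registers.
struct LiveValue {
    int regs;
};

enum class Outcome { Fits, FitsAfterSpill, CannotSpill };

// Toy budget check: assumes every value is live at once and that
// only width-1 values can be spilled to memory.
Outcome allocate(const std::vector<LiveValue> &live, int limit) {
    assert(limit > 0);
    int needed = 0;    // registers required with nothing spilled
    int spillable = 0; // registers that spilling width-1 values could free
    for (const LiveValue &v : live) {
        needed += v.regs;
        if (v.regs == 1)
            spillable += 1;
    }
    if (needed <= limit)
        return Outcome::Fits;
    if (needed - spillable <= limit)
        return Outcome::FitsAfterSpill;
    return Outcome::CannotSpill; // pressure comes from wide values only
}
```

With twenty 4-register values (80 registers in total), this toy reports Fits for a limit of 255 and CannotSpill for a limit of 63, which mirrors the situation described: the larger register file never needs to spill, while the smaller one would have to spill wide values.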
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1127.