Bug 75276

Summary: Implement VGPR Register Spilling
Product: Mesa
Reporter: Tom Stellard <tstellar>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: ansla80, bu9zilla, daniel, darkbasic, elad, equeim, farmboy0+freedesktop, haagch, idd997733t, john.ettedgui, koper84, liquid.acid, marti, mattkramara, OmegaPhil, rainyday26, step2back+freedesktop, waltercool
Version: git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 73320, 75005, 75211, 75361    
Attachments: SGPR spilling fix
tesseract segfault when setting morphological AA to ultra
VGPR spilling work-around
VGPR Spill Work Around v2
VGPR Spill Work Around v3 with possible antichamber crash fix
antichamber log with r600_debug
VGPR Spill Work Around v4 with possible antichamber crash fix
backtrace of unreal engine effects demo with debug
Possible fix
Build failure with "Implement VGPR register spilling v3" patch
TGSI failing shader
llvm failing shader

Description Tom Stellard 2014-02-20 18:56:45 UTC
Register spilling is not implemented for VGPRs causing a number of applications to crash.
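
As background for readers unfamiliar with the term: spilling is what a register allocator does when more values are live at the same time than there are physical registers, moving some of them to memory. Without spill support the compile simply fails. The following is a toy illustration only, not LLVM's allocator; the live ranges and register counts are made up for the example.

```python
def max_pressure(intervals):
    """Return the largest number of values that are live at the same time.

    Each interval is (start, end); the value is live on [start, end).
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # value becomes live
        events.append((end, -1))    # value dies
    # Tuples sort by position first; at equal positions the -1 (death)
    # sorts before the +1 (birth), so a dying value frees its register first.
    events.sort()
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def needs_spilling(intervals, num_registers):
    """True if the values cannot all be kept in registers at once."""
    return max_pressure(intervals) > num_registers


# Hypothetical live ranges for five values in a shader.
live_ranges = [(0, 10), (1, 8), (2, 9), (3, 7), (8, 12)]
print(max_pressure(live_ranges))
print(needs_spilling(live_ranges, num_registers=3))
```

In this model a shader whose peak pressure exceeds the register budget has to spill; a backend that cannot spill has no choice but to report an error.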
Comment 1 Tom Stellard 2014-02-24 17:45:59 UTC
Created attachment 94675 [details] [review]
SGPR spilling fix

Some of these crashes may be caused by a bug in SGPR spilling rather than by the lack of VGPR spilling.
Comment 2 Michel Dänzer 2014-03-04 06:47:11 UTC
Tom put up an LLVM Git branch: http://cgit.freedesktop.org/~tstellar/llvm/log/?h=si-spill-fixes
Comment 3 Christoph Haag 2014-04-17 13:11:37 UTC
(In reply to comment #2)
> Tom put up an LLVM Git branch:
> http://cgit.freedesktop.org/~tstellar/llvm/log/?h=si-spill-fixes

Newest mesa git needs a newer llvm than your branch:


gallivm/lp_bld_debug.cpp: In function 'size_t disassemble(const void*, llvm::raw_ostream&)':
gallivm/lp_bld_debug.cpp:255:79: error: no matching function for call to 'llvm::Target::createMCDisassembler(const llvm::MCSubtargetInfo&, llvm::MCContext&) const'
Comment 4 farmboy0+freedesktop 2014-04-28 18:46:03 UTC
Whom do I have to buy a beer or three to get this implemented yesterday?
Tom, is there anything I could do to help you along with this bug?
Comment 5 Tom Stellard 2014-04-28 20:51:26 UTC
Updated branch for testing:

http://cgit.freedesktop.org/~tstellar/llvm/log/?h=si-spill-fixes-v2
Comment 6 Tom Stellard 2014-04-29 21:36:08 UTC
Updated v3 branch here:
http://cgit.freedesktop.org/~tstellar/llvm/log/?h=si-spill-fixes-v3
Comment 7 Christoph Haag 2014-05-13 15:00:57 UTC
Created attachment 98982 [details]
tesseract segfault when setting morphological AA to ultra

Okay, the good news is: it works pretty well with your v3 branch.

With current latest svn, Tesseract 2014-05-11 just segfaults on loading maps somewhere.

But with your v3 branch it works pretty well.

The only crash I got with your v3 branch happens (every time) when setting morphological anti-aliasing to ultra, either in-game or when starting a game. bt full attached. Maybe it's unrelated, but maybe not...?
Comment 8 Tom Stellard 2014-05-16 19:27:02 UTC
Created attachment 99169 [details] [review]
VGPR spilling work-around
Comment 9 Tom Stellard 2014-05-17 03:02:14 UTC
*** Bug 75361 has been marked as a duplicate of this bug. ***
Comment 10 Tom Stellard 2014-05-17 03:02:39 UTC
*** Bug 75211 has been marked as a duplicate of this bug. ***
Comment 11 Tom Stellard 2014-05-17 03:02:55 UTC
*** Bug 73320 has been marked as a duplicate of this bug. ***
Comment 12 Tom Stellard 2014-05-17 03:03:55 UTC
Created attachment 99186 [details] [review]
VGPR Spill Work Around v2

This patch should build.
Comment 13 Tom Stellard 2014-05-17 03:11:36 UTC
Created attachment 99187 [details] [review]
VGPR Spill Work Around v3 with possible antichamber crash fix

Here is an updated version of the patch, which may fix the antichamber crash.
Comment 14 farmboy0+freedesktop 2014-05-17 08:37:07 UTC
The v3 patch doesn't fix Antichamber's crash to desktop.
But with this patch it actually shows the VGPR messages:

LLVM triggered Diagnostic Handler: SIInstrInfo::storeRegToStackSlot - Can't spill VGPR!
LLVM triggered Diagnostic Handler: SIInstrInfo::loadRegToStackSlot - Can't retrieve spilled VGPR!

Here's the start of the backtrace for reference:

Program received signal SIGSEGV, Segmentation fault.
0xf504f671 in std::pair<llvm::SlotIndex, llvm::SlotIndex>::operator= (this=0x450d2ff4, __p=...)
    at /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/include/g++-v4/bits/stl_pair.h:153
153             second = __p.second;
(gdb) bt
#0  0xf504f671 in std::pair<llvm::SlotIndex, llvm::SlotIndex>::operator= (this=0x450d2ff4, __p=...)
    at /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/include/g++-v4/bits/stl_pair.h:153
#1  0xf505e8c5 in llvm::IntervalMapImpl::NodeBase<std::pair<llvm::SlotIndex, llvm::SlotIndex>, llvm::LiveInterval*, 16u>::copy<16u> (
    this=0x44ee9644, Other=..., i=250679, j=250678, Count=4294967295)
    at /mnt/daten/Daten/Repositories/llvm/include/llvm/ADT/IntervalMap.h:231
#2  0xf505f3ad in llvm::IntervalMapImpl::NodeBase<std::pair<llvm::SlotIndex, llvm::SlotIndex>, llvm::LiveInterval*, 16u>::moveLeft (
    this=0x44ee9644, i=1, j=0, Count=4294967295) at /mnt/daten/Daten/Repositories/llvm/include/llvm/ADT/IntervalMap.h:242
#3  0xf505ec81 in llvm::IntervalMapImpl::NodeBase<std::pair<llvm::SlotIndex, llvm::SlotIndex>, llvm::LiveInterval*, 16u>::erase (
    this=0x44ee9644, i=0, j=1, Size=0) at /mnt/daten/Daten/Repositories/llvm/include/llvm/ADT/IntervalMap.h:263
#4  0xf505e175 in llvm::IntervalMapImpl::NodeBase<std::pair<llvm::SlotIndex, llvm::SlotIndex>, llvm::LiveInterval*, 16u>::erase (
    this=0x44ee9644, i=0, Size=0) at /mnt/daten/Daten/Repositories/llvm/include/llvm/ADT/IntervalMap.h:270
#5  0xf505d1ee in llvm::IntervalMap<llvm::SlotIndex, llvm::LiveInterval*, 16u, llvm::IntervalMapInfo<llvm::SlotIndex> >::iterator::erase (
    this=0xffff6348) at /mnt/daten/Daten/Repositories/llvm/include/llvm/ADT/IntervalMap.h:1876
#6  0xf505c846 in llvm::LiveIntervalUnion::extract (this=0x44ee9640, VirtReg=...) at LiveIntervalUnion.cpp:68
#7  0xf50655ef in llvm::LiveRegMatrix::unassign (this=0x44d26e60, VirtReg=...) at LiveRegMatrix.cpp:96
#8  0xf511e81f in (anonymous namespace)::RAGreedy::tryLastChanceRecoloring (this=0x44243000, VirtReg=..., Order=..., NewVRegs=..., 
    FixedRegisters=..., Depth=0) at RegAllocGreedy.cpp:2067
#9  0xf511f224 in (anonymous namespace)::RAGreedy::selectOrSplitImpl (this=0x44243000, VirtReg=..., NewVRegs=..., FixedRegisters=..., 
    Depth=0) at RegAllocGreedy.cpp:2285
#10 0xf511eb60 in (anonymous namespace)::RAGreedy::selectOrSplit (this=0x44243000, VirtReg=..., NewVRegs=...) at RegAllocGreedy.cpp:2144
Comment 15 darkbasic 2014-05-17 12:40:13 UTC
Tesseract does still crash with si-spill-fixes-v4 if I set "Shadow resolution" to "Ultra" or "Global Illumination" to "High" or "Morphological AA" to "Ultra".
Comment 16 darkbasic 2014-05-17 14:54:10 UTC
If someone is interested here is my rebased si-spill-fixes-v4: https://github.com/darkbasic/llvm/tree/master-si-spill-fixes-v4

I will try to rebase it from time to time. You can also find the matching compiler-rt, clang and clang-tools-extra: https://github.com/darkbasic
Comment 17 Tom Stellard 2014-05-17 17:40:07 UTC
(In reply to comment #15)
> Tesseract does still crash with si-spill-fixes-v4 if I set "Shadow
> resolution" to "Ultra" or "Global Illumination" to "High" or "Morphological
> AA" to "Ultra".

Have you tested with the patch attached to this bug?
Comment 18 Tom Stellard 2014-05-17 17:41:24 UTC
(In reply to comment #14)
> The v3 patch doesnt fix Antichamber's crash to desktop.
> But with this patch it actually shows the VGPR messages:
> 

With the patch applied can you run the game with R600_DEBUG=ps,vs,gs and post the output?
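
For anyone wanting to capture that output into a file for attaching, one way is a small wrapper like the sketch below. The helper name and the log file name are made up for illustration; the only thing taken from this thread is the R600_DEBUG=ps,vs,gs setting. It assumes the dump may go to stdout or stderr, so both are saved.

```python
import os
import subprocess


def run_with_shader_dump(command, log_path, flags="ps,vs,gs"):
    """Run `command` with R600_DEBUG set and save stdout+stderr to log_path.

    `command` is a list such as ["./Game", "-windowed"] (hypothetical).
    Returns the exit code of the program.
    """
    # Copy the current environment and add the debug flags on top of it.
    env = dict(os.environ, R600_DEBUG=flags)
    with open(log_path, "wb") as log:
        result = subprocess.run(command, env=env,
                                stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example (not executed here):
#   run_with_shader_dump(["./Game"], "r600_debug.log")
```

The resulting log can get large, so compressing it before attaching is probably a good idea.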
Comment 19 farmboy0+freedesktop 2014-05-17 18:44:41 UTC
Created attachment 99236 [details]
antichamber log with r600_debug
Comment 20 darkbasic 2014-05-18 00:32:55 UTC
> Have you tested with the patch attached to this bug?

Not yet, because it was late and llvm takes a long time to compile, but I will test it tomorrow.
Comment 21 Tom Stellard 2014-05-18 00:37:24 UTC
Created attachment 99254 [details] [review]
VGPR Spill Work Around v4 with possible antichamber crash fix

Can you try this patch?  It should fix the antichamber crash.
Comment 22 farmboy0+freedesktop 2014-05-18 10:41:53 UTC
Patch v4 fixes the crash in Antichamber:
LLVM triggered Diagnostic Handler: SIInstrInfo::storeRegToStackSlot - Can't spill VGPR!
LLVM triggered Diagnostic Handler: SIInstrInfo::loadRegToStackSlot - Can't retrieve spilled VGPR!
radeon_llvm_compile: Processing Diag Flag
LLVM failed to compile shader
EE si_state.c:2133 si_shader_select - Failed to build shader variant (type=1) 1
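
When a log contains many of these messages, it can help to count them per message before posting. The sketch below assumes the diagnostics always start with the "LLVM triggered Diagnostic Handler: " prefix seen above; the function name is made up, and the sample text is just the log from this comment.

```python
import re
from collections import Counter

# Prefix as it appears in the logs quoted in this report.
DIAG_RE = re.compile(r"^LLVM triggered Diagnostic Handler: (.+)$")


def summarize_diagnostics(log_text):
    """Count how often each LLVM diagnostic message appears in a log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DIAG_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """\
LLVM triggered Diagnostic Handler: SIInstrInfo::storeRegToStackSlot - Can't spill VGPR!
LLVM triggered Diagnostic Handler: SIInstrInfo::loadRegToStackSlot - Can't retrieve spilled VGPR!
radeon_llvm_compile: Processing Diag Flag
LLVM failed to compile shader
EE si_state.c:2133 si_shader_select - Failed to build shader variant (type=1) 1
"""

for message, count in summarize_diagnostics(sample).items():
    print(count, message)
```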
Comment 23 darkbasic 2014-05-19 13:50:32 UTC
Here is tesseract with R600_DEBUG=ps,vs,gs:
http://bpaste.net/show/287559/
Comment 24 darkbasic 2014-05-19 13:50:57 UTC
I forgot to say I'm using attachment 99254 [details] [review].
Comment 25 darkbasic 2014-05-19 22:51:43 UTC
With si-spill-fixes-v4 Painkiller Hell & Damnation does not crash, but it still stutters a lot, making the game completely unplayable. What's the problem?
Comment 26 Tom Stellard 2014-06-16 17:08:38 UTC
I've pushed a work-around for the crash to LLVM trunk, so if you use the latest version of LLVM from svn/git these games should not crash, but there may be some mis-rendering.

When I come up with a proper fix, I will post it to this bug (this could be a while).
Comment 27 Christoph Haag 2014-06-28 09:58:46 UTC
(In reply to comment #26)
> I've pushed a work-around for the crash to LLVM trunk, so if you use the
> latest version of LLVM from svn/git these games should not crash, but there
> may be some mis-rendering.

It's a bit unclear what that means, and because of that it's unclear which problems are actual rendering bugs in other parts of the driver.

When you talk about mis-rendering, are these the kind of problems we should see:

https://www.youtube.com/watch?v=6i0vJW5k5js&hd=1
(The setting is "Texture Quality"; "Medium" works better, with fewer artifacts, and "High" produces many more artifacts)

Or is this something different that should be reported separately?
(I'm seeing the same problems in the "Distance" game, closed alpha version)

Also, when starting a map, a rendering thread in Sanctum 2 segfaults, but I don't know whether I can blame it on this workaround. Maybe someone else can first check whether this can be reproduced?
Comment 28 Damian Nowak 2014-07-03 21:20:56 UTC
> LLVM ERROR: ran out of registers during register allocation

I get this error when trying out the Unreal Engine demo build. It always coredumps around the same moment for me, between the 9th and the 11th second. Here's the link to the demo in case anyone wants to apitrace and analyze it. http://ue4linux.raxxy.com/effects_cave_demo.zip
(Demo comes from https://wiki.unrealengine.com/Linux_Demos)
Comment 29 Pablo Cholaky 2014-07-13 04:33:54 UTC
I don't know if this will help much, but a crash is "always reproducible" with LLVM 3.4.2 using my OLAND 8750M with "Civilization V", always on a leader scene.

Using LLVM from git I can't reproduce this, but it is VERY slow, like darkbasic said. This happens only on the leader scene.
Comment 30 Christoph Haag 2014-07-24 11:58:17 UTC
(In reply to comment #28)
> > LLVM ERROR: ran out of registers during register allocation

I do not get this message on upstream llvm recent revisions.

But every demo segfaults. Mostly in LLVMBuildBitCast().

Running a demo looks like this:


$ DRI_PRIME=1 ./Effects
Using binned.
4.3.0-0+UE4 7038 3077 379 0
Signal 11 caught.
EngineCrashHandler: Signal=11


Exiting due to error
Starting ../../../Engine/Binaries/Linux/CrashReportClient
[1]    10932 abort (core dumped)  DRI_PRIME=1 ./Effects


This is still a problem with register spilling that just looks different, right?
Should I compile with debug symbols and get a complete backtrace or wouldn't that provide any new information?


(By the way, applying this small patch makes it render almost completely correct on intel: https://bugs.freedesktop.org/show_bug.cgi?id=78716#c10)
Comment 31 Michel Dänzer 2014-07-28 10:08:29 UTC
(In reply to comment #30)
> But every demo segfaults. Mostly in LLVMBuildBitCast().
[...]
> This is still a problem with register spilling that just looks different,
> right?

No, that sounds like bug 81834.
Comment 32 Christoph Haag 2014-07-28 11:10:48 UTC
Created attachment 103583 [details]
backtrace of unreal engine effects demo with debug

(In reply to comment #31)
> (In reply to comment #30)
> > But every demo segfaults. Mostly in LLVMBuildBitCast().
> [...]
> > This is still a problem with register spilling that just looks different,
> > right?
> 
> No, that sounds like bug 81834.

So I built mesa with debug symbols and I guess debugging enables some assertions, because it now fails at some assertion about Register.Index stuff. Full backtrace attached.

(bug 81834 doesn't say what versions he uses. I use upstream llvm 214022 and mesa git with the small patch I mentioned)

Of course it could be both problems at the same time.
Comment 33 Michel Dänzer 2014-07-29 03:31:48 UTC
(In reply to comment #32)
> > No, that sounds like bug 81834.
> 
> So I built mesa with debug symbols and I guess debugging enables some
> assertions because now fails at some assertion about Register.Index stuff.

Yep, that looks like bug 81834. I'm actually not sure why I'm not failing the assertion in Mesa before the one in LLVM.


> (bug 81834 doesn't say what versions he uses.

I normally use current Git of everything.
Comment 34 Michel Dänzer 2014-07-29 03:33:25 UTC
(In reply to comment #33)
> > So I built mesa with debug symbols and I guess debugging enables some
> > assertions because now fails at some assertion about Register.Index stuff.
> 
> Yep, that looks like bug 81834.

BTW, you may be able to work around this by reverting Mesa commit f4b0ab7afd83c811329211eae8167c9bf238870c, but then you may run into bug 80880 instead.
Comment 35 J. Andrew Lanz-O'Brien 2014-08-28 12:15:56 UTC
Hello. I'm just wondering if this has been resolved in Mesa 10.3, and if not, what needs to be done to get it there. Thank you!
Comment 36 darkbasic 2014-08-28 12:19:31 UTC
It must be solved in LLVM, not mesa. You can use my forward-ported llvm git branch if you need register spilling: https://github.com/darkbasic/llvm
Comment 37 Christoph Haag 2014-08-28 12:21:40 UTC
Comment on attachment 103583 [details]
backtrace of unreal engine effects demo with debug

If by "this" you mean the Unreal Engine troubles, they are solved and the demos should run.
Comment 38 T3st 2014-08-29 07:19:15 UTC
Looks like I ran into a similar issue at http://llvm.org/bugs/show_bug.cgi?id=20738 when dealing with OpenCL, and filed the bug against LLVM instead (not that they have the right sections for AMD GPUs in their bugzilla, but it seems to be an LLVM issue, yeah?...).

Basically, running a simple CL memory benchmark causes a huge flood of error messages from LLVM and then the GPU locks up, which is not fun at all, obviously.
Comment 39 J. Andrew Lanz-O'Brien 2014-09-07 13:28:16 UTC
This bug appears to be resolved for me now, as of LLVM 3.5 and Mesa 10.2.7 on Arch Linux. Thanks for the hard work!!!
Comment 40 Daniel Scharrer 2014-09-07 16:10:27 UTC
It's still reproducible in Painkiller H&D with LLVM r202464 (ie. current trunk).

While some previously broken PP effects now work, dynamic lighting is still messed up and there's plenty of this in the log:

LLVM triggered Diagnostic Handler: SIInstrInfo::storeRegToStackSlot - Can't spill VGPR!
LLVM triggered Diagnostic Handler: SIInstrInfo::loadRegToStackSlot - Can't retrieve spilled VGPR!
Comment 41 Daniel Scharrer 2014-09-07 16:12:46 UTC
s/r202464/commit 3666e7f4c161c50e5f6dcb0e015ca16bf69fb941/ ;-)
Comment 42 darkbasic 2014-09-07 20:21:19 UTC
The Witcher 2 is full of spilling warnings too.
Comment 43 Michael Mair-Keimberger 2014-09-13 17:39:32 UTC
FYI, while other games don't crash anymore with llvm-3.5/mesa-10.2.7 (painkiller, the cave, serious sam), Brütal Legend horribly crashes my system with these versions. With older versions it just closes/crashes the game, but now it crashes the whole system and I always have to reset it.
Comment 44 Michel Dänzer 2014-09-17 06:55:49 UTC
(In reply to comment #43)
> FYI, while other games don't crash anymore with llvm-3.5/mesa-10.2.7
> (painkiller, the cave, serious sam), Brütal Legend horribly crashes my
> system with these versions.

The Brütal Legend issue might be unrelated to this bug report then and should be tracked in a separate report for now.
Comment 45 Tom Stellard 2014-09-19 01:11:52 UTC
Created attachment 106527 [details] [review]
Possible fix

Here is a patch to test.
Comment 46 Daniel Scharrer 2014-09-19 23:58:29 UTC
Created attachment 106569 [details]
Build failure with "Implement VGPR register spilling v3" patch

What llvm version is that patch supposed to apply to?

Build fails with a current checkout of http://llvm.org/git/llvm.git master - see the attached log.
Comment 47 Marek Olšák 2014-09-20 09:01:20 UTC
(In reply to comment #45)
> Created attachment 106527 [details] [review]
> Possible fix
> 
> Here is a patch to test.

Hi Tom,

The Mesa driver doesn't set LDS_SIZE for VS, GS, ES, and HS shaders, so I don't think the spilling to LDS will work on those.
Comment 48 equeim 2014-09-24 17:11:39 UTC
Reproducible crash in Pioneer (remake of Elite 2) with LLVM-3.5/Mesa-10.3
Comment 49 Fred Santos 2014-09-25 20:38:08 UTC
Hello,

I don't know if that helps, but the bug is still present on the brand new openSUSE 13.2 beta1, for example. (With libLLVM 3.4.2 and mesa 10.3.0)

In my case, it can be easily reproduced when trying to launch Tomb Raider 2013 through Wine or Playonlinux. The game seems to start normally but crashes after a few seconds and I get this error message :
'LLVM ERROR: ran out of registers during register allocation'

For the record, I have an AMD APU A8-7600 with Radeon R7 Graphics.
Comment 50 equeim 2014-09-26 17:16:25 UTC
(In reply to comment #49)
> Hello,
> 
> I don't know if that helps, but the bug is still present on the brand new
> openSUSE 13.2 beta1, for example. (With libLLVM 3.4.2 and mesa 10.3.0)

This is LLVM bug and it is partially fixed in LLVM 3.5
Comment 51 John 2014-11-08 09:55:10 UTC
Not sure if it is exactly related to this bug, but I can't play NeverWinterNight; I get a lot of blocks like these:

LLVM triggered Diagnostic Handler: SIInstrInfo::storeRegToStackSlot - Do not know how to spill register
LLVM triggered Diagnostic Handler: SIInstrInfo::loadRegFromStackSlot - Do not know how to restore register
LLVM triggered Diagnostic Handler: Ran out of VGPRs for spilling SGPR
then:
radeon_llvm_compile: Processing Diag Flag
LLVM failed to compile shader
EE si_state.c:2282 si_shader_select - Failed to build shader variant (type=1) 1
and then back to above

I'm on git for llvm, clang, mesa (radeonsi).
Using Linux 3.17.2, x64 if that matters.
Comment 52 Tom Stellard 2014-11-10 14:34:14 UTC
(In reply to John from comment #51)
> Not sure if it is exactly related to this bug, but I can't play
> NeverWinterNight, I get a lot of blocks like these:
> 

What do you mean by "can't play"?  Does it crash, or is nothing being rendered?
Comment 53 John 2014-11-11 01:38:00 UTC
It stays on a loading page, and I see those lines in my console.
The game does not freeze, it just "loads" forever.

Of course wine being what it is, maybe the main issue lies in wine and not the driver, but I see no special wine error on the console.
Comment 54 John 2014-11-21 10:24:46 UTC
Please forget my last 2 comments, I still get these messages in the terminal, so there's obviously an issue there, but I found some game workarounds that allow me to play the game, so I guess these messages are unrelated to my issue.

Thanks!
Comment 55 Tom Stellard 2015-01-07 22:10:10 UTC
I have this working for some simple cases, can you test these mesa and llvm trees together:

http://cgit.freedesktop.org/~tstellar/llvm/log/?h=vgpr-spilling-Jan07-2014

http://cgit.freedesktop.org/~tstellar/mesa/log/?h=vgpr-spilling-Jan07-2014
Comment 56 smoki 2015-01-08 07:48:03 UTC
(In reply to Tom Stellard from comment #55)
> I have this working for some simple cases, can you test these mesa and llvm
> trees together:
> 
> http://cgit.freedesktop.org/~tstellar/llvm/log/?h=vgpr-spilling-Jan07-2014
> 
> http://cgit.freedesktop.org/~tstellar/mesa/log/?h=vgpr-spilling-Jan07-2014

I tried those for bug 79155; both medium and high global illumination now work fine there.
Comment 57 commiethebeastie 2015-01-09 20:18:51 UTC
Oh nice. +20% in Unigine and +30% in the Unreal 4.5 demos, without any image corruption, with the vgpr-spilling-Jan07-2014 branch on my R9 280X.
Comment 58 Axel Davy 2015-01-13 19:12:59 UTC
I have tested tstellar's perf-Jan-08-2015 branch, and I get a failure to compile a Unigine Heaven shader under nine.
The same error occurs when reverting the last patch of the branch (subreg liveness).

LLVM ERROR: Not supported instr: <MCInst 1908 <MCOperand Reg:3210> <MCOperand Imm:8> <MCOperand Reg:2044> <MCOperand Reg:74>>

1908 seems to correspond to SI_SPILL_V96_RESTORE

I'll include the TGSI shader, and next the llvm IR
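
For picking the opcode and operands out of a "Not supported instr" line like the one above, a small parser along these lines may be handy. It is only a sketch that assumes the textual layout seen in this comment; the function name is made up. The numeric opcode is internal to a particular LLVM build, so mapping it to a name such as SI_SPILL_V96_RESTORE has to be done against that build's own instruction tables.

```python
import re

# Matches "<MCInst N <MCOperand Kind:Value> ...>" as printed in the error.
MCINST_RE = re.compile(r"<MCInst (\d+)((?: <MCOperand \w+:-?\d+>)*)>")
OPERAND_RE = re.compile(r"<MCOperand (\w+):(-?\d+)>")


def parse_mcinst(text):
    """Return (opcode, [(kind, value), ...]) or None if nothing matches."""
    match = MCINST_RE.search(text)
    if match is None:
        return None
    opcode = int(match.group(1))
    operands = [(kind, int(value))
                for kind, value in OPERAND_RE.findall(match.group(2))]
    return opcode, operands


line = ("LLVM ERROR: Not supported instr: <MCInst 1908 <MCOperand Reg:3210> "
        "<MCOperand Imm:8> <MCOperand Reg:2044> <MCOperand Reg:74>>")
print(parse_mcinst(line))
```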
Comment 59 Axel Davy 2015-01-13 19:15:59 UTC
Created attachment 112178 [details]
TGSI failing shader
Comment 60 Axel Davy 2015-01-13 19:18:11 UTC
Created attachment 112179 [details]
llvm failing shader
Comment 61 Tom Stellard 2015-02-05 15:52:24 UTC
The VGPR spilling patches have been pushed.  Is anyone still having problems with latest mesa and llvm from git?
Comment 62 darkbasic 2015-02-05 15:56:32 UTC
I am not able to do proper testing right now or in the next couple of days, but I guess you can close this bug report and let people open a new one if they encounter issues with a specific game.
Comment 63 Michel Dänzer 2015-02-06 01:35:43 UTC
Great plan. :) Awesome work, Tom!
