Sometimes the spilling logic will decide to spill a merged def. This is fine for 2- and 4-wide values, as there are appropriate load/store instructions for those, but not for 3-wide values. We must take that into account in the SpillCodeInserter and emit a 64-bit and a 32-bit store into lmem (and similarly for loads, although that is less important, as the value tends to have been split up by the time the use rolls around).
Unfortunately I've lost track of the program that repro'd this issue.
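For illustration, the 64-bit + 32-bit split described above could be sketched as follows. This is standalone C++, not nv50_ir code: SpillAccess and splitSpill are hypothetical names, and the sketch assumes that each access must be naturally aligned and that 4, 8 and 16 bytes are the only legal access sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One memory access: byte offset into local memory and access size in bytes.
struct SpillAccess {
    uint32_t offset;
    uint32_t size;
};

// Hypothetical helper (not part of nv50_ir): break a spill of `size` bytes at
// `offset` into accesses of 16, 8 or 4 bytes.
// Assumption: every access must be naturally aligned to its own size.
std::vector<SpillAccess> splitSpill(uint32_t offset, uint32_t size)
{
    // Registers are 32 bits wide, so sizes and offsets are multiples of 4.
    assert(size % 4 == 0 && offset % 4 == 0);

    std::vector<SpillAccess> parts;
    while (size > 0) {
        // Pick the widest access that fits in the remaining bytes and is
        // aligned at the current offset; this always bottoms out at 4.
        uint32_t chunk = 16;
        while (chunk > size || offset % chunk != 0)
            chunk /= 2;
        parts.push_back({offset, chunk});
        offset += chunk;
        size -= chunk;
    }
    return parts;
}
```

With these assumptions, splitSpill(0, 12) yields an 8-byte access at offset 0 followed by a 4-byte access at offset 8, i.e. a 64-bit and a 32-bit store rather than a single b96 one. Sizes of 8 and 16 bytes at aligned offsets come back as a single access.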
Created attachment 115607: shader causing spill failure
Compile with -a e4 to get the failure. Works on GK110 since that has 256 registers.
This is what ultimately causes the failure at emit time:
140: texfetch 2D $r0 $s0 f32 $r48t $r48q (8)
141: texbar - # $r48t (8)
142: st b96 # l[0x0] $r48t (8)
codegen/nv50_ir_emit_nvc0.cpp:1659:emitLoadStoreType: Assertion `!"invalid type"' failed.
This is a preliminary patch, which I wrote a while back, to resolve the issue:
Note that it needs a counterpart unspill variant too (although that's not hit by this shader). And ideally it'd be smart enough to go for a 64 + 32 store rather than 3x 32, but that's probably too much -- split then re-merge. (Or can split take unevenly sized defs? That seems like asking for trouble.)
I've pushed out a version of this patch. Should be included in 11.0.x and the upcoming 11.1 release.