Bug 59588

Summary: llvm rv790 etqw gpu lock since r600g/llvm: tgsi to llvm emits store.swizzle intrinsic for vs/fs output
Product: Mesa
Component: Drivers/Gallium/r600
Version: git
Hardware: x86 (IA32)
OS: Linux (All)
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Reporter: Andy Furniss <adf.lists>
Assignee: Default DRI bug account <dri-devel>
CC: vljn
Attachments:
compressed etqw shader dump while getting gpu lock.
Disable llvm fs
compressed etqw shaders working with patch
Add dummy vs export

Description Andy Furniss 2013-01-19 16:50:32 UTC
PCIe HD4890 (RV790), drm-fixes kernel, current tstellar llvm.

Since Mesa commit

commit 3b14ce2cafea03de1b39e44cc8c37439b031e3eb
Author: Vincent Lejeune <vljn@ovi.com>
Date:   Fri Jan 11 19:48:29 2013 +0100

    r600g/llvm: tgsi to llvm emits store.swizzle intrinsic for vs/fs output

Enemy Territory: Quake Wars locks up the GPU as soon as it loads.

The lockup doesn't happen with R600_LLVM=0, and it doesn't happen on the commit before this one.

Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000041577 last fence id 0x0000000000041575)
Jan 19 16:34:11 nf7 kernel: [drm] Disabling audio support
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: Saved 89 dwords of commands on ring 0.
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: GPU softreset: 0x00000007
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0xA23304A0
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00000002
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200000C0
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x01000000
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00011002
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00028584
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80838647
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0x00003028
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00000002
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200000C0
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jan 19 16:34:11 nf7 kernel: [drm] probing gen 2 caps for device 1022:9603 = 2/0
Jan 19 16:34:11 nf7 kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
Jan 19 16:34:11 nf7 kernel: [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: WB enabled
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xff95dc00
Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xff95dc0c
Jan 19 16:34:11 nf7 kernel: [drm] ring test on 0 succeeded in 1 usecs
Jan 19 16:34:11 nf7 kernel: [drm] ring test on 3 succeeded in 1 usecs
Jan 19 16:34:11 nf7 kernel: [drm] Enabling audio support
Jan 19 16:34:11 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:11 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:11 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:11 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:12 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:12 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:12 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:12 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:12 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:12 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:13 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:13 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:13 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:13 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:13 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:13 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:14 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:337 HDMI: ELD buf size is 0, force 128
Jan 19 16:34:14 nf7 kernel: ALSA sound/pci/hda/hda_eld.c:356 HDMI: invalid ELD data byte 0
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000041579 last fence id 0x0000000000041577)
Jan 19 16:34:22 nf7 kernel: [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
Jan 19 16:34:22 nf7 kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: ib ring test failed (-35).
Jan 19 16:34:22 nf7 kernel: [drm] Disabling audio support
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: GPU softreset: 0x00000007
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0xA23304A0
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00000002
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200000C0
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x01000000
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00011002
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00028584
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80838647
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0x00003028
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00000002
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200000C0
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jan 19 16:34:22 nf7 kernel: [drm] probing gen 2 caps for device 1022:9603 = 2/0
Jan 19 16:34:22 nf7 kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
Jan 19 16:34:22 nf7 kernel: [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: WB enabled
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xff95dc00
Jan 19 16:34:22 nf7 kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xff95dc0c
Jan 19 16:34:22 nf7 kernel: [drm] ring test on 0 succeeded in 1 usecs
Jan 19 16:34:22 nf7 kernel: [drm] ring test on 3 succeeded in 1 usecs
Jan 19 16:34:22 nf7 kernel: [drm] Enabling audio support
Jan 19 16:34:22 nf7 kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Jan 19 16:34:22 nf7 kernel: [drm] ib test on ring 3 succeeded in 1 usecs
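To compare the register state before and after the soft reset, the dump lines in a log like the one above can be pulled out with a short script. This is only a sketch: the function name `parse_registers` and the regular expression are assumptions written for this log layout, not part of any radeon tooling. The sample lines are copied from the log above.

```python
import re
from typing import List, Tuple

# Matches e.g. "R_008010_GRBM_STATUS      = 0xA23304A0" (spaces around "=" optional).
REG_RE = re.compile(r"(R_[0-9A-F]{6}_\w+)\s*=\s*(0x[0-9A-Fa-f]+)")


def parse_registers(lines: List[str]) -> List[Tuple[str, int]]:
    """Return (register name, value) pairs in the order they appear."""
    found = []
    for line in lines:
        match = REG_RE.search(line)
        if match:
            found.append((match.group(1), int(match.group(2), 16)))
    return found


sample = [
    "Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0xA23304A0",
    "Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE",
    "Jan 19 16:34:11 nf7 kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0x00003028",
]

for name, value in parse_registers(sample):
    print(f"{name} = {value:#010x}")
```

The first and last entries are the same register read before and after the reset, so pairing them up shows which status bits the reset cleared.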
Comment 1 vincent 2013-01-21 13:33:41 UTC
Can you run etqw with R600_DUMP_SHADERS=1 and post the output, please?
Comment 2 Andy Furniss 2013-01-21 16:37:02 UTC
Created attachment 73391
compressed etqw shader dump while getting gpu lock.
Comment 3 vincent 2013-01-21 21:45:06 UTC
As far as I can tell, all shaders end with an export instruction with the EndOfProgram bit set. I suspect the issue involves the number of color buffer exports.

Can you apply this patch and report whether the game still locks up the GPU?
Comment 4 vincent 2013-01-21 21:46:27 UTC
Created attachment 73411
Disable llvm fs
Comment 5 Andy Furniss 2013-01-21 23:24:12 UTC
(In reply to comment #3)
> As far as I can tell, all shaders end with an export instruction, with
> EndOfProgram bit set. I suspect an issue with number of color buffer export
> involved.
> 
> Can you apply this patch and report if the game still locks the gpu ?

The game runs OK with the patch.
Comment 6 vincent 2013-01-22 18:21:30 UTC
Can you send a log with the same env var set, so that I can diff the working and non-working logs, please?
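One way to compare two shader dumps is a unified diff. The sketch below uses Python's standard `difflib`; the function name `diff_dumps`, the file labels, and the sample lines are placeholders made up for illustration and are not taken from the attached dumps.

```python
import difflib
from typing import List


def diff_dumps(broken: List[str], working: List[str]) -> List[str]:
    """Return unified-diff lines between two dumps given as lists of lines."""
    return list(
        difflib.unified_diff(
            broken,
            working,
            fromfile="lockup.log",    # placeholder label for the dump that locks up
            tofile="working.log",     # placeholder label for the dump with the patch
            lineterm="",
        )
    )


# Placeholder content standing in for real dump lines.
broken = ["line shared by both dumps", "line only in the lockup dump"]
working = ["line shared by both dumps", "line only in the working dump"]

for line in diff_dumps(broken, working):
    print(line)
```

Lines prefixed with `-` appear only in the first dump and lines prefixed with `+` only in the second.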
Comment 7 Andy Furniss 2013-01-23 00:05:01 UTC
Created attachment 73485
compressed etqw shaders working with patch
Comment 8 Andy Furniss 2013-01-23 00:11:23 UTC
The patch also fixes the minor issue I reported with some mesa demos.

https://bugs.freedesktop.org/show_bug.cgi?id=58150
Comment 9 Alex Deucher 2013-01-23 13:45:32 UTC
Regarding the conversation on IRC, the vertex shader has to export at least one generic param (not counting special exports like position).  So if the vertex shader doesn't export any params, you need a dummy one.
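The rule described here can be modelled in a few lines. This is an illustrative sketch only, not the Mesa code: the `Export` type, the `"pos"` and `"param"` labels, the slot number, and `ensure_param_export` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Export:
    kind: str    # "pos" for special exports such as position, "param" for generic params
    index: int   # slot within its kind


def ensure_param_export(exports: List[Export]) -> List[Export]:
    """Return a copy of the export list with at least one generic param export."""
    if any(e.kind == "param" for e in exports):
        return list(exports)
    # No generic param is exported, so append a dummy one (slot 0 is an assumption).
    return list(exports) + [Export(kind="param", index=0)]


# A vertex shader that only writes position gets a dummy param export added.
position_only = [Export("pos", 0)]
print(ensure_param_export(position_only))
```

A shader that already exports a generic param is returned unchanged, so the check is safe to apply to every vertex shader.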
Comment 10 vincent 2013-01-23 16:03:07 UTC
Can you test with this new patch? (Remove the previous one first.)
It adds a dummy export to the vs outputs.
Comment 11 vincent 2013-01-23 16:03:36 UTC
Created attachment 73532
Add dummy vs export
Comment 12 Andy Furniss 2013-01-23 21:23:09 UTC
(In reply to comment #10)
> Can you test with this new patch ? (Remove the previous one)
> It adds dummy export to vs outputs

It works OK with the new patch.
