Summary: | [r600g] GPU hang in 'gsraytrace' - NI/Turks (6670) | ||
---|---|---|---|
Product: | Mesa | Reporter: | Dieter Nützel <Dieter> |
Component: | Drivers/Gallium/r600 | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | critical | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=93706 | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
dmesg-4.2.0-1.gefc468a-desktop.log
gsraytrace_31672_00000000 dmesg-4.3.0-17.g6a48ac7-default.log glretrace_2862_00000000 apitrace gsraytrace.trace.xz |
Description
Dieter Nützel
2015-09-03 13:07:17 UTC
Even with R600_DEBUG=nosb

mesa-demos/glsl> ./gsraytrace
Gallium debugger active. The hang detection timout is 1000 ms.
ATTENTION: default value of option vblank_mode overridden by environment.
GL_RENDERER = Gallium 0.4 on AMD TURKS (DRM 2.43.0, LLVM 3.8.0)
ESC = exit demo
left mouse + drag = rotate camera
dd: GPU hang detected!
dd: Aborting the process...
Abort

OpenGL renderer string: Gallium 0.4 on AMD TURKS (DRM 2.43.0, LLVM 3.8.0)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 11.1.0-devel (git-6e37304)

I'll attach
dmesg-4.2.0-1.gefc468a-desktop.log
gsraytrace_31672_00000000 (Marek's GREAT Gallium debugger log)

[ 5676.604919] radeon 0000:01:00.0: ring 0 stalled for more than 31033msec
[ 5676.604988] radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000a9cea last fence id 0x00000000000a9d18 on ring 0)
[ 5676.647958] radeon 0000:01:00.0: Saved 1479 dwords of commands on ring 0.
[ 5676.647983] radeon 0000:01:00.0: GPU softreset: 0x0000000D
[ 5676.647986] radeon 0000:01:00.0: GRBM_STATUS = 0xF7631028
[ 5676.647988] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0xF8000002
[ 5676.647990] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007
[ 5676.647992] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
[ 5676.647994] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[ 5676.647996] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 5676.647998] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x400C0000
[ 5676.648000] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00048002
[ 5676.648002] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80268647
[ 5676.648004] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44483106
[ 5676.664891] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
[ 5676.664945] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100100
[ 5676.666105] radeon 0000:01:00.0: GRBM_STATUS = 0x00003828
[ 5676.666107] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007
[ 5676.666109] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007
[ 5676.666111] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
[ 5676.666113] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[ 5676.666115] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 5676.666117] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
[ 5676.666119] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
[ 5676.666121] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
[ 5676.666124] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 5676.666158] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[ 5676.689479] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[ 5676.692387] [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
[ 5676.692506] radeon 0000:01:00.0: WB enabled
[ 5676.692509] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88003723cc00
[ 5676.692510] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88003723cc0c
[ 5676.694278] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90003432118
[ 5676.710865] [drm] ring test on 0 succeeded in 2 usecs
[ 5676.710877] [drm] ring test on 3 succeeded in 7 usecs
[ 5676.887988] [drm] ring test on 5 succeeded in 2 usecs
[ 5676.887997] [drm] UVD initialized successfully.
[ 5676.935630] [drm] ib test on ring 0 succeeded in 0 usecs
[ 5676.935687] [drm] ib test on ring 3 succeeded in 0 usecs
[ 5677.587054] [drm] ib test on ring 5 succeeded

Created attachment 118067 [details]
dmesg-4.2.0-1.gefc468a-desktop.log
Created attachment 118068 [details]
gsraytrace_31672_00000000
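
For anyone comparing these lockup logs by hand: the radeon driver prints one block of status registers before the soft reset and another one after it. The following is only a rough sketch of how the two blocks could be split and diffed with Python. The names `REG_RE`, `parse_register_dumps`, and `changed_registers` are made up for this illustration, and the rule "a repeated register name starts a new snapshot" is an assumption about the log layout, not something the driver documents. The sample values are copied from the dmesg excerpt above.

```python
import re

# Matches register lines such as
#   "radeon 0000:01:00.0: GRBM_STATUS = 0xF7631028"
#   "radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B"
# Works on one-line-per-entry logs and on logs flattened into a single line.
REG_RE = re.compile(r"radeon [0-9a-f:.]+: +(\w+) *= *0x([0-9A-Fa-f]+)")


def parse_register_dumps(log):
    """Return a list of {register_name: value} snapshots found in `log`.

    Assumption: a register name showing up a second time marks the start
    of the next snapshot (pre-reset dump vs. post-reset dump).
    """
    dumps = [{}]
    for name, value in REG_RE.findall(log):
        if name in dumps[-1]:
            dumps.append({})
        dumps[-1][name] = int(value, 16)
    return dumps


def changed_registers(before, after):
    """Map each register present in both snapshots to (old, new) if it changed."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# A few lines taken from the first hang quoted in this report.
SAMPLE = """
[ 5676.647983] radeon 0000:01:00.0: GPU softreset: 0x0000000D
[ 5676.647986] radeon 0000:01:00.0: GRBM_STATUS = 0xF7631028
[ 5676.647992] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
[ 5676.648002] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80268647
[ 5676.664891] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
[ 5676.666105] radeon 0000:01:00.0: GRBM_STATUS = 0x00003828
[ 5676.666111] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
[ 5676.666121] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
"""

if __name__ == "__main__":
    before, after = parse_register_dumps(SAMPLE)[:2]
    for name, (old, new) in changed_registers(before, after).items():
        print(f"{name}: 0x{old:08X} -> 0x{new:08X}")
```

Run against a full dmesg file, this lists only the registers the reset actually changed, which makes it easier to see that SRBM_STATUS is identical before and after while the CP and GRBM status registers are cleared.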
dmesg snipped from another hang:

[ 1361.853214] radeon 0000:01:00.0: ring 0 stalled for more than 10099msec
[ 1361.853222] radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000b21b6 last fence id 0x00000000000b21ea on ring 0)
[ 1361.873984] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
[ 1361.874010] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-35).
[ 1361.921903] radeon 0000:01:00.0: GPU softreset: 0x00000009
[ 1361.921906] radeon 0000:01:00.0: GRBM_STATUS = 0xB2737828
[ 1361.921909] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x1E000007
[ 1361.921911] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007
[ 1361.921913] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
[ 1361.921915] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[ 1361.921917] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 1361.921919] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x400C0000
[ 1361.921921] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00048006
[ 1361.921923] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80268647
[ 1361.921925] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 1361.922215] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
[ 1361.922270] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
[ 1361.923429] radeon 0000:01:00.0: GRBM_STATUS = 0x00003828
[ 1361.923431] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007
[ 1361.923433] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007
[ 1361.923435] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
[ 1361.923437] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[ 1361.923439] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 1361.923441] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
[ 1361.923443] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
[ 1361.923445] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
[ 1361.923448] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 1361.923480] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[ 1361.946777] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[ 1361.949613] [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
[ 1361.949732] radeon 0000:01:00.0: WB enabled
[ 1361.949734] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88036f42ac00
[ 1361.949736] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88036f42ac0c
[ 1361.951504] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90003432118
[ 1361.968087] [drm] ring test on 0 succeeded in 2 usecs
[ 1361.968099] [drm] ring test on 3 succeeded in 7 usecs
[ 1362.145206] [drm] ring test on 5 succeeded in 2 usecs
[ 1362.145218] [drm] UVD initialized successfully.
[ 1362.165411] [drm] ib test on ring 0 succeeded in 0 usecs
[ 1362.165468] [drm] ib test on ring 3 succeeded in 0 usecs
[ 1362.817284] [drm] ib test on ring 5 succeeded

Same bug with an AMD Radeon HD4650 (PCIe) and the r600 driver, Arch Linux 64-bit:

OpenGL renderer string: Gallium 0.4 on AMD RV730 (DRM 2.43.0, LLVM 3.6.2)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 11.0.2
OpenGL core profile shading language version string: 3.30
OpenGL version string: 3.0 Mesa 11.0.2
OpenGL shading language version string: 1.30
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 11.0.2
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00

Created attachment 120157 [details]
dmesg-4.3.0-17.g6a48ac7-default.log
New log files.
apitrace, too.
Hope this gets us further.
Created attachment 120158 [details]
glretrace_2862_00000000
Created attachment 120159 [details]
apitrace gsraytrace.trace.xz
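
A small aside on the two "GPU lockup" lines quoted in the dmesg excerpts above: each reports a "current" and a "last" fence id for ring 0. Assuming "current" is the last fence the GPU signalled and "last" is the last one the driver emitted (my reading of the message, not confirmed anywhere in this report), the difference is the number of fences still outstanding when the hang was detected. The sketch below does that arithmetic in Python; `LOCKUP_RE` and `pending_fences` are names invented for this example.

```python
import re

# Pattern for the radeon lockup message as it appears in this report.
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id 0x([0-9a-fA-F]+) "
    r"last fence id 0x([0-9a-fA-F]+) on ring (\d+)\)"
)


def pending_fences(line):
    """Return (ring, last - current) for a lockup line, or None if no match.

    Assumption: 'current' = last signalled fence, 'last' = last emitted fence,
    so the difference counts fences that never completed.
    """
    match = LOCKUP_RE.search(line)
    if match is None:
        return None
    current = int(match.group(1), 16)
    last = int(match.group(2), 16)
    return int(match.group(3)), last - current


# The two lockup lines quoted in this bug report.
HANG_1 = ("[ 5676.604988] radeon 0000:01:00.0: GPU lockup (current fence id "
          "0x00000000000a9cea last fence id 0x00000000000a9d18 on ring 0)")
HANG_2 = ("[ 1361.853222] radeon 0000:01:00.0: GPU lockup (current fence id "
          "0x00000000000b21b6 last fence id 0x00000000000b21ea on ring 0)")

if __name__ == "__main__":
    for line in (HANG_1, HANG_2):
        ring, pending = pending_fences(line)
        print(f"ring {ring}: {pending} fences outstanding")
```

Under that assumption the first hang left 46 fences outstanding on ring 0 and the second left 52, so in both cases the ring stopped with a similar amount of queued work behind the stuck submission.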
Hurray!
What an additional gift beside your impending new arrival, Dave ;-)

This is finally fixed by this:

commit 2239f3eaff5c72c4cb1d4a5be97feb4af3d08d25
Author: Dave Airlie <airlied@redhat.com>
Date: Mon Nov 30 15:48:22 2015 +1000

r600/shader: emit tessellation factors to GDS at end of TCS.

When we are finished the shader, we read back all the tess factors from LDS and write them to special global memory storage using GDS instructions.

This also handles adding NOP when GDS or ENDLOOP end the TCS.

Signed-off-by: Dave Airlie <airlied@redhat.com>

:040000 040000 101f51186ea311e90fa8423ee772f2b1076737bf b01929ff47ca5035660b2c84ca2fdeb6604549fa M src

Tomorrow, I'll test this on RV730 (AGP), too, and CLOSE both if all goes smoothly.
(For the rendering issues with R600_DEBUG=nosb I'll open a new ticket.)

Merry Christmas! And all the best for your family.

OK, is this SOLVED by 'accident'?
For the RV730 GPU hang (Bug 83319), the commit identified above does NOT solve the hang. (That bug is updated now.)

The observed issues with R600_DEBUG=nosb for all three 'raytrace' variants (vsraytrace/fsraytrace/gsraytrace) remain.
Latest: Mesa 11.2.0-devel (git-6470435)
I'll open a new ticket for this.