Setup:
- FullHD monitor (through HDMI KVM)
- HadesCanyon KBL i7-8809G ([AMD/ATI] Vega [Radeon RX Vega M] (rev c0))
- Ubuntu 18.04
- drm-tip git kernel v4.20-rc4 (i.e. kernel.org v4.20-rc4 kernel + latest drm code from yesterday)
- Mesa git (c120dbfe4d) with AMD VEGAM renderer
- X server git version
- Proprietary GfxBench v5, but public GfxBench v4 should have same tests: http://gfxbench.com

Test-cases:
* Manhattan 3.0 offscreen:
  bin/testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan_off
* Manhattan 3.1 onscreen:
  bin/testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan31

Expected outcome:
* No GPU timeouts

Actual outcome:
* 1 out of 3 runs gives in dmesg:
  [ 2817.689624] [drm:drm_sched_job_timedout [gpu_sched]] *ERROR* ring gfx timeout, but soft recovered

NOTE: These were happening already when we started testing this machine in mid October, with Mesa 18cc65edf8480 & drm-tip kernel v4.19-rc8.
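For anyone trying to reproduce the "1 out of 3 runs" rate, here is a minimal Python sketch of a repeat-and-check loop. It is not the harness used for this report. The command-line flags are copied from the test-cases above; the helper names (`build_command`, `count_timeouts`, `run_iterations`) are made up for illustration, and the sketch assumes that `dmesg` is readable by the current user and that the kernel ring buffer does not wrap during a run.

```python
import subprocess

# Substring taken from the dmesg line quoted in the report.
TIMEOUT_MARKER = "ring gfx timeout"


def build_command(test_id, width=1920, height=1080, binary="bin/testfw_app"):
    """Return the argv list for one GfxBench run, using the flags from the report."""
    return [
        binary,
        "--gfx", "glfw",
        "--gl_api", "desktop_core",
        "--width", str(width),
        "--height", str(height),
        "--fullscreen", "1",
        "--test_id", test_id,
    ]


def read_dmesg():
    """Read the kernel log (assumption: the current user is allowed to run dmesg)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


def count_timeouts(log_text):
    """Count kernel log lines that mention a gfx ring timeout."""
    return sum(1 for line in log_text.splitlines() if TIMEOUT_MARKER in line)


def run_iterations(test_id, runs=3, run=subprocess.run, read_log=read_dmesg):
    """Run the test `runs` times.

    Returns one boolean per run: True if the number of timeout lines in the
    kernel log grew while that run was executing.
    """
    results = []
    for _ in range(runs):
        before = count_timeouts(read_log())
        run(build_command(test_id))
        after = count_timeouts(read_log())
        results.append(after > before)
    return results
```

Calling `run_iterations("gl_manhattan_off")` from the GfxBench install directory would then return something like a list of three booleans, one per run. The `run` and `read_log` parameters exist only so the loop can be exercised without the benchmark binary.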
Hangs are still happening with the latest Mesa (a203eaa4f4fb) and drm-tip kernel (v5.0-rc4) git versions:

[ 2776.782754] Iteration 3/3: bin/testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan_off
[ 2845.656793] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[ 2845.836983] Iteration 1/3: testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan31
[ 2915.288863] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[ 2915.383696] Iteration 2/3: testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan31
[ 2980.104777] Iteration 3/3: bin/testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan31
[ 3044.823739] Iteration 1/3: bin/testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan31_off
[ 3113.432727] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[ 3113.528273] Iteration 2/3: bin/testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan31_off
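A log like the one above can be summarised per test with a few lines of Python. This is only a sketch: the function name `timeouts_per_test` is hypothetical, and it assumes that a timeout line belongs to the most recent "Iteration N/M" marker before it, which matches the timestamps shown here but is not guaranteed in general.

```python
import re

# Patterns based on the dmesg excerpt in this comment.
ITERATION_RE = re.compile(r"Iteration (\d+)/(\d+): .*--test_id (\S+)")
TIMEOUT_RE = re.compile(r"\*ERROR\* ring gfx timeout")


def timeouts_per_test(lines):
    """Map test_id -> [runs_started, runs_with_timeout].

    A timeout is attributed to the latest iteration marker seen before it
    (assumption), and each run is counted at most once.
    """
    stats = {}
    current = None      # test_id of the run currently in progress
    flagged = False     # whether the current run already counted a timeout
    for line in lines:
        match = ITERATION_RE.search(line)
        if match:
            current = match.group(3)
            stats.setdefault(current, [0, 0])[0] += 1
            flagged = False
        elif TIMEOUT_RE.search(line) and current is not None and not flagged:
            stats[current][1] += 1
            flagged = True
    return stats
```

Fed the nine lines quoted above, it reports one timed-out run for each of `gl_manhattan_off`, `gl_manhattan31` and `gl_manhattan31_off`. The excerpt starts mid-sequence, so the started-run counts cover only what is visible in the log.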
Hangs are still happening with the latest Mesa (43f40dc7cb234e) and drm-tip kernel (v5.0) git versions in Manhattan test offscreen versions.
(In reply to Eero Tamminen from comment #2)
> Hangs are still happening with the latest Mesa (43f40dc7cb234e) and drm-tip
> kernel (v5.0) git versions in Manhattan test offscreen versions.

Hangs still continue with latest Mesa & drm-tip kernel.

Public version of GfxBench v4 has these same tests:
https://gfxbench.com/result.jsp?benchmark=gfx40
https://gfxbench.com/linux-download/

(It just doesn't support automating their running from command line.)
Any updates on this (VegaM) bug? These recoverable hangs are still happening with git versions of kernel, Mesa and linux-firmware. Unlike with the hard hang bug 108900, this test-case is freely available.
Sometimes there's also another error message, about fences:

[ 5813.444709] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[ 5818.564819] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
These still happen with latest git version of kernel, Mesa etc.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1343.