- FullHD monitor (through HDMI KVM)
- HadesCanyon KBL i7-8809G ([AMD/ATI] Vega [Radeon RX Vega M] (rev c0))
- Ubuntu 18.04
- drm-tip git kernel v4.20-rc4 (i.e. kernel.org v4.20-rc4 kernel + latest drm code from yesterday)
- Mesa git (c120dbfe4d)
- X server git version
- Proprietary GfxBench v5-GOLD2: http://gfxbench.com
* bin/testfw_app --gfx vulkan --gl_api vulkan --width 1920 --height 1080 --fullscreen 1 --test_id vulkan_5_normal
* Expected: works fine, like the Aztec Ruins GL version and Sascha Willems' Vulkan tests, with no GPU hangs
* Actual: right after the test starts, the following appears in dmesg:
[ 3057.480868] amdgpu 0000:01:00.0: GPU fault detected: 146 0x0fa0880c for process testfw_app pid 2995 thread testfw_app pid 2997
[ 3057.480870] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x001001F4
[ 3057.480871] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C08800C
[ 3057.480873] amdgpu 0000:01:00.0: VM fault (0x0c, vmid 6, pasid 32772) at page 1049076, read from 'TC4' (0x54433400) (136)
[ 3057.480879] amdgpu 0000:01:00.0: GPU fault detected: 146 0x0fa0840c for process testfw_app pid 2995 thread testfw_app pid 2997
[ 3057.480880] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x001001FD
[ 3057.480881] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C08400C
[ 3057.480883] amdgpu 0000:01:00.0: VM fault (0x0c, vmid 6, pasid 32772) at page 1049085, read from 'TC5' (0x54433500) (132)
[ 3057.480944] amdgpu 0000:01:00.0: GPU fault detected: 146 0x0fa9080c for process testfw_app pid 2995 thread testfw_app pid 2997
[ 3057.480945] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 3057.480946] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C18802C
[ 3057.480947] amdgpu 0000:01:00.0: VM fault (0x2c, vmid 6, pasid 32772) at page 0, read from 'TC0' (0x54433000) (392)
[ 3067.564630] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=53811, emitted seq=53814
[ 3067.564633] [drm] GPU recovery disabled.
After this, no other GPU operations seem to work properly. There are also other things that don't work properly in automated testing at this point, but I'm not sure whether they're related.
No idea whether this is a regression, as I only checked it now. There are some issues with this particular test also on Intel (see e.g. bug 104634, bug 105276), so the problem could be in common code. No idea whether this is related to GL bug 108898 on the same device.
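For anyone cross-checking the fault lines in the dmesg output above, here is a minimal Python sketch that decodes the two raw registers into the values the kernel prints on the "VM fault" line. The bit positions in `decode_status` are an assumption: they were inferred by matching the register values against the decoded numbers in this log, not taken from the amdgpu register headers. The function names are made up for illustration.

```python
import re

# Patterns for the two raw register lines printed by amdgpu in the log above.
ADDR_RE = re.compile(r"VM_CONTEXT1_PROTECTION_FAULT_ADDR\s+0x([0-9A-Fa-f]+)")
STATUS_RE = re.compile(r"VM_CONTEXT1_PROTECTION_FAULT_STATUS\s+0x([0-9A-Fa-f]+)")


def decode_status(status):
    """Split the fault status register into its fields.

    ASSUMPTION: field positions are inferred from the values in this log
    (e.g. 0x0C08800C -> protections 0x0c, vmid 6, client 136, read).
    """
    return {
        "protections": status & 0xFF,
        "client_id": (status >> 12) & 0x1FF,
        "write": bool((status >> 24) & 0x1),
        "vmid": (status >> 25) & 0xF,
    }


def decode_client(tag):
    """Turn the packed 32-bit client tag (e.g. 0x54433400) into ASCII."""
    return tag.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


def decode_log(text):
    """Pair up ADDR/STATUS lines in order and decode each fault."""
    addrs = [int(m, 16) for m in ADDR_RE.findall(text)]
    statuses = [int(m, 16) for m in STATUS_RE.findall(text)]
    # The ADDR register value equals the page number printed by the kernel.
    return [dict(page=a, **decode_status(s)) for a, s in zip(addrs, statuses)]


if __name__ == "__main__":
    sample = (
        "amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x001001F4\n"
        "amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C08800C\n"
    )
    for fault in decode_log(sample):
        print(fault)
    print(decode_client(0x54433400))
```

All three faults are reads in VMID 6, the first two at neighbouring pages (1049076 and 1049085) and the last at page 0.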
Yes, this also messes up other things, not just 3D: after this issue, a script using pycurl to upload test results just sits in poll() instead of working, so I think something on the kernel side gets corrupted.
The link is dead, if you have the demo can you upload it somewhere?
(In reply to Samuel Pitoiset from comment #2)
> The link is dead, if you have the demo can you upload it somewhere?
It still worked when I filed this (and has worked for years before). You can still get the page from Google cache:
As you can see, there isn't yet a public Linux version of GfxBench v5, only Android, iOS, MacOS and Windows versions.
And I naturally can't provide the proprietary version.
Doesn't Valve have licenses to industry standard 3D benchmarks (of which GfxBench is the main one on mobile and, as a result, nowadays important also on desktop)?
If not, you could try using the Windows version with Wine when the site works again. If the Windows version supports Vulkan and Wine doesn't mangle its API calls for Linux, you might be able to trigger the issue (going through DX -> DXVK probably isn't good enough).
Or if there's some Linux Android container that passes Vulkan calls through, you could try the Android version:
Here's some extra info on the Aztec Ruins benchmark: