I tried it on two computers.

Linux (none) 4.20.0-rc1+ #8 SMP PREEMPT Tue Nov 20 00:24:49 CET 2018 x86_64 AMD Athlon PRO 200GE w/ Radeon Vega Graphics AuthenticAMD GNU/Linux

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org (0x1002)
    Device: AMD RAVEN (DRM 3.27.0, 4.20.0-rc1+, LLVM 7.0.0) (0x15dd)
    Version: 18.2.5

[ 80.221112] amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:32 vmid:2 pasid:32768, for process roles pid 358 thread roles:cs0 pid 359)
[ 80.221116] amdgpu 0000:38:00.0: in page starting at address 0x0000800000a94000 from 27
[ 80.221118] amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00240C40

Other computer:

Linux amd1.blue.org 4.19.2 #1 SMP PREEMPT Tue Nov 20 21:41:52 CET 2018 x86_64 AMD Ryzen 7 1700X Eight-Core Processor AuthenticAMD GNU/Linux

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org (0x1002)
    Device: AMD Radeon (TM) RX 460 Graphics (POLARIS11, DRM 3.27.0, 4.19.2, LLVM 7.0.0) (0x67ef)
    Version: 18.2.5

[ 1253.329906] amdgpu 0000:0e:00.0: GPU fault detected: 147 0x09004802 for process roles pid 1119 thread roles:cs0 pid 1120
[ 1253.329910] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0000EB20
[ 1253.329911] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C048002
[ 1253.329914] amdgpu 0000:0e:00.0: VM fault (0x02, vmid 6, pasid 32769) at page 60192, read from 'TC0' (0x54433000) (72)

Is this an LLVM or a Mesa issue? I also tried an older kernel, 4.16, with the same result. What reports do you need?
Created attachment 142535 umr dump
Created attachment 142536 gallium dump t1
Created attachment 142537 gallium dump t0
Created attachment 142538 trace events amdgpu
Attached logs.

[ 332.004841] amdgpu 0000:0e:00.0: GPU fault detected: 147 0x0f800802 for process roles pid 1043 thread roles:cs0 pid 1044
[ 332.004844] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000EA1F0
[ 332.004845] amdgpu 0000:0e:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x04008002
[ 332.004848] amdgpu 0000:0e:00.0: VM fault (0x02, vmid 2, pasid 32769) at page 958960, read from 'TC2' (0x54433200) (8)
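The fault lines above give the same information in two forms: the page number printed in decimal (958960) is the FAULT_ADDR register value (0x000EA1F0), and the hex word after the client name holds that name as ASCII bytes. Below is a minimal sketch of decoding both. It is not amdgpu code: the helper names are hypothetical, and the 4 KiB page size is an assumption about the unit the driver reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the reported fault page is in units of 4 KiB GPU pages. */
#define GPU_PAGE_SHIFT 12

/* Byte address of the start of the faulting page. */
static uint64_t fault_page_to_addr(uint64_t page)
{
    return page << GPU_PAGE_SHIFT;
}

/* Unpack the 32-bit memory client tag (for example 0x54433200) into a
 * NUL-terminated string, most significant byte first.
 * out must have room for 5 bytes. */
static void mc_client_to_str(uint32_t tag, char *out)
{
    assert(out != NULL);
    out[0] = (char)((tag >> 24) & 0xFF);
    out[1] = (char)((tag >> 16) & 0xFF);
    out[2] = (char)((tag >> 8) & 0xFF);
    out[3] = (char)(tag & 0xFF);
    out[4] = '\0';
}
```

With the values from the log, fault_page_to_addr(0xEA1F0) gives 0xEA1F0000, and mc_client_to_str(0x54433200, buf) leaves "TC2" in buf.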
Created attachment 142598 another gallium dump

Another dump. I also tried with the proprietary NVIDIA drivers; it works fine there.
It looks like sctx->bindless_descriptors->gpu_address is not accessible by the GPU. 0x02e00000 is not in the buffer list.

c0017600 SET_SH_REG:
0000014d
02e00000 SPI_SHADER_USER_DATA_COMMON_1 <- 0x02e00000

[ 174.469016] amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:32 vmid:2 pasid:32769, for process roles pid 398 thread roles:cs0 pid 399)
[ 174.469021] amdgpu 0000:38:00.0: in page starting at address 0x0000800002e04000 from 27
[ 174.469023] amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00240C40
[ 184.763074] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=583, emitted seq=585
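The check described above (is the descriptor address covered by any buffer in the submission's buffer list?) can be sketched as follows. This is only an illustration: struct bo_range and both helpers are hypothetical, not Mesa or amdgpu types. The register dump shows only the low 32 bits of the pointer (0x02e00000); assuming the upper half is 0x8000, as in the fault address, the faulting page at 0x0000800002e04000 would sit 0x4000 bytes past that pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one buffer in a submission's buffer list. */
struct bo_range {
    uint64_t va;   /* GPU virtual address of the first byte */
    uint64_t size; /* size in bytes */
};

/* Join the two 32-bit halves of a GPU pointer. */
static uint64_t make_va(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Return true if addr lies inside any range of the list. */
static bool va_in_buffer_list(const struct bo_range *list, size_t count,
                              uint64_t addr)
{
    assert(list != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        /* The subtraction form avoids overflow of va + size. */
        if (addr >= list[i].va && addr - list[i].va < list[i].size)
            return true;
    }
    return false;
}
```

If va_in_buffer_list() returns false for an address the shader is about to read, the kernel has no reason to keep that memory mapped for the job, which would match the page fault seen here.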
This is a bug that occurs when using bindless textures together with a framebuffer that is also resident as a bindless texture. The fault no longer occurs if I comment out the si_upload_bindless_descriptor function.

    radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 2 + num_dwords, 0));
    radeon_emit(cs, S_370_DST_SEL(V_370_TC_L2) |
                    S_370_WR_CONFIRM(1) |
                    S_370_ENGINE_SEL(V_370_ME));
    radeon_emit(cs, va);
    radeon_emit(cs, va >> 32);
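The count argument 2 + num_dwords in the excerpt follows from how a type-3 packet header counts its body. Here is a minimal sketch, assuming the header layout as I understand it from Mesa's PKT3 macro (type in bits 31:30, count in bits 29:16, opcode in bits 15:8, predicate in bit 0, with count equal to the number of body dwords minus one). The helper names are hypothetical and this is not the Mesa implementation.

```c
#include <stdint.h>

/* Build a type-3 packet header (assumed layout, see the note above). */
static uint32_t pkt3_header(uint32_t opcode, uint32_t count, uint32_t predicate)
{
    return (3u << 30) |
           ((count & 0x3FFFu) << 16) |
           ((opcode & 0xFFu) << 8) |
           (predicate & 0x1u);
}

/* Extract the fields again, e.g. when reading a command stream dump. */
static uint32_t pkt3_count(uint32_t header)
{
    return (header >> 16) & 0x3FFFu;
}

static uint32_t pkt3_opcode(uint32_t header)
{
    return (header >> 8) & 0xFFu;
}

/* Number of dwords that follow the header. */
static uint32_t pkt3_body_dwords(uint32_t header)
{
    return pkt3_count(header) + 1u;
}

/* The WRITE_DATA body in the excerpt is: one control dword, the low and
 * high halves of va, then the payload. That is 3 + num_dwords dwords,
 * so the count field is 2 + num_dwords. */
static uint32_t write_data_count(uint32_t num_dwords)
{
    return 3u + num_dwords - 1u;
}
```

As a cross-check against the dump in the earlier comment, the SET_SH_REG header c0017600 decodes under this layout to opcode 0x76 with count 1, that is, two body dwords, which matches the two dwords listed after it (0000014d and 02e00000).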