One more regression I forgot to file a bug about, so to mention it here: GPU faults with The Talos Principle or Serious Sam 3. Both produce a lot of these right upon starting a game, and they continue:
[ 823.723639] radeon 0000:00:01.0: GPU fault detected: 146 0x0006200c
[ 823.723649] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 823.723652] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0602000C
Mesa bisect points to:
860b658b97f859ee7d0dd076a8ac0332601ffa65 radeonsi: move clip plane constant buffer to RW buffers
Which GPU? Don't remember seeing those on Tonga.
The test is on a Kabini APU (I have Bonaire and Kaveri too, so I might test those as well), but someone on IRC already mentioned that it happens with amdgpu on Bonaire... so yes, if it does not happen on Tonga it might be CIK related.
Created attachment 125277
Likely fix - RW_BUFFERS pointer is not written for LS stage
Could you please try whether the attached patch fixes the problem for you?
No, the same faults still happen.
This works before 860b658b97f859ee7d0dd076a8ac0332601ffa65 So fixed.
I'm confused. You first wrote that 860b658b97f859ee7d0dd076a8ac0332601ffa65 is the commit which started the faults. Which one is it? Is Mesa master okay?
Why I closed this, I have no idea... please ignore Comment 4. Yes Nicolai, the bug is still there with current git of Mesa and LLVM.
(In reply to smoki from comment #2)
> Test is on Kabini APU (i have Bonaire and Kaveri so i might test that too),
> but someone on irc already mentioned it happens with amdgpu on Bonaire... so
> yup if it does not happen with Tonga it might be CIK related.
I have a Kaveri iGPU and Talos Principle. I can try to run Talos Principle on Kaveri later today (the machine needs to be rebooted to enable the iGPU).
Were you able to reproduce the issue on your Kaveri iGPU, or is it only reproducible on Kabini?
(In reply to Jan Ziak from comment #8)
> (In reply to smoki from comment #2)
> > Test is on Kabini APU (i have Bonaire and Kaveri so i might test that too),
> > but someone on irc already mentioned it happens with amdgpu on Bonaire... so
> > yup if it does not happen with Tonga it might be CIK related.
>
> I have Kaveri iGPU and Talos Principle. I can try to run Talos Principle on
> Kaveri later this day (machine needs to be rebooted to enable the iGPU).
>
> Were you able to reproduce the issue on your Kaveri iGPU, or is it just
> reproducible on Kabini?
"Talos Principle [publicbeta]" runs fine on my machine's Kaveri iGPU.
Mesa 12.1.0-devel (git-e988999)
Linux 4.8.0-rc2, amdgpu.ko
Hmmm, do you have any GPU faults with it? Check dmesg. That is what this bug is about: those two games run, but they produce constant GPU faults.
(In reply to smoki from comment #10)
> Hmmm, do you have any GPU faults with it? Check dmesg.
>
> As this is what is this bug about, those two games runs but introduce
> constant GPU faults.
You are right:
[18409.862642] VM fault (0x02, vmid 5) at page 0, read from 'TC4' (0x54433400) (136)
[18409.917406] amdgpu 0000:01:00.0: GPU fault detected: 147 0x000a8802
[18409.917409] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[18409.917410] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0A088002
The GPU is R9 390.
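For anyone else checking their own dmesg for these faults, here is a minimal sketch that counts the "GPU fault detected" lines per driver and PCI device. It assumes the log format is exactly the one pasted in this thread; the function name count_gpu_faults and the sample text are made up for illustration and are not part of Mesa or the kernel.

```python
import re
from collections import Counter

# Matches lines such as:
# [ 823.723639] radeon 0000:00:01.0: GPU fault detected: 146 0x0006200c
# Assumption: the format is the one pasted in this thread.
FAULT_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+(?P<driver>radeon|amdgpu)\s+"
    r"(?P<device>[0-9a-fA-F:.]+):\s+GPU fault detected:"
)


def count_gpu_faults(dmesg_text):
    """Return a Counter keyed by (driver, pci_device) with fault counts."""
    counts = Counter()
    for match in FAULT_RE.finditer(dmesg_text):
        counts[(match.group("driver"), match.group("device"))] += 1
    return counts


# Sample lines copied from the comments above.
SAMPLE = """\
[ 823.723639] radeon 0000:00:01.0: GPU fault detected: 146 0x0006200c
[ 823.723649] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[18409.917406] amdgpu 0000:01:00.0: GPU fault detected: 147 0x000a8802
[18409.917409] amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
"""

fault_counts = count_gpu_faults(SAMPLE)
for (driver, device), n in sorted(fault_counts.items()):
    print(driver, device, n)
```

To use it on a real machine, save the dmesg output to a file and pass the file contents to count_gpu_faults. A count that keeps growing while the game runs matches the behaviour described in this bug.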
Exactly. I have those faults with Kabini, Kaveri and Bonaire using radeon, and you have them with Kaveri and Hawaii (Grenada) using amdgpu. So as reported so far, only GCN 1.1 is affected, regardless of the kernel driver used.
Created attachment 125846
apitrace
The last glDrawArrays in the trace prints a VM fault to dmesg.
Thanks. I know where the problem is and I'm working on it. Just FYI, the VM fault is completely harmless.
It is not harmless. On a lower-end machine (an APU) you even lose something like 20% with these constant non-harmless messages, on top of already low perf. I also observed one random GPU lockup when I continued playing Serious Sam 3 with it, but let's just say that is a separate issue for now.
(In reply to smoki from comment #15)
> It is not harmless, as lower end machine is (APU) then you even lost sort
> of 20% with these constant non harmless messages on top of already low
> perf.
Of course, I (and probably also you) will do a performance measurement after Marek's patch is available.
(There is a lot of work to be done in Mesa to make it perform better in general and approach Nvidia's Linux performance. Considering the amount of work required, I do not expect it to materialize this year.)
I already did the measurement a month ago, when I bisected this. I would like to have a 2-core Temash APU to show an even bigger issue, which I think is highly recommended for perf measurements, but Kabini is the slowest I have :D If I didn't care I wouldn't be here; I would run a Pascal Titan X with the blob and be as ignorant as I can. But no, I am not, not me. Here I am actually not asking or pushing for higher perf at all, but the opposite: just for things not to regress this much... so it is a small regression-testing contribution from me.
Fixed by 2c13abb49137d0f81b530b3c67f1ed79c58c796e. Closing.