Bug 109364

Summary: switching vsync on and off in vkquake breaks TearFree
Product: xorg
Reporter: tempel.julian
Component: Driver/AMDgpu
Assignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments:
  Description                      Flags
  file needed to reproduce bug     none
  xorg log                         none
  new xorg log                     none
  xorg log with debug information  none
  amdgpu.dc=0 xorg log             none
  latest xorg log                  none
  drm handle error log             none

Description tempel.julian 2019-01-15 15:21:56 UTC
Created attachment 143128
file needed to reproduce bug

Start vkquake, go into the options and change video settings to native resolution and fullscreen, if not already the case.
Then quit the game, make sure TearFree is enabled and then start the game again.
Then turn the game's vsync feature on, apply the change, turn it off again and also apply this change.
-> TearFree is now disabled or broken (tears quite badly), also after closing the game.
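
For the "make sure TearFree is enabled" step, one way to switch the driver's per-output TearFree property is through xrandr. The following is a minimal Python sketch that only builds the command line; the helper name `tearfree_command` and the output name `DisplayPort-0` are assumptions for illustration (the real output name is listed by running `xrandr` without arguments).

```python
def tearfree_command(output, state):
    """Build the xrandr argv that sets the TearFree output property.

    `output` is the RandR output name; `state` is "on" or "off".
    This helper is hypothetical and only assembles the command.
    """
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return ["xrandr", "--output", output, "--set", "TearFree", state]


# "DisplayPort-0" is a placeholder output name, not taken from the report.
cmd = tearfree_command("DisplayPort-0", "on")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` inside a running X session, for example to re-enable TearFree after it has been lost.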

Get vkquake here:
https://github.com/Novum/vkQuake
You need to put the "id1" folder from the game's shareware version into your home folder. The shareware version can be legally obtained from here (see also the description at the GitHub repo):
bit.ly/2aDMSiz
Since this link is often unreachable, I'll attach the file to this ticket as an encrypted archive (password: "tearfree") and delete it later, just in case.

Tested with latest
drm-next-5.1-wip
xf86-video-amdgpu-git 2058c4c4
mesa-git 19.0.0_devel.106811.7bef192018 (radv)
Comment 1 Michel Dänzer 2019-01-15 16:03:40 UTC
Please attach the Xorg log file, captured after reproducing the problem.
Comment 2 tempel.julian 2019-01-15 19:23:03 UTC
Created attachment 143136
xorg log
Comment 3 tempel.julian 2019-01-15 19:23:15 UTC
Here we go.
Comment 5 tempel.julian 2019-01-16 13:52:21 UTC
Situation seems to be unchanged with the PR.
Comment 6 Michel Dänzer 2019-01-16 14:46:42 UTC
Please attach a new log file from reproducing the problem with the patch applied.
Comment 7 tempel.julian 2019-01-16 15:23:50 UTC
Created attachment 143142
new xorg log
Comment 8 Michel Dänzer 2019-01-16 17:10:47 UTC
Please double-check that /usr/lib/xorg/modules/drivers/amdgpu_drv.so was compiled from the bugzilla-109364 branch in my xf86-video-amdgpu repository. While I'm unable to reproduce the problem with vkQuake, I was able to reproduce at least the first 3 messages with warzone2100, but not anymore with that change.
Comment 9 tempel.julian 2019-01-16 19:30:47 UTC
The last time I patched the PR into the main branch.
This time I directly cloned the specific bugzilla-109364 branch. TearFree unfortunately still breaks with it. It even makes the game's vsync not work correctly anymore, there is tearing then after switching it (the game's own vsync, not TearFree) on.
Comment 10 Michel Dänzer 2019-01-17 09:09:10 UTC
(In reply to tempel.julian from comment #9)
> It even makes the game's vsync not work correctly anymore, there is tearing
> then after switching it (the game's own vsync, not TearFree) on.

You know the drill. :) (Please attach the Xorg log file captured after reproducing these new symptoms)
Comment 11 tempel.julian 2019-01-17 10:04:57 UTC
I meant that this was also the case from the beginning, without the recent changes for this ticket (so case already covered by the existing logs).
I'll gladly test any further patches, but apart from that, I guess I can't provide anything more for this issue.
Comment 12 Michel Dänzer 2019-01-17 13:51:29 UTC
I rebased the branch on current master and expanded the fix, please re-test. If it still happens, please check the log file and attach it again if the TearFree / flip related warning/error messages changed.
Comment 13 tempel.julian 2019-01-17 17:28:03 UTC
It unfortunately still breaks with the same error reported in the log.

Btw: Did you test with radv? I once had similar issues only with TearFree + radv, but not amdvlk. I'll give the latter one a try again.
Comment 14 Michel Dänzer 2019-01-18 11:04:08 UTC
I'm testing with RADV.

I pushed a debugging patch to the branch, please reproduce the problem with that and attach the resulting log file.

With Git master, can you also reproduce the first 3 log messages with warzone2100, by toggling "Vertical sync" under Video Options? If so, does the branch also fix this for you?
Comment 15 tempel.julian 2019-01-18 12:05:48 UTC
The warzone2100 build provided by Arch unfortunately doesn't provide a Vulkan renderer, or do you want me to test it with OGL?
Comment 16 tempel.julian 2019-01-18 13:42:58 UTC
Created attachment 143149
xorg log with debug information
Comment 17 Michel Dänzer 2019-01-18 17:51:11 UTC
(In reply to tempel.julian from comment #15)
> The warzone2100 build provided by Arch unfortunately doesn't provide a
> Vulkan renderer, or do you want me to test it with OGL?

Yes, I doubt there is a Vulkan renderer, and it doesn't matter which application API triggers the bug(s).

I updated the debugging patch in the branch, please attach the log file from reproducing with that.
Comment 18 tempel.julian 2019-01-18 21:02:30 UTC
I tried with warzone2100, but for some reason the vsync setting in the options menu became unclickable after clicking it once.

So I tried the latest PR update with vkquake and to me it seems the errors logged haven't changed:

[    69.971] (WW) AMDGPU(0): flip queue failed: Device or resource busy, flip_pending=(nil)
[    69.971] (WW) AMDGPU(0): flipdata->fb[0]=0x56264548b8e0
[    69.971] (WW) AMDGPU(0): flipdata->fb[1]=(nil)
[    69.971] (WW) AMDGPU(0): flipdata->fb[2]=(nil)
[    69.971] (WW) AMDGPU(0): flipdata->fb[3]=(nil)
[    69.971] (WW) AMDGPU(0): flipdata->fb[4]=(nil)
[    69.971] (WW) AMDGPU(0): flipdata->fb[5]=(nil)
[    69.971] (WW) AMDGPU(0): scanout fb 0 = 0x56264548b8e0
[    69.971] (WW) AMDGPU(0): scanout fb 1 = 0x5626454f5d20
[    69.971] (WW) AMDGPU(0): Page flip failed: Device or resource busy
[    69.971] (EE) AMDGPU(0): present flip failed
[    69.971] (WW) AMDGPU(0): flip queue failed in amdgpu_scanout_flip: Device or resource busy, TearFree inactive
[    69.971] (WW) AMDGPU(0): flip_pending=(nil), scanout fbs = 0x56264548b8e0 0x5626454f5d20
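
To judge whether the logged errors have changed between test runs, it can help to extract only the AMDGPU warning and error lines and to mask the values that differ from run to run (timestamps and pointer addresses). The following is a minimal Python sketch under that assumption; the function names are hypothetical, and the sample lines are copied from the excerpt above.

```python
import re

# Matches Xorg log lines such as:
# [    69.971] (WW) AMDGPU(0): Page flip failed: Device or resource busy
LOG_LINE = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+\((?P<level>WW|EE)\)\s+"
    r"AMDGPU\(\d+\):\s+(?P<msg>.*)$"
)
POINTER = re.compile(r"0x[0-9a-fA-F]+")


def amdgpu_problems(lines):
    """Return (level, message) pairs for AMDGPU warnings and errors."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            found.append((match.group("level"), match.group("msg")))
    return found


def signature(lines):
    """Set of messages with pointer values masked, for comparing two logs."""
    return {(lvl, POINTER.sub("<ptr>", msg)) for lvl, msg in amdgpu_problems(lines)}


sample = [
    "[    69.971] (WW) AMDGPU(0): flipdata->fb[0]=0x56264548b8e0",
    "[    69.971] (WW) AMDGPU(0): Page flip failed: Device or resource busy",
    "[    69.971] (EE) AMDGPU(0): present flip failed",
    "[    69.971] (WW) AMDGPU(0): flip queue failed in amdgpu_scanout_flip: "
    "Device or resource busy, TearFree inactive",
]

problems = amdgpu_problems(sample)
tearfree_lost = any("TearFree inactive" in msg for _, msg in problems)
print(len(problems), "problem lines; TearFree lost:", tearfree_lost)
```

Two logs can then be compared with `signature(old_lines) == signature(new_lines)`; equal sets suggest that the same failure path was hit again.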
Comment 19 Darek 2019-01-20 19:05:21 UTC
(In reply to tempel.julian from comment #15)
> The warzone2100 build provided by Arch......

Maybe this bug is somehow related to this [1]?

[1] https://bugs.archlinux.org/task/58782

Have you tried xorg-server with autotools?
Comment 20 tempel.julian 2019-01-21 18:08:21 UTC
Thank you for the hint.
I tried building xorg with autotools, but the situation is unchanged for me.
Comment 21 Michel Dänzer 2019-01-21 18:14:06 UTC
(In reply to tempel.julian from comment #18)
> So I tried the latest PR update with vkquake and to me it seems the errors
> logged haven't changed:

That's bad news, as it means I'm running out of ideas how this condition might occur. :(

BTW, does it also happen with amdgpu.dc=0?
Comment 22 tempel.julian 2019-01-21 19:39:37 UTC
Yes, I think one of your changes introduced a difference between amdgpu.dc=1 and 0 in that regard:
With amdgpu.dc=0, TearFree remains enabled after switching vsync in vkquake on and off and then exiting the game. It just breaks the game's vsync when turning it from off to on: there is then tearing despite the fps being locked to the refresh rate. Moving windows is still free of tearing after closing the game, unlike with amdgpu.dc=1. If I start vkquake again after leaving the game's vsync enabled, it works without issues and without my having to re-enable TearFree.

With amdgpu.dc=1, TearFree breaks "for good" after switching the game's vsync option, I have to re-enable TearFree then.

If we can't manage to find the root of the problem, perhaps automatically re-initializing TearFree after it detects failure would be a viable workaround?

Btw: It may be "helpful" to switch the vsync setting several times in vkquake to provoke the problem, not just once. The game also limits the fps by default to 72, I suggest using a value well above refresh rate (e.g. at least "host_maxfps 80" via the game's console or config for 75Hz).
Comment 23 tempel.julian 2019-01-21 19:46:55 UTC
Ok, the difference between amdgpu.dc=1 & 0 is also there with normal git-master of xf86-video-amdgpu.
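
When comparing results between amdgpu.dc=1 and amdgpu.dc=0, it is worth confirming which value the running kernel was actually booted with. The following is a minimal Python sketch that parses a kernel command line string; the function name and the example command lines are assumptions for illustration.

```python
def amdgpu_dc_from_cmdline(cmdline):
    """Return the amdgpu.dc value from a kernel command line, or None if unset.

    If the parameter is given more than once, this sketch returns the last one.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("amdgpu.dc="):
            value = token.split("=", 1)[1]
    return value


# Hypothetical command lines, shaped like the contents of /proc/cmdline.
with_dc_off = "BOOT_IMAGE=/vmlinuz-linux root=/dev/sda2 rw amdgpu.dc=0 quiet"
without_dc = "BOOT_IMAGE=/vmlinuz-linux root=/dev/sda2 rw quiet"

print(amdgpu_dc_from_cmdline(with_dc_off))
print(amdgpu_dc_from_cmdline(without_dc))
```

On a live system the input would come from `open("/proc/cmdline").read()`. A result of `None` means the parameter was not passed, so the kernel driver uses its default.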
Comment 24 Michel Dänzer 2019-01-22 09:42:52 UTC
(In reply to tempel.julian from comment #22)
> If we can't manage to find the root of the problem, perhaps automatically
> re-initializing TearFree after it detects failure would be a viable
> workaround?

I'd like to get to the bottom of these issues first, then I'll think about possible mitigation of similar issues in the future.

Please attach a log file from reproducing the problem with amdgpu.dc=0.


> Btw: It may be "helpful" to switch the vsync setting several times in
> vkquake to provoke the problem, not just once.

I've tried it several times.


> The game also limits the fps by default to 72, I suggest using a value well
> above refresh rate (e.g. at least "host_maxfps 80" via the game's console or
> config for 75Hz).

Are you saying you can't reproduce the problem with lower values?
Comment 25 tempel.julian 2019-01-22 11:27:57 UTC
Ok, it doesn't seem to matter whether fps are capped below the refresh rate by the game's fps limiter. However, it's easier to spot tearing when there is no judder due to repeated frames.

With amdgpu.dc=0 and TearFree not getting killed entirely by the vsync switch, the log is getting much longer. Not sure if it contains anything helpful though.
Comment 26 tempel.julian 2019-01-22 11:28:24 UTC
Created attachment 143198
amdgpu.dc=0 xorg log
Comment 27 Michel Dänzer 2019-01-22 17:39:34 UTC
Had another idea what could cause the problem. Branch updated, please re-test and attach the resulting log file (regardless of whether the problem still occurs).
Comment 28 tempel.julian 2019-01-22 18:03:15 UTC
I tried with amdgpu.dc=1 & 0 once each, and it looks like you did it, no anomalies showed up. Yey! :)
Will test it a few more times to be sure.
Comment 29 tempel.julian 2019-01-22 18:03:39 UTC
Created attachment 143203
latest xorg log
Comment 30 tempel.julian 2019-01-22 18:32:14 UTC
Definitely fixed, thanks. I'd say TearFree works flawlessly here now.
Comment 31 Michel Dänzer 2019-01-24 17:44:23 UTC
I updated the branch with the final fixes for review, please test that it still fixes the problem. If any log message about drmHandleEvent appears during testing, please provide it.
Comment 32 tempel.julian 2019-01-24 18:11:58 UTC
It breaks again with the mentioned error in the log.
Comment 33 tempel.julian 2019-01-24 18:12:26 UTC
Created attachment 143226
drm handle error log
Comment 34 Michel Dänzer 2019-01-25 09:16:28 UTC
Ugh yeah, had a brain fart in patch 2. :} Fixed up now, please try again.
Comment 35 tempel.julian 2019-01-25 10:05:14 UTC
All fine again.
Comment 36 Michel Dänzer 2019-01-25 17:01:30 UTC
Fixes merged, thanks for the report and testing!
Comment 37 tempel.julian 2019-01-25 17:40:38 UTC
Really appreciate how well issues are sorted out for the xf86 DDX driver.
It'd be nice if the next release wasn't very far away, as there seem to be a lot of really helpful fixes included.
Comment 38 Michel Dänzer 2019-01-28 10:00:02 UTC
According to the 6-month release cycle, the 19.0 release is expected around March.
