Created attachment 136677: xorg log with the pageflip failures
RX 480, radv master and amdvlk, xf86-video-amdgpu master
This happens only with vkmark and only with kwin compositing enabled. With compton compositing it does not happen. When starting vkmark, X starts spamming these messages to the Xorg log:
    [ 128.815] (WW) AMDGPU(0): flip queue failed: Cannot allocate memory
    [ 128.815] (WW) AMDGPU(0): Page flip failed: Cannot allocate memory
    [ 128.815] (EE) AMDGPU(0): present flip failed
and there are some synchronization artifacts on the screen. As soon as vkmark quits, everything is back to normal. When switching vkmark's present mode to immediate or fifo with vkmark -p immediate, this does not happen, only with the default (mailbox). Filed for DRM since it happens with amdvlk too.
Does this also happen with the modesetting driver instead of xf86-video-amdgpu?
This seems to be related to Mesa OpenGL doing the compositing present within KWin. AMDVLK does not actually allow flips in its current release. According to the attached log, the amdgpu DDX driver is being used.
Yes, it also happens with modesetting. It is a bit different though, as the messages switch to "Device or resource busy" very quickly:
    [ 26.386] (WW) modeset(0): flip queue failed: Cannot allocate memory
    [ 26.386] (WW) modeset(0): Page flip failed: Cannot allocate memory
    [ 26.386] (EE) modeset(0): present flip failed
    [ 26.387] (WW) modeset(0): flip queue failed: Cannot allocate memory
    [ 26.387] (WW) modeset(0): Page flip failed: Cannot allocate memory
    [ 26.387] (EE) modeset(0): failed to set mode: Invalid argument
    [ 26.404] (WW) modeset(0): flip queue failed: Device or resource busy
    [ 26.404] (WW) modeset(0): Page flip failed: Device or resource busy
    [ 26.404] (EE) modeset(0): present flip failed
    [ 26.422] (WW) modeset(0): flip queue failed: Device or resource busy
    [ 26.422] (WW) modeset(0): Page flip failed: Device or resource busy
Also, modesetting doesn't recover once vkmark quits: the messages keep spamming and X stays broken (it feels laggy and shows some synchronization artifacts). Disabling kwin compositing then makes X freeze completely. But that may be unrelated, since modesetting seems to have other issues too, like freezing on a black screen when exiting X or switching ttys. It could indeed be an issue with kwin/radeonsi that just happens to be triggered by two different Vulkan drivers with this one specific application in mailbox present mode.
I can't reproduce this, is it still happening for you? Is it with vkmark in fullscreen or windowed mode? Which version of kwin, and what's selected for rendering method and tearing prevention in kwin's compositor settings? Also, please make sure kwin isn't using the EGL backend.
Yes, it's still happening. It's just running vkmark in default mode. It looks like this: https://youtu.be/MngLl6BgOfg I'm not sure if previously it happened continuously but now it's only a couple of flips that fail (there's noticeable stutter visible when it happens). Not sure how to make sure it doesn't use EGL but I ran it with KWIN_OPENGL_INTERFACE=glx kwin_x11 --replace, I hope that's enough. Currently on kwin 5.12.5, xserver 1.19.99.905 g4191b59bd and latest xf86-video-amdgpu git so it has not been resolved there yet. It happens with kwin's OpenGL 2 and OpenGL 3 backends and all vsync methods, including None. It does not happen with XRender.
Thanks, I was finally able to reproduce this. It seems to be reproducible only if the app runs at >= ~10K FPS. It happens because the client submits so many PresentPixmap requests within a single vertical blank period that the corresponding DRM vblank events fill up the kernel DRM event queue. Then, when the compositor sends a PresentPixmap request, the page flip ioctl gets -ENOMEM from drm_event_reserve_init and propagates it. (That the modesetting driver can't deal with this is its own bug.) Options for addressing this:
* The kernel might allow queuing page flip events even when the queue is full with vblank events.
* The xserver present code should be able to re-use a single DRM vblank event for multiple client requests with the same target MSC.
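To make the failure mode described above concrete, here is a toy model of the event-space accounting, not the kernel code itself. It assumes a 4096-byte per-file event budget and 32-byte vblank/flip events; both numbers, and the names EventQueueModel and simulate, are illustrative assumptions rather than anything taken from this report. It also assumes each client PresentPixmap request results in exactly one queued vblank event.

```python
import errno

# Assumed values, for illustration only: a 4096-byte per-file event budget
# and 32-byte vblank/flip completion events.
EVENT_SPACE_BYTES = 4096
EVENT_BYTES = 32


class EventQueueModel:
    """Toy model of the event space accounting on the X server's DRM file."""

    def __init__(self, space=EVENT_SPACE_BYTES):
        self.space = space

    def reserve(self, length=EVENT_BYTES):
        """Reserve room for one event; return 0, or -ENOMEM when full."""
        if self.space < length:
            return -errno.ENOMEM
        self.space -= length
        return 0


def simulate(client_requests):
    """Queue one vblank event per client request, then try one page flip.

    Nothing is drained in between, which models requests that all arrive
    within a single vertical blank period.
    Returns (vblank_events_queued, page_flip_result).
    """
    queue = EventQueueModel()
    queued = sum(1 for _ in range(client_requests) if queue.reserve() == 0)
    return queued, queue.reserve()


if __name__ == "__main__":
    # About 100 requests per refresh: the compositor's flip still fits.
    print(simulate(100))
    # About 167 requests per refresh (10K FPS at 60 Hz): the flip gets -ENOMEM.
    print(simulate(167))
```

Under these assumptions the queue holds 128 events, so a client needs well over a hundred requests per refresh before the compositor's flip is starved, which is roughly in line with the ~10K FPS threshold seen here on a 60 Hz display.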
On further thought, this should be addressed in the xserver present code. Even if the kernel didn't return an error from the page flip ioctl, it still would from the vblank ioctls, which can currently result in some of the client's PresentPixmap requests being executed out of order (which might explain at least some of the intermittent artifacts).
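The idea of sharing one DRM vblank event between all client requests with the same target MSC could look roughly like the sketch below. This is only an illustration of the bookkeeping, not the xserver implementation: VblankCoalescer, wait_for_msc, handle_vblank and the queue_kernel_event callback are hypothetical names standing in for whatever the present code would actually use.

```python
class VblankCoalescer:
    """Share one kernel vblank event among requests with the same target MSC."""

    def __init__(self, queue_kernel_event):
        # queue_kernel_event(target_msc) is a hypothetical stand-in for
        # asking the kernel for a vblank event at that MSC.
        self._queue_kernel_event = queue_kernel_event
        self._waiters = {}  # target MSC -> callbacks in submission order

    def wait_for_msc(self, target_msc, callback):
        waiters = self._waiters.get(target_msc)
        if waiters is None:
            # First request for this MSC: this is the only kernel event queued.
            self._waiters[target_msc] = [callback]
            self._queue_kernel_event(target_msc)
        else:
            # Later requests piggyback on the event already queued.
            waiters.append(callback)

    def handle_vblank(self, msc):
        """Run every request whose target has been reached, oldest target first."""
        for target in sorted(t for t in self._waiters if t <= msc):
            for callback in self._waiters.pop(target):
                callback(msc)


def demo(requests=200, target_msc=5):
    """Submit many requests for one MSC; return (kernel_events, execution_order)."""
    kernel_events = []
    executed = []
    coalescer = VblankCoalescer(kernel_events.append)
    for serial in range(requests):
        coalescer.wait_for_msc(
            target_msc, lambda msc, s=serial: executed.append(s)
        )
    coalescer.handle_vblank(target_msc)
    return kernel_events, executed
```

In this sketch 200 requests for the same MSC consume a single kernel event, and because the callbacks are kept in a per-MSC list they run in submission order, which is the ordering property that failing vblank ioctls can currently break.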
https://cgit.freedesktop.org/xorg/driver/xf86-video-intel/tree/test/present-test.c
https://gitlab.freedesktop.org/xorg/xserver/issues/529