Bug 104598 - vkmark with kwin compositing: Page flip failed: Cannot allocate memory
Summary: vkmark with kwin compositing: Page flip failed: Cannot allocate memory
Status: RESOLVED MOVED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/General
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Xorg Project Team
QA Contact: Xorg Project Team
Reported: 2018-01-12 11:44 UTC by Christoph Haag
Modified: 2018-12-17 17:35 UTC

Attachments
xorg log with the pageflip failures (128.17 KB, text/plain)
2018-01-12 11:48 UTC, Christoph Haag

Description Christoph Haag 2018-01-12 11:44:22 UTC

    
Comment 1 Christoph Haag 2018-01-12 11:48:42 UTC
Created attachment 136677
xorg log with the pageflip failures

RX 480, radv master and amdvlk, xf86-video-amdgpu master

This happens only with vkmark and only with kwin compositing enabled. With compton compositing it does not happen.

When starting vkmark, X starts spamming these messages to the Xorg log:

[   128.815] (WW) AMDGPU(0): flip queue failed: Cannot allocate memory
[   128.815] (WW) AMDGPU(0): Page flip failed: Cannot allocate memory
[   128.815] (EE) AMDGPU(0): present flip failed

and there are some synchronization artifacts on the screen. As soon as vkmark quits, everything is back to normal.

When switching vkmark's present mode to immediate or fifo (e.g. with vkmark -p immediate), this does not happen; it only happens with the default mode (mailbox).

Filed for DRM since it happens with amdvlk too.
Comment 2 Michel Dänzer 2018-01-12 15:15:26 UTC
Does this also happen with the modesetting driver instead of xf86-video-amdgpu?
Comment 3 David Mao 2018-01-12 15:21:34 UTC
This seems to be related to Mesa OpenGL doing the compositing present within KWin.
AMDVLK does not actually allow flips in the current release.
According to the attached log, the amdgpu DDX driver is being used.
Comment 4 Christoph Haag 2018-01-12 15:53:41 UTC
Yes, it also happens with modesetting. It is a bit different though, as the messages switch to "Device or resource busy" very quickly:

[    26.386] (WW) modeset(0): flip queue failed: Cannot allocate memory
[    26.386] (WW) modeset(0): Page flip failed: Cannot allocate memory
[    26.386] (EE) modeset(0): present flip failed
[    26.387] (WW) modeset(0): flip queue failed: Cannot allocate memory
[    26.387] (WW) modeset(0): Page flip failed: Cannot allocate memory
[    26.387] (EE) modeset(0): failed to set mode: Invalid argument
[    26.404] (WW) modeset(0): flip queue failed: Device or resource busy
[    26.404] (WW) modeset(0): Page flip failed: Device or resource busy
[    26.404] (EE) modeset(0): present flip failed
[    26.422] (WW) modeset(0): flip queue failed: Device or resource busy
[    26.422] (WW) modeset(0): Page flip failed: Device or resource busy

Also, modesetting doesn't recover once vkmark quits: the messages keep spamming and X stays broken (it feels laggy and shows some synchronization artifacts).
Disabling kwin compositing then makes X freeze completely. But that may be unrelated, since modesetting seems to have other issues too, like freezing on a black screen when exiting X or switching ttys.


It could indeed be an issue with kwin/radeonsi that just happens to be triggered by two different vulkan drivers with this one specific application in mailbox present mode.
Comment 5 Michel Dänzer 2018-04-27 16:39:17 UTC
I can't reproduce this, is it still happening for you? Is it with vkmark in fullscreen or windowed mode? Which version of kwin, and what's selected for rendering method and tearing prevention in kwin's compositor settings? Also, please make sure kwin isn't using the EGL backend.
Comment 6 Christoph Haag 2018-05-14 08:21:51 UTC
Yes, it's still happening. It's just running vkmark in default mode. It looks like this: https://youtu.be/MngLl6BgOfg

I'm not sure if previously it happened continuously but now it's only a couple of flips that fail (there's noticeable stutter visible when it happens).

Not sure how to make sure it doesn't use EGL but I ran it with KWIN_OPENGL_INTERFACE=glx kwin_x11 --replace, I hope that's enough.

Currently on kwin 5.12.5, xserver 1.19.99.905 g4191b59bd and latest xf86-video-amdgpu git so it has not been resolved there yet.

It happens with kwin's OpenGL 2 and OpenGL 3 backends and all vsync methods, including None.
It does not happen with XRender.
Comment 7 Michel Dänzer 2018-05-15 16:18:57 UTC
Thanks, I was finally able to reproduce this. Seems like it's only reproducible if the app runs at >= ~10K FPS.

It happens because the client submits so many PresentPixmap requests within a single vertical blank period that the corresponding DRM vblank events fill up the kernel DRM event queue. When the compositor then sends a PresentPixmap request, the page flip ioctl gets -ENOMEM from drm_event_reserve_init and propagates it. (That the modesetting driver can't deal with this is a bug in that driver.)
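
As a rough illustration of the mechanism described above, here is a minimal Python sketch of a per-file event budget that fills up. It is a toy model, not kernel code: the class and function names are hypothetical, and the 4096-byte budget, 32-byte vblank event size and 60 Hz refresh rate are assumptions made for the example.

```python
import errno

EVENT_SPACE_BYTES = 4096   # assumption: per-file budget for pending DRM events
VBLANK_EVENT_BYTES = 32    # assumption: size of one vblank/flip event
REFRESH_HZ = 60            # assumption: display refresh rate


class DrmFileModel:
    """Toy model of a DRM file's pending-event budget (hypothetical)."""

    def __init__(self, event_space=EVENT_SPACE_BYTES):
        self.event_space = event_space

    def reserve_event(self, length=VBLANK_EVENT_BYTES):
        # Same idea as drm_event_reserve_init in the report: refuse the
        # reservation with -ENOMEM once the budget is used up.
        if self.event_space < length:
            return -errno.ENOMEM
        self.event_space -= length
        return 0


def simulate_vblank_period(requests_per_vblank):
    """Queue the client's vblank events, then try the compositor's flip."""
    drm = DrmFileModel()
    queued = 0
    for _ in range(requests_per_vblank):
        if drm.reserve_event() == 0:
            queued += 1
    # The compositor's page flip also needs an event slot.
    flip_result = drm.reserve_event()
    return queued, flip_result


if __name__ == "__main__":
    for fps in (6000, 10000):
        per_vblank = fps // REFRESH_HZ
        print(fps, per_vblank, simulate_vblank_period(per_vblank))
```

Under these assumed numbers the budget holds 128 events, so the flip starts failing once the client exceeds roughly 128 requests per refresh, i.e. about 7680 FPS at 60 Hz. That is in the same ballpark as the ~10K FPS threshold observed above.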

Options for addressing this:

The kernel might allow queuing page flip events even when the queue is full with vblank events.

The xserver present code should be able to re-use a single DRM vblank event for multiple client requests with the same target MSC.
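
The second option can be sketched as follows. This is a minimal Python model of the idea only, not the xserver implementation: the VblankCoalescer class and its methods are hypothetical names, and a real implementation would also have to key on the CRTC and handle cancellation.

```python
class VblankCoalescer:
    """Share one kernel vblank event among all requests for the same MSC."""

    def __init__(self):
        self.waiters = {}              # target_msc -> request ids, in order
        self.kernel_events_queued = 0  # how many kernel events we asked for

    def wait_for_msc(self, target_msc, request_id):
        if target_msc not in self.waiters:
            # First request for this MSC: this is the only point where a
            # kernel vblank event would be queued.
            self.waiters[target_msc] = []
            self.kernel_events_queued += 1
        self.waiters[target_msc].append(request_id)

    def on_vblank(self, msc):
        # One kernel event wakes every request that targeted this MSC,
        # in the order the requests were submitted.
        return self.waiters.pop(msc, [])


if __name__ == "__main__":
    c = VblankCoalescer()
    for req in range(167):             # ~10K FPS at an assumed 60 Hz
        c.wait_for_msc(target_msc=1000, request_id=req)
    print(c.kernel_events_queued)      # kernel events needed for 167 requests
    print(len(c.on_vblank(1000)))      # requests woken by that one event
```

With this scheme the number of kernel events depends on the number of distinct target MSCs rather than on the client's frame rate, so a fast client can no longer exhaust the event queue on its own.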
Comment 8 Michel Dänzer 2018-05-16 07:58:42 UTC
On further thought, this should be addressed in the xserver present code. Even if the kernel didn't return an error from the page flip ioctl, it would still return one from the vblank ioctls, which can currently result in some of the client's PresentPixmap requests being executed out of order (which might explain at least some of the intermittent artifacts).
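
A small Python sketch of how a failing vblank ioctl could reorder requests. It rests on an assumption not confirmed in this report: that a request whose vblank wait cannot be queued is executed immediately instead of being dropped or retried. The function name and the slot count are hypothetical.

```python
def run_requests(num_requests, event_slots):
    """Return the order in which requests execute in this toy model."""
    waiting = []   # requests whose vblank event was queued successfully
    executed = []  # actual execution order
    for req in range(num_requests):
        if len(waiting) < event_slots:
            waiting.append(req)      # waits for the vblank event
        else:
            # Assumed fallback: queuing failed, so the request runs now,
            # ahead of the earlier requests that are still waiting.
            executed.append(req)
    executed.extend(waiting)         # vblank arrives, the waiters run
    return executed


if __name__ == "__main__":
    print(run_requests(5, 3))
```

In this model, requests 3 and 4 run before requests 0 to 2 once the slots are exhausted, which is the kind of reordering that could show up as intermittent artifacts.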

