108540 – vkAcquireNextImageKHR blocks when timeout=0 in Wayland

Status:	RESOLVED MOVED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/Vulkan/Common
Version:	git
Hardware:	Other All

Importance:	medium normal
Assignee:	Eric Engestrom

Reported:	2018-10-24 16:12 UTC by Chad Versace
Modified:	2019-09-18 18:13 UTC
CC List:	5 users

Description Chad Versace 2018-10-24 16:12:23 UTC

According to today's (2018-10-24) Khronos telco, vkAcquireNextImageKHR should *not* block when VkAcquireNextImageInfoKHR::timeout == 0.

As of Mesa 4ba445e0117b29c31b030feb6e0f421a5ceb03e5, wsi_wl_swapchain_acquire_next_image() blocks when no images are available, even when timeout == 0. The relevant code is below. Observe that wl_display_roundtrip_queue() is a blocking call with no timeout parameter.

Suggested solution: When timeout == 0, pump the event queue with a non-blocking wl_display_dispatch_queue_* call and then, if no image is available, immediately return VK_NOT_READY.

static VkResult
wsi_wl_swapchain_acquire_next_image(...)
{
   ...

   while (1) {
      for (uint32_t i = 0; i < chain->base.image_count; i++) {
         if (!chain->images[i].busy) {
            /* We found a non-busy image */
            *image_index = i;
            chain->images[i].busy = true;
            return VK_SUCCESS;
         }
      }

      /* This time we do a blocking dispatch because we can't go
       * anywhere until we get an event.
       */
      int ret = wl_display_roundtrip_queue(chain->display->wl_display,
                                           chain->display->queue);
      if (ret < 0)
         return VK_ERROR_OUT_OF_DATE_KHR;
   }
}
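
The suggested zero-timeout behaviour can be sketched roughly as follows. This is a self-contained model, not Mesa code: every sketch_* name is hypothetical, the result enum merely stands in for VK_SUCCESS / VK_NOT_READY / VK_ERROR_OUT_OF_DATE_KHR, and the two callbacks stand in for a non-blocking wl_display_dispatch_queue_pending() and the blocking wl_display_roundtrip_queue(). It assumes that pumping already-queued events is acceptable for timeout == 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkResult values; not the Vulkan definitions. */
enum sketch_result {
   SKETCH_SUCCESS,
   SKETCH_NOT_READY,
   SKETCH_OUT_OF_DATE,
};

#define SKETCH_MAX_IMAGES 4

/* Hypothetical, simplified swapchain model. */
struct sketch_chain {
   uint32_t image_count;
   bool busy[SKETCH_MAX_IMAGES];
   /* Non-blocking pump; stands in for wl_display_dispatch_queue_pending(). */
   int (*dispatch_pending)(struct sketch_chain *chain);
   /* Blocking wait; stands in for wl_display_roundtrip_queue(). */
   int (*roundtrip)(struct sketch_chain *chain);
};

enum sketch_result
sketch_acquire_next_image(struct sketch_chain *chain, uint64_t timeout,
                          uint32_t *image_index)
{
   assert(chain != NULL && image_index != NULL);

   /* Process events that are already queued, without blocking. */
   if (chain->dispatch_pending(chain) < 0)
      return SKETCH_OUT_OF_DATE;

   while (1) {
      for (uint32_t i = 0; i < chain->image_count; i++) {
         if (!chain->busy[i]) {
            /* We found a non-busy image */
            *image_index = i;
            chain->busy[i] = true;
            return SKETCH_SUCCESS;
         }
      }

      /* Zero timeout: never reach the blocking call below. */
      if (timeout == 0)
         return SKETCH_NOT_READY;

      /* Nonzero timeout: block as before (deadline handling omitted). */
      if (chain->roundtrip(chain) < 0)
         return SKETCH_OUT_OF_DATE;
   }
}

/* Example callbacks used to exercise the sketch. */
int sketch_no_events(struct sketch_chain *chain)
{
   (void)chain;
   return 0;
}

int sketch_release_image0(struct sketch_chain *chain)
{
   chain->busy[0] = false; /* models a buffer-release event for image 0 */
   return 0;
}

int sketch_connection_error(struct sketch_chain *chain)
{
   (void)chain;
   return -1;
}
```

With all images busy and no pending events, a call with timeout == 0 returns the not-ready result without ever invoking the roundtrip callback; once a release event is dispatched, the same call returns the freed image index.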

Comment 1 Daniel Stone 2018-11-06 11:45:10 UTC

We already do a completely non-blocking wl_display_dispatch_queue_pending() at the top of the function, which pumps events which have already been read from the server's FD and placed in our queue, but not processed.

wl_display_roundtrip_queue() is semi-blocking: we send a sync request to the compositor, which it will immediately return to unblock us. This will involve a wait of exactly one round trip, i.e. it could just be considered a 'far' API/RPC call, much like an ioctl(). If the compositor isn't dead, it will service this promptly.

We pump roundtrips in a tight loop when no images are available in ANI (vkAcquireNextImageKHR) because most older compositors (Weston before 4.0, Mutter before 3.26.1 aka Fedora 27) would queue but not flush buffer-release events. On these compositors, the client could block forever if it was not also receiving surface frame events, as the queue would never flush unless something like an input event provoked it. I've filed https://gitlab.freedesktop.org/wayland/wayland/issues/62 so we can eventually replace this with just going to sleep.

I've posted https://patchwork.freedesktop.org/patch/259244/ to fix timeout==0, which will just flush pending events and nothing more, though I have some open questions:
* how zero is a zero timeout? are we allowed to read FDs? are we allowed to do a roundtrip?
* if the compositor has sent a buffer-release event but the client has not called wl_display_{dispatch,read_events,roundtrip,...} to read these events from the FD, this patch means ANI with a zero timeout will spin forever: is this OK? if so, should we document somewhere that the client must also force reads from the display FD?

Properly respecting timeouts means we need to write a prepare/poll/read/cancel event loop, which is easily doable but not something I have the time for right now.
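
To illustrate the shape of such a loop, here is a minimal sketch of only its poll-with-deadline core, written against a plain file descriptor so that it stays self-contained. It is not the posted patch. The sketch_* names are hypothetical, and the Wayland calls (wl_display_prepare_read_queue(), wl_display_flush(), wl_display_read_events(), wl_display_cancel_read(), wl_display_dispatch_queue_pending()) are only named in comments at the points where a real client would presumably call them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <poll.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in nanoseconds. */
static uint64_t sketch_now_ns(void)
{
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Wait until fd is readable (or reports hangup/error) or timeout_ns elapses.
 * Returns 1 if poll() reported the fd, 0 on timeout, -1 on poll() failure.
 * timeout_ns == 0 polls once without blocking; UINT64_MAX waits
 * effectively forever.
 */
int sketch_wait_readable(int fd, uint64_t timeout_ns)
{
   assert(fd >= 0);
   const uint64_t start = sketch_now_ns();

   for (;;) {
      /* A Wayland client would call wl_display_prepare_read_queue() here
       * (dispatching pending events while it fails) and then
       * wl_display_flush().
       */
      const uint64_t elapsed = sketch_now_ns() - start;
      const uint64_t remaining =
         elapsed < timeout_ns ? timeout_ns - elapsed : 0;

      /* poll() takes milliseconds: round up, then clamp to INT_MAX. */
      const uint64_t ms = remaining / 1000000u + (remaining % 1000000u != 0);
      const int poll_ms = ms > (uint64_t)INT_MAX ? INT_MAX : (int)ms;

      struct pollfd pfd = { .fd = fd, .events = POLLIN };
      const int ret = poll(&pfd, 1, poll_ms);

      if (ret > 0)
         return 1;  /* here: wl_display_read_events(), then dispatch */
      if (ret < 0 && errno != EINTR)
         return -1; /* here: wl_display_cancel_read() */
      if (ret == 0 && ms <= (uint64_t)INT_MAX)
         return 0;  /* deadline passed; here: wl_display_cancel_read() */

      /* Interrupted by a signal, or the wait was clamped: retry with the
       * remaining time recomputed from the monotonic clock.
       */
   }
}

/* Consume one byte; stands in for reading events from the display fd. */
int sketch_drain_byte(int fd)
{
   char byte;
   return (int)read(fd, &byte, 1);
}
```

The remaining time is recomputed on every iteration, so a wait interrupted by a signal does not extend the overall deadline.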

Comment 2 Eric Engestrom 2019-01-09 18:35:29 UTC

(In reply to Daniel Stone from comment #1)
>   * how zero is a zero timeout? are we allowed to read FDs? are we allowed
> to do a roundtrip?
This was mentioned on today's WSI call (I forgot by whom):
We are allowed to do roundtrips to the server, and (my interpretation) anything else which takes an amount of time we can reasonably assume to be negligible.

Comment 3 GitLab Migration User 2019-09-18 18:13:12 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/180.