Bug 102578 - [Vulkan CTS] crashes in some dEQP-VK.wsi.wayland.* tests
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-07 08:07 UTC by Samuel Iglesias Gonsálvez
Modified: 2018-05-11 18:28 UTC

Attachments
Valgrind output of dEQP-VK.wsi.wayland.incremental_present.scale_none.fifo.incremental_present (20.27 KB, text/plain)
2017-09-07 08:07 UTC, Samuel Iglesias Gonsálvez

Description Samuel Iglesias Gonsálvez 2017-09-07 08:07:37 UTC
Created attachment 134039
Valgrind output of dEQP-VK.wsi.wayland.incremental_present.scale_none.fifo.incremental_present

Mesa HEAD: 6d9d6071ee961acc82543b321764a0ffec8cd39a
Wayland HEAD: 3ea73cba0446886632cd24442203fda47cb3c220
Linux distro: Debian Testing
Vulkan CTS: any of the vulkan-cts-1.0.2.6-rc* branches.

While testing the Vulkan CTS, I found segmentation faults in some tests under the dEQP-VK.wsi.wayland.* category. Attached is the Valgrind output from running one of these tests.

deqp-vk/wsi/wayland/incremental_present/scale_none/fifo/reference: crash
deqp-vk/wsi/wayland/incremental_present/scale_none/fifo_relaxed/incremental_present: crash
deqp-vk/wsi/wayland/incremental_present/scale_none/fifo_relaxed/reference: crash
deqp-vk/wsi/wayland/incremental_present/scale_none/immediate/incremental_present: crash
deqp-vk/wsi/wayland/incremental_present/scale_none/immediate/reference: crash
deqp-vk/wsi/wayland/incremental_present/scale_none/mailbox/incremental_present: crash
deqp-vk/wsi/wayland/incremental_present/scale_none/mailbox/reference: crash
deqp-vk/wsi/wayland/surface/query_formats: crash
deqp-vk/wsi/wayland/surface/query_formats2: crash
deqp-vk/wsi/wayland/swapchain/create/clipped: crash
deqp-vk/wsi/wayland/swapchain/create/composite_alpha: crash
deqp-vk/wsi/wayland/swapchain/create/image_array_layers: crash
deqp-vk/wsi/wayland/swapchain/create/image_extent: crash
deqp-vk/wsi/wayland/swapchain/create/image_format: crash
deqp-vk/wsi/wayland/swapchain/create/image_sharing_mode: crash
deqp-vk/wsi/wayland/swapchain/create/image_usage: crash
deqp-vk/wsi/wayland/swapchain/create/min_image_count: crash
deqp-vk/wsi/wayland/swapchain/create/pre_transform: crash
deqp-vk/wsi/wayland/swapchain/create/present_mode: crash
deqp-vk/wsi/wayland/swapchain/get_images/incomplete: crash
deqp-vk/wsi/wayland/swapchain/modify/resize: crash
deqp-vk/wsi/wayland/swapchain/render/basic: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/clipped: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/composite_alpha: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/image_array_layers: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/image_extent: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/image_format: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/image_sharing_mode: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/image_usage: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/min_image_count: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/pre_transform: crash
deqp-vk/wsi/wayland/swapchain/simulate_oom/present_mode: crash
Comment 1 Jason Ekstrand 2017-09-08 16:47:20 UTC
Did a little digging and this definitely appears to be a Mesa bug.  We have a hash table which maps wl_display pointers to wsi_wl_display structs.  The problem is that we keep the wsi_wl_display structs around until the instance is destroyed, while the tests destroy the wl_display before they destroy the Vulkan instance.  This is perfectly valid so long as they don't have any swapchains hanging around.  I see three options:

 1)  Do a bit of reference counting of the wsi_wl_display and destroy it when the last swapchain to use it goes away.  This is a bit sub-optimal because part of the reason why we cache it in the first place is so that we can get surface capabilities and then create the swapchain with as few round-trips as possible.

 2) Implement a non-trivial VkSurface and store this information in it.  We could still end up with extra round-trips but it would provide most of the benefit.  However, this will break our driver on old loaders.  Maybe we don't care?

 3) Just leak the three wl_proxy objects associated with the wsi_wl_display.  If an application is doing a lot of CreateInstance and DestroyInstance, then it's either the CTS or a very broken app.  wl_proxy objects are small and we can probably get away with the leak.

Among those three options I think 2 is probably the best but I'm still a bit undecided.
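
For illustration only, option 1 could look roughly like the sketch below. It is a simplified model, not Mesa code: struct display_state and its helper functions are hypothetical names, no real Wayland calls are made, and the live_counter field exists only so the lifetime can be observed. The idea is that each swapchain takes a reference on the per-display state and the state is freed when the last reference is dropped, rather than at instance destruction.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-display state (not Mesa's real struct). */
struct display_state {
    int refcount;       /* number of holders, e.g. swapchains */
    int *live_counter;  /* illustration hook: counts states currently allocated */
};

/* Allocate a state object holding one initial reference. */
static struct display_state *display_state_create(int *live_counter)
{
    struct display_state *d = calloc(1, sizeof(*d));
    if (d == NULL)
        return NULL;
    d->refcount = 1;
    d->live_counter = live_counter;
    (*live_counter)++;
    return d;
}

/* Take an additional reference, e.g. when a swapchain is created. */
static struct display_state *display_state_ref(struct display_state *d)
{
    assert(d->refcount > 0);
    d->refcount++;
    return d;
}

/* Drop a reference; the state is freed when the last holder goes away. */
static void display_state_unref(struct display_state *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0) {
        (*d->live_counter)--;
        free(d);
    }
}
```

With this shape, nothing is left to clean up at instance destruction, so the order in which the application tears down the wl_display and the instance stops mattering once all swapchains are gone.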
Comment 2 Jason Ekstrand 2017-09-26 22:58:42 UTC
I just sent patches to the list for option 1.  It's still more round-trips than I'd like but it should work reliably.
Comment 3 Jason Ekstrand 2018-05-11 18:28:44 UTC
This should be fixed as of the following commit:

commit 43691024982b3ea734ad001bd53cc7b563ccce5a
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Tue Sep 26 08:30:22 2017 -0700

    vulkan/wsi/wayland: Stop caching Wayland displays
    
    We originally implemented caching to avoid unneeded round-trips to the
    compositor when querying surface capabilities etc. to set up the
    swapchain.  Unfortunately, this doesn't work if vkDestroyInstance is
    called after the Wayland connection has been dropped.  In this case, we
    end up trying to clean up already destroyed wl_proxy objects which leads
    to crashes.  In particular most of dEQP-VK.wsi.wayland is crashing
    thanks to this problem.
    
    This commit gets rid of the cache and simply embeds the wsi_wl_display
    struct in the swapchain.  While we're at it, we can get rid of the
    wl_event_queue that we were storing in the swapchain because we can just
    use the one in the embedded wsi_wl_display.
    
    Reviewed-by: Daniel Stone <daniels@collabora.com>
    Bugzilla: https://bugs.freedesktop.org/102578
    Cc: mesa-stable@lists.freedesktop.org
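
A rough sketch of the ownership change the commit message describes, under stated assumptions: the sketch_* names are hypothetical, the real wsi_wl_display holds Wayland proxies and an event queue that are modeled here as plain fields, and no Wayland API is called. It only shows the display state being embedded by value in the swapchain, so that it is set up and torn down together with the swapchain instead of living in an instance-wide cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for wsi_wl_display. */
struct sketch_display {
    bool connected;  /* true between init and finish */
    int  queue_id;   /* stands in for the wl_event_queue */
};

/* The display state is embedded by value, so it cannot outlive the swapchain. */
struct sketch_swapchain {
    struct sketch_display display;
    int image_count;
};

static void sketch_display_init(struct sketch_display *d, int queue_id)
{
    d->connected = true;
    d->queue_id = queue_id;
}

static void sketch_display_finish(struct sketch_display *d)
{
    assert(d->connected);
    memset(d, 0, sizeof(*d));  /* clears connected and queue_id */
}

static void sketch_swapchain_init(struct sketch_swapchain *sc,
                                  int queue_id, int image_count)
{
    sketch_display_init(&sc->display, queue_id);
    sc->image_count = image_count;
}

/* Destroying the swapchain also releases the embedded display state. */
static void sketch_swapchain_finish(struct sketch_swapchain *sc)
{
    sketch_display_finish(&sc->display);
    sc->image_count = 0;
}
```

The swapchain uses the queue stored in its embedded display rather than keeping a second one of its own, which mirrors the removal of the separate wl_event_queue mentioned in the commit message.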

