Bug 108925 - vkCmdCopyQueryPoolResults(VK_QUERY_RESULT_WAIT_BIT) for timestamps with large query count hangs
Summary: vkCmdCopyQueryPoolResults(VK_QUERY_RESULT_WAIT_BIT) for timestamps with large...
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/radeon (show other bugs)
Version: git
Hardware: Other All
: medium major
Assignee: mesa-dev
QA Contact: mesa-dev
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-03 12:26 UTC by Alex Smith
Modified: 2018-12-05 11:10 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
Test case (6.34 KB, application/gzip)
2018-12-03 12:26 UTC, Alex Smith
Details
New test case (6.41 KB, application/gzip)
2018-12-05 09:17 UTC, Alex Smith
Details

Description Alex Smith 2018-12-03 12:26:29 UTC
Created attachment 142700 [details]
Test case

According to the Vulkan spec we need VK_QUERY_RESULT_WAIT_BIT here to ensure that all previous query commands have written a result. Using that on RADV with a large query count causes a GPU hang.

Attached a test case which reproduces this. Hangs for me on a Vega 64, with latest git master and 18.3 branch. Doesn't hang if the query count is reduced (in my testing, 512 queries hangs but 256 doesn't), or if VK_QUERY_RESULT_WAIT_BIT is not used.
Comment 1 Alex Smith 2018-12-03 13:38:00 UTC
FWIW, doesn't occur on AMDVLK. As far as I can see, that is just using a barrier rather than waiting for each individual query value to be available.

Also, it looks like RADV is only waiting on the low 32 bits of the query value. Couldn't you get very unlucky and get a valid timestamp with the low 32 bits as 0xffffffff, which would hang?
Comment 2 Samuel Pitoiset 2018-12-04 13:39:12 UTC
I can confirm this, working on.
Comment 3 Samuel Pitoiset 2018-12-04 13:54:41 UTC
The problem has been introduced by:

commit 5d6a560a2986c9ab421b3c7904d29bb7bc35e36f
Author: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Date:   Tue Sep 25 20:26:58 2018 +0200

    radv: do not use the availability bit for timestamp queries
    
    It's unnecessary because we can just check if the timestamp
    is to different to the default value when a pool is created
    or resetted. Instead of waiting for the availability bit to
    be 1, we have to emit a not equal WAIT_REG_MEM for checking
    if the timestamp is ready.
    
    Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
    Reviewed-by: Dave Airlie <airlied@redhat.com>
Comment 4 Samuel Pitoiset 2018-12-04 15:53:02 UTC
Can you try this patch https://patchwork.freedesktop.org/series/53482/ ?
Comment 5 Alex Smith 2018-12-05 09:17:01 UTC
Created attachment 142732 [details]
New test case


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.