Bug 99715 - Don't print: "Note: Buggy applications may crash, if they do please report to vendor"
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/intel
Version: 13.0
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Reported: 2017-02-08 18:35 UTC by Rene Lindsay
Modified: 2017-03-04 19:47 UTC


Attachments
testing patch that detects DRI2 (3.33 KB, patch)
2017-02-15 19:35 UTC, programmerjake
Patch with X11 extension-based heuristic (4.13 KB, patch)
2017-02-18 07:00 UTC, Jason Ekstrand
fixed patch splitting out message into separate function (5.47 KB, patch)
2017-02-18 07:39 UTC, programmerjake
removed dri2 checking (4.94 KB, patch)
2017-02-18 17:58 UTC, programmerjake
fix (5.33 KB, patch)
2017-02-26 20:35 UTC, programmerjake

Description Rene Lindsay 2017-02-08 18:35:22 UTC
I'm running into issues when using the Mesa Vulkan driver on Ubuntu 16.04.
When I enumerate the physical devices, Vulkan returns 2 options for me:
1: NVidia GTX 660 Ti
2: Intel HD Graphics 530 (Skylake GT2)

(On some machines, the order is reversed.)

Next, I determine which GPU (VkPhysicalDevice) the desktop is currently running on, by searching for presentable queue-families on each GPU, using either:
   vkGetPhysicalDeviceSurfaceSupportKHR(), or
   vkGetPhysicalDeviceXcbPresentationSupportKHR().

I HAVE enabled DRI3 support for Intel (in: /etc/X11/xorg.conf.d/20-intel.conf)
so when the desktop is running on Intel, the query returns "True" for the Intel queue.  And, when I use nvidia-prime, to switch the desktop to run on the NVidia GPU,  NVidia's first queue-family returns true, and Intel's queue-family returns false.  So far so good...
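The search described above can be sketched roughly as follows. This is a simplified model rather than real Vulkan code: `present_query_fn` is a hypothetical stand-in for vkGetPhysicalDeviceSurfaceSupportKHR(), and the device and queue-family counts are passed in directly instead of being enumerated. The point it tries to show is that a "false" answer for one GPU is an ordinary result that the loop simply skips.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for vkGetPhysicalDeviceSurfaceSupportKHR():
 * returns true if queue family `family` of GPU `device` can present. */
typedef bool (*present_query_fn)(size_t device, size_t family);

/* Returns the index of the first GPU that has a presentable queue family
 * and stores that family in *out_family. Returns -1 if no GPU can present. */
static int find_presenting_gpu(size_t device_count,
                               const size_t *family_counts,
                               present_query_fn can_present,
                               size_t *out_family)
{
    for (size_t d = 0; d < device_count; d++) {
        for (size_t f = 0; f < family_counts[d]; f++) {
            if (can_present(d, f)) {
                *out_family = f;
                return (int)d;
            }
        }
    }
    return -1;
}

/* Example stand-in: the desktop runs on GPU 1, which presents on family 0. */
static bool desktop_on_gpu1(size_t device, size_t family)
{
    return device == 1 && family == 0;
}
```

With two GPUs and `desktop_on_gpu1` as the query, the function skips GPU 0 without treating its "false" answers as errors and selects GPU 1.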

The problem is, Mesa then also prints an error message to stderr:

"vulkan: No DRI3 support detected - required for presentation"
"Note: Buggy applications may crash, if they do please report to vendor"

(See line 268 in file: /mesa/src/vulkan/wsi/wsi_common_x11.c)
As I said, I DO have DRI3 enabled, but it seems Mesa doesn't detect it when the NVidia GPU is active.
Besides, it should be perfectly valid to use these query functions to search for presentable queues, without risking an error message being displayed.  I do see the value of informing the user if DRI3 was not set up correctly, but only when the warning actually applies.  Mesa should not accuse the app of being buggy, just because a non-Intel GPU is being used.

Which brings up the next issue. The second line: "Buggy applications may crash" implies that the Vulkan application is buggy, even when it's not.  This leads to users incorrectly filing bug reports against Vulkan app developers, which is a bit unfair (see: https://github.com/SaschaWillems/Vulkan/issues/226).

At least, can we remove the second line from that error message?
I don't want users thinking that my apps are buggy too.
Comment 1 Jason Ekstrand 2017-02-08 23:49:16 UTC
From our e-mail discussion:

> This is an interesting question.  There is a fairly good historical reason
> why we have that message.  For a while after DRI3 first came out, DRI3
> support in the X server and in some of the device-specific back-ends was
> buggy from time to time and a lot of users turned it off.  Even though it's
> been stable for a year or two now, there are still a number of users who
> have it explicitly disabled for one reason or another.
>
> Since the Vulkan WSI model doesn't work with DRI2, we have to return no
> presentation support on those platforms.  When that happens, the user's app
> will fail to start or even crash with no obvious explanation as to why.
> The warning exists to tell those users to go check their xorg.conf file to
> make sure they have DRI3.  This warning message, I believe, has saved us
> from a lot of bug reports of "application X crashes on your driver" with
> other people saying "works for me".
>
> Now, on a reasonably standard gaming setup with an i-series Intel CPU with
> integrated graphics and a discrete AMD or NVIDIA card and where the X
> server is running directly on the discrete card, this falls apart a bit.
> In that case, we can't do WSI but that's ok because you can just run on the
> discrete card.  I'm not quite sure what to do in this case.  We could,
> potentially, add a better heuristic to try and detect if we're running on a
> DDX that would never support DRI3 (such as the NVIDIA one) and only print
> the error in that case.

I think we probably need some sort of heuristic for determining whether the problem is just that we're running against an incompatible DDX or if it's an otherwise compatible DDX and they just don't have DRI3 enabled.  I'm not 100% sure how we should do that.  Maybe just detect DRI2 but not DRI3?
Comment 2 Rene Lindsay 2017-02-09 18:39:29 UTC
 > Maybe just detect DRI2 but not DRI3?

That might work, assuming discrete GPUs don't show up as running in DRI2 mode?
Do you know of a way to do that?

As I said, I enable DRI3 using the "/etc/X11/xorg.conf.d/20-intel.conf" file, and I use nvidia-prime to switch between NVIDIA and Intel GPUs.

This works fine on my Desktop PC, although I also have to move the HDMI cable to a different socket when switching GPUs.

However, on my laptop, the proprietary NVIDIA drivers will not work when DRI3 is enabled, and I just get a black screen. If I delete the "/etc/X11/xorg.conf.d/20-intel.conf" file, to disable DRI3, the NVIDIA drivers work fine again.

So, my worry is, will Mesa detect DRI2 when running on NVIDIA?
If you have some code for detecting DRI2, I'll be happy to test it on my setup.
Comment 3 Rene Lindsay 2017-02-09 20:55:25 UTC
...As for the error message, as Jason said, the historical reason for printing: 
"No DRI3 support", is to inform the USER that he hasn't configured his system correctly, and he should please go enable DRI3. 

However, following it with "Buggy applications may crash", implies that it's the application DEVELOPER's fault for not adding DRI3, which is misleading and unfair.

Actually, the USER should enable DRI3 himself, and NOT go "report to vendor".
Comment 4 programmerjake 2017-02-15 19:35:47 UTC
Created attachment 129656
testing patch that detects DRI2

I added a patch that detects DRI2. Would you please test it, Rene Lindsay?
Also, is there a good URL to send users to that shows them how to enable DRI3?
Comment 5 Rene Lindsay 2017-02-15 23:52:45 UTC
Hi programmerjake,

Thank you for the patch.  I do like the new wording on the Note:

vulkan: No DRI3 support detected - required for presentation
Note: DRI2 support detected, you can probably enable DRI3 in /etc/X11/xorg.conf

The new message is much more helpful, thank you.  :)

I applied the patch, and tested on my work-PC (Intel Skylake + NVidia 660 Ti), 
but unfortunately, I don't think the DRI2 detection is working as intended.

When I run the desktop on Intel Mesa, my Vulkan test app runs with no error messages, but when I switch to proprietary NVidia drivers, it still prints the error. (Both lines)  It looks like the code was supposed to print only the first line when using the proprietary NVidia driver?

It seems NVidia shows up as running on DRI2.

Btw. I enable DRI3 by adding a file, called 20-intel.conf to folder /etc/X11/xorg.conf.d, which contains the following text:

Section "Device"
        Identifier  "Intel Graphics"
        Driver      "intel"
        Option      "AccelMethod"  "sna"
        Option      "DRI" "3"
EndSection

The /etc/X11/xorg.conf file seems to get auto-regenerated regularly, with NVidia-specific settings. Adding the DRI3 option there, seems to have no effect.

Here's a good URL on how to enable DRI3:
http://askubuntu.com/questions/817226/how-to-enable-dri3-on-ubuntu-16-04
Comment 6 programmerjake 2017-02-16 04:42:50 UTC
I found the NV-CONTROL and ATIFGLRXDRI extensions for detecting if the X server is using the proprietary driver. Does anyone know if you can use radv with AMD Catalyst, or should I add code to check for both extensions?
Comment 7 Emil Velikov 2017-02-16 18:48:19 UTC
(In reply to Rene Lindsay from comment #3)
> ...As for the error message, as Jason said, the historical reason for
> printing: 
> "No DRI3 support", is to inform the USER that he hasn't configured his
> system correctly, and he should please go enable DRI3. 
> 
> However, following it with "Buggy applications may crash", implies that it's
> the application DEVELOPER's fault for not adding DRI3, which is misleading
> and unfair.
> 
> Actually, the USER should enable DRI3 himself, and NOT go "report to vendor".

Rene,

Please note that _nothing_ is implied by the message. It's straight out facts.
If you think the message is misleading, please send a patch to improve it.

As mentioned in our email chat, what do other drivers do when their prerequisites are not met? Using a similar approach in mesa would be better imho.

On the topic of detecting the first [working] implementation I'll repeat my earlier decision - shouldn't this information be provided by the Vulkan layers ?

Or at least having a common way _fully documented as part of the spec_ to do it on all platforms/drivers is what we want here, right ?
Probing the system for NV-CONTROL/other brings in vendor-specific 'hacks', which afaict is against the ideas behind Vulkan?
Comment 8 Emil Velikov 2017-02-16 18:50:49 UTC
Pardon, should have started with a disclaimer:
I'm not a Vulkan expert, nor an authoritative person wrt the ANV/RADV drivers.
Comment 9 programmerjake 2017-02-16 19:02:20 UTC
Emil, there is a standard method to detect if a particular driver will work. See vkGetPhysicalDeviceSurfaceSupportKHR. Detecting those vendor-specific extensions was meant to detect when the user would not be able to enable DRI3 in the xorg.conf file. They are going to be used solely to tell the user what to do about the absence of the DRI3 extension. Detecting whether the Vulkan driver works isn't affected by the presence or absence of vendor-specific extensions, only by DRI3 and Present, which are required to implement the swapchain.
Comment 10 Emil Velikov 2017-02-17 14:21:46 UTC
Seems that out of the 2-3 things mentioned (some of which are perhaps reasonable?) people focus on only one thing. Not sure I can be of any help in that case :-\
Comment 11 Rene Lindsay 2017-02-18 01:21:25 UTC
Emil, to answer your question: 

vkGetPhysicalDeviceSurfaceSupportKHR() is there to query / check if a GPU can present. If it can't, the query returns false.  
(See: http://vulkan-spec-chunked.ahcox.com/ch29s04.html)
It's not an error if that happens, so technically, it shouldn't print ANY error messages, which is what other vendors do.

However, the "No DRI3" hint IS helpful, especially with the new wording.
It just needs better heuristics, so it's only displayed when applicable.
(i.e. NOT when using another Vulkan driver.)

As for your decision:

Validation layers are for DEVELOPERS, to debug their code.
End-users should not have validation layers installed.
However, DRI3 must be enabled by the end-user, in his xorg.conf file.
So, if the "No DRI3" message was displayed only by Validation layers,
the intended audience (end-users) would never see it.
Comment 12 Jason Ekstrand 2017-02-18 07:00:05 UTC
Created attachment 129724
Patch with X11 extension-based heuristic

(In reply to programmerjake from comment #6)
> I found the NV-CONTROL and ATIFGLRXDRI extensions for detecting if the x
> server is using the proprietary driver. Does anyone know if you can use radv
> with amd catalyst or should I add code to check for both extensions?

I think checking for those two extensions and *not* warning if one of them is found sounds like a reasonable thing to do.  I have no idea if radv will run on the proprietary AMD X driver but I doubt anyone cares.  If you've installed the AMD proprietary X driver, you've probably also installed the proprietary Vulkan driver to go with it.  I've attached a patch to do just that.  Unfortunately, I lack the systems to test it beyond making sure Intel still works.
Comment 13 Jason Ekstrand 2017-02-18 07:00:15 UTC
(In reply to Emil Velikov from comment #7)
> Or at least having a common way _fully documented as part of the spec_ to do
> it on all platforms/drivers is what we want here, right ?
> Probing the system for NV-CONTROL/other brings vendor specific 'hacks' which
> I afaict was against the ideas behind Vulkan ?

I think you're a bit out of your depth here.  There is a fully documented way to detect whether or not you can use WSI on the given platform.  It's to call exactly the function that has the warning in it.  The problem is that the function can "spuriously" return false in anv or radv if DRI3 is not enabled and the application will fail to start or just crash.  I put "spuriously" in quotes because returning false really is correct in that case because you can't use Vulkan WSI with DRI2.

What we're talking about here is not heuristics for determining whether or not Vulkan WSI is supported (that's simple, look for DRI3 and a GPU match).  We're talking about heuristics for when to emit a warning message.  I think checking for a couple of vendor extensions is a perfectly valid way to do that.
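The distinction drawn above, between deciding whether WSI is supported and deciding whether to print the warning, can be sketched as two separate predicates. This is only an illustration under stated assumptions: `struct x11_probe` and both function names are hypothetical and not taken from the attached patches, and the GPU-match check is left out.

```c
#include <stdbool.h>

/* Hypothetical summary of what the X server advertises. The field names
 * are illustrative and are not taken from the Mesa patches. */
struct x11_probe {
    bool has_dri3;        /* DRI3 extension is present */
    bool has_present;     /* Present extension is present */
    bool is_proprietary;  /* NV-CONTROL or ATIFGLRXDRI is present */
};

/* Driver behavior: presentation needs DRI3 and Present.
 * (The GPU-match check mentioned in the thread is omitted here.) */
static bool wsi_supported(const struct x11_probe *p)
{
    return p->has_dri3 && p->has_present;
}

/* Diagnostics only: warn when DRI3 is missing and the X server does not
 * look like a proprietary one where enabling DRI3 is not an option. */
static bool should_warn_no_dri3(const struct x11_probe *p)
{
    return !p->has_dri3 && !p->is_proprietary;
}
```

In this sketch the vendor-extension check appears only in `should_warn_no_dri3()`, so a false positive or false negative there changes what is printed, not what the driver reports as supported.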
Comment 14 programmerjake 2017-02-18 07:39:53 UTC
Created attachment 129725
fixed patch splitting out message into separate function

Split the message out into a separate function to reduce code duplication and extended the message, including a link to an Ask Ubuntu question showing how to enable DRI3. Also fixed the arguments to xcb_query_extension: the second argument is the length of the extension name.

Mostly unrelated, I'd like to be the one to end up submitting the patch for this bug as I'm trying to qualify for Google summer of code to implement something like llvmpipe for Vulkan.
Comment 15 Emil Velikov 2017-02-18 14:25:19 UTC
Rene, Jason, all, thanks for the patience. Yes, it was very silly of me to suggest Vulkan layers.

With the risk of sounding like a complete nob I'll repeat my very first question/suggestion.

How do other vendors deal with unmet requirements ? For example:
 - when the NV-CONTROL/other extension is missing
 - version mismatch for the extension and/or libraries present on the system
 - the xorg.conf file settings are 'wrong'


As implied earlier, imho adding heuristics around NV-CONTROL/others is [sort of] broken, since:
 - these interfaces can be implemented by others
 - it makes future interop between AMDGPU-PRO and RADV rough
 - does not scale - you can plug an AMD card in an ARM (Mali/Qcom/other) board


Hopefully that is nice food for thought, I won't bother you any more.
Comment 16 programmerjake 2017-02-18 17:58:41 UTC
Created attachment 129733
removed dri2 checking
Comment 17 Jason Ekstrand 2017-02-18 19:06:32 UTC
(In reply to Emil Velikov from comment #15)
> How do other vendors deal with unmet requirements ? For example:
>  - when the NV-CONTROL/other extension is missing
>  - version mismatch for the extension and/or libraries present on the system
>  - the xorg.conf file settings are 'wrong'

They don't have to.  NVIDIA at least has the userspace driver, kernel driver, and DDX version-locked so it's impossible to have any mismatches.  With AMD, I think it's a bit looser, but whatever WSI mechanism they're using for GL, they're also using for Vulkan.  They probably left DRI2 behind a long time ago and don't really have this problem.

> As implied earlier, imho adding heuristics around NV-CONTROL/others is [sort
> of] broken, since:
>  - these interfaces can be implemented by others
>  - it makes future interop between AMDGPU-PRO and RADV rough
>  - does not scale - you can plug an AMD card in an ARM (Mali/Qcom/other) board

Yes, which is why it's good that it only controls a warning and no actual driver behavior.
Comment 18 Jason Ekstrand 2017-02-18 19:10:37 UTC
(In reply to programmerjake from comment #16)
> Created attachment 129733
> removed dri2 checking

Have you tested the patch against both NVIDIA and AMD proprietary X drivers?

(In reply to programmerjake from comment #14)
> Mostly unrelated, I'd like to be the one to end up submitting the patch for
> this bug as I'm trying to qualify for Google summer of code to implement
> something like llvmpipe for Vulkan.

I don't really care that much one way or the other.  I have plenty of patches in mesa with my name on them.  If you're willing to do the legwork to test it thoroughly, you're welcome to it.

That said, if you really want open-source project cred for your GSOC application, my recommendation would be to run piglit or dEQP against llvmpipe and try and fix some rendering errors.  That would both demonstrate a bit more applicable knowledge and get you some experience in llvmpipe which will give you a leg up on the project.  Just a thought. :-)
Comment 19 programmerjake 2017-02-26 20:35:00 UTC
Created attachment 129935
fix

I have a new patch, but I don't have a system that has both a proprietary driver and an intel gpu. Rene, would you be willing to test this?
I've tested it on an intel only system and it works fine with and without DRI3.
Comment 20 Rene Lindsay 2017-03-01 04:04:47 UTC
Hi Jake,

I pulled master, and applied your patch. 
I tested on NVidia, as well as Intel with and without DRI3.
The only minor issue is the extra ");" at the end of the first line of this printf statement, which causes a compile error:

fprintf(stderr, "vulkan: No DRI3 support detected - required for presentation\n");
                "Note: you can probably enable DRI3 in your Xorg config\n");


Other than that little typo, it all works great! :)
Great work. How soon can this patch be applied to master?
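
The compile error described above comes from closing the fprintf call after the first string literal, which leaves the second literal dangling. A corrected form might look like the sketch below, assuming the two lines are meant to be a single message. The wrapper function `print_no_dri3_warning` and its `FILE *` parameter are hypothetical and exist only to make the snippet self-contained.

```c
#include <stdio.h>

/* Writes the two-line warning to `out`. The two string literals are
 * adjacent, so the compiler concatenates them into one format string
 * and a single fprintf call prints both lines. */
static int print_no_dri3_warning(FILE *out)
{
    return fprintf(out,
                   "vulkan: No DRI3 support detected - required for presentation\n"
                   "Note: you can probably enable DRI3 in your Xorg config\n");
}
```

Calling `print_no_dri3_warning(stderr)` would reproduce the behavior of the statement quoted in the comment, minus the stray `);`.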

