Summary: | Segmentation fault when using vulkaninfo on Radeon | ||
---|---|---|---|
Product: | Mesa | Reporter: | Kenneth Endfinger <kaendfinger> |
Component: | Drivers/Vulkan/Common | Assignee: | mesa-dev |
Status: | RESOLVED NOTOURBUG | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | airlied, chadversary, daniel, danylo.piliaiev, denys.kostin, jason |
Version: | 19.0 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | GDB session when running vulkaninfo. |
I can't reproduce, but it crashes inside the WSI code path.

I am also running an AMD eGPU over Thunderbolt:

Section "Device"
Identifier "AMD"
Driver "amdgpu"
BusID "PCI:61:0:0"
Option "AllowEmptyInitialConfiguration"
Option "AllowExternalGpus"
EndSection

Section "Device"
Identifier "Intel"
Driver "intel"
BusID "PCI:0:2:0"
EndSection

kendfinger@melt ~ $ sudo lspci | grep "VGA"
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 630 (Mobile)
01:00.0 VGA compatible controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)
3d:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev ef)

kendfinger@melt ~ $ xrandr --listproviders
Providers: number : 2
Provider 0: id: 0xc1 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 6 outputs: 5 associated providers: 1 name:Radeon RX 570 Series @ pci:0000:3d:00.0
Provider 1: id: 0x45 cap: 0xb, Source Output, Sink Output, Sink Offload crtcs: 4 outputs: 2 associated providers: 1 name:Intel

Hi, I am able to reproduce this issue on Manjaro with Intel (CFL CPU), with both the system Mesa (18.3.4) and Mesa built from git. Can I provide any additional information to help with debugging?

Core was generated by `vulkaninfo'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fa3451b50f7 in XGetXCBConnection () from /usr/lib/libX11-xcb.so.1
[Current thread is 1 (Thread 0x7fa34945c740 (LWP 7380))]
(gdb) bt
#0 0x00007fa3451b50f7 in XGetXCBConnection () from /usr/lib/libX11-xcb.so.1
#1 0x00007fa342a267c1 in ?? () from /usr/lib/libvulkan_intel.so
#2 0x0000561d0b6e0693 in ?? ()
#3 0x0000561d0b6d5f72 in ?? ()
#4 0x00007fa349852223 in __libc_start_main () from /usr/lib/libc.so.6
#5 0x0000561d0b6d67be in ?? ()

That's my core dump.

It looks like I have bisected this issue. Jason, could you please take a look at it? It points to your commit.
[manjaro@manjaro-pc mesa]$ git bisect good
68df93ecbcee6215ac49e0c6f62ae818d2bc9962 is the first bad commit
commit 68df93ecbcee6215ac49e0c6f62ae818d2bc9962
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date: Thu Sep 21 13:54:55 2017 -0700

anv: Trivially implement VK_KHR_device_group

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

:040000 040000 a48855544644df9cb612163b98c96ac3b53b78d4 98add0f3169c5897d9e47116565e992813312109 M src

Can you reproduce with a full debug build in gdb and run "bt all"?

Is this what you are looking for?

(gdb) thread apply all backtrace

Thread 2 (Thread 0x7ffff38f8700 (LWP 8010)):
#0 0x00007ffff7ba7afc in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1 0x00007ffff5669e24 in cnd_wait (mtx=0x5555556f8208, cond=0x5555556f8230) at ../mesa-19.0.1/src/../include/c11/threads_posix.h:155
#2 util_queue_thread_func (input=input@entry=0x555555723f70) at ../mesa-19.0.1/src/util/u_queue.c:270
#3 0x00007ffff5669b48 in impl_thrd_routine (p=<optimized out>) at ../mesa-19.0.1/src/../include/c11/threads_posix.h:87
#4 0x00007ffff7ba1a9d in start_thread () from /usr/lib/libpthread.so.0
#5 0x00007ffff7cdfb23 in clone () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7ffff7a10740 (LWP 8004)):
#0 0x00007ffff54300f7 in XGetXCBConnection () from /usr/lib/libX11-xcb.so.1
#1 0x00007ffff565c961 in x11_surface_get_connection (icd_surface=0x555555a19510) at ../mesa-19.0.1/src/vulkan/wsi/wsi_common_x11.c:404
#2 x11_surface_get_connection (icd_surface=0x555555a19510) at ../mesa-19.0.1/src/vulkan/wsi/wsi_common_x11.c:401
#3 x11_surface_get_support (icd_surface=0x555555a19510, wsi_device=0x5555557428f0, queueFamilyIndex=<optimized out>, pSupported=0x7fffffffdef4) at ../mesa-19.0.1/src/vulkan/wsi/wsi_common_x11.c:424
#4 0x00005555555626e3 in AppGpuDumpQueueProps (out=0x7ffff7da35c0 <_IO_2_1_stdout_>, id=0, gpu=0x555555a1b190) at /usr/src/debug/Vulkan-Tools-1.1.101/vulkaninfo/vulkaninfo.c:4461
#5 AppGpuDump (gpu=0x555555a1b190, out=0x7ffff7da35c0 <_IO_2_1_stdout_>) at /usr/src/debug/Vulkan-Tools-1.1.101/vulkaninfo/vulkaninfo.c:4764
#6 0x0000555555557f24 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/Vulkan-Tools-1.1.101/vulkaninfo/vulkaninfo.c:5328

Oddly enough, with vulkan-tools 1.1.102:

/build/vulkan-tools/src/Vulkan-Tools-1.1.102/vulkaninfo/vulkaninfo.c:4504: failed with VK_ERROR_OUT_OF_HOST_MEMORY

is now the error...

> /build/vulkan-tools/src/Vulkan-Tools-1.1.102/vulkaninfo/vulkaninfo.c:4504: failed with VK_ERROR_OUT_OF_HOST_MEMORY

Actually, that's exactly what I got with the "normal" version. I discussed this error with a developer, and he said that it was unlikely to be critical. So during bisection, my "good" result was that error, and my "bad" result was the segfault.

> Can you reproduce with a full debug build in gdb and run "bt all"?

Sorry for the late response. Is this still needed, or has Kenneth already provided the needed log? "bt all" didn't produce anything, so I did the same as Kenneth. Output below:

(gdb) bt all
No symbol "all" in current context.
(gdb) thread apply all backtrace

Thread 2 (Thread 0x7ffff6c37700 (LWP 11457)):
#0 0x00007ffff7bc2afc in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1 0x00007ffff771d5c7 in cnd_wait (cond=0x55555584be30, mtx=0x55555584be08) at ../include/c11/threads_posix.h:155
#2 0x00007ffff771e0a6 in util_queue_thread_func (input=0x55555557ffd0) at ../src/util/u_queue.c:272
#3 0x00007ffff771d3f8 in impl_thrd_routine (p=0x555555658040) at ../include/c11/threads_posix.h:87
#4 0x00007ffff7bbca9d in start_thread () from /usr/lib/libpthread.so.0
#5 0x00007ffff7cfab23 in clone () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7ffff7a2b740 (LWP 11453)):
#0 0x00007ffff7f780ce in xcb_send_request_with_fds64 () from /usr/lib/libxcb.so.1
#1 0x00007ffff7f7866a in xcb_send_request () from /usr/lib/libxcb.so.1
#2 0x00007ffff7f87405 in xcb_query_extension () from /usr/lib/libxcb.so.1
#3 0x00007ffff770a3ce in wsi_x11_connection_create (wsi_dev=0x555555600510, conn=0xa0ec8148e5894855) at ../src/vulkan/wsi/wsi_common_x11.c:135
#4 0x00007ffff770a782 in wsi_x11_get_connection (wsi_dev=0x555555600510, conn=0xa0ec8148e5894855) at ../src/vulkan/wsi/wsi_common_x11.c:242
#5 0x00007ffff770ac90 in x11_surface_get_support (icd_surface=0x55555584d3a0, wsi_device=0x555555600510, queueFamilyIndex=0, pSupported=0x7fffffffd8d4) at ../src/vulkan/wsi/wsi_common_x11.c:428
#6 0x00007ffff7709086 in wsi_common_get_surface_support (wsi_device=0x555555600510, queueFamilyIndex=0, _surface=0x55555584d3a0, pSupported=0x7fffffffd8d4) at ../src/vulkan/wsi/wsi_common.c:724
#7 0x00007ffff746074e in anv_GetPhysicalDeviceSurfaceSupportKHR (physicalDevice=0x555555600070, queueFamilyIndex=0, surface=0x55555584d3a0, pSupported=0x7fffffffd8d4) at ../src/intel/vulkan/anv_wsi.c:91
#8 0x0000555555562693 in ?? ()
#9 0x0000555555557f72 in ?? ()
#10 0x00007ffff7c23223 in __libc_start_main () from /usr/lib/libc.so.6
#11 0x00005555555587be in ?? ()

Unfortunately, thanks to Vulkan passing everything as struct pointers, a backtrace with `bt full` isn't as useful as one would like. That said, given where it's crashing, I'm 93% sure that both of those backtraces are due to the client (vulkaninfo) providing us with a bogus X11 connection/display.

I believe I'm running into a related (or the same) error:

Arch Linux with an AMD RX580 GPU
Mesa 19.0.2-1
vulkan-radeon 19.0.2-1
libxcb 1.13.1-1
xorg-server 1.20.4-1
vulkan-tools 1.1.102-1

Process 3567 (vulkaninfo) of user 1000 dumped core.
Stack trace of thread 3567:
#0 0x00006efde23374d1 xcb_send_request_with_fds64 (libxcb.so.1)
#1 0x00006efde233766a xcb_send_request (libxcb.so.1)
#2 0x00006efde2346405 xcb_query_extension (libxcb.so.1)
#3 0x00006efde1c273ed n/a (libvulkan_radeon.so)
#4 0x00006efde1c27bca n/a (libvulkan_radeon.so)
#5 0x00000b7f648c572d n/a (vulkaninfo)
#6 0x00000b7f648baf92 n/a (vulkaninfo)
#7 0x00006efde1fe2223 __libc_start_main (libc.so.6)
#8 0x00000b7f648bb7de n/a (vulkaninfo)

Stack trace of thread 3568:
#0 0x00006efde1f81afc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00006efde1c33874 n/a (libvulkan_radeon.so)
#2 0x00006efde1c33598 n/a (libvulkan_radeon.so)
#3 0x00006efde1f7ba9d start_thread (libpthread.so.0)
#4 0x00006efde20b9af3 __clone (libc.so.6)

Let me know if you think this is related or if I should open another bug report.

Jason, you were absolutely right. Test data:

GPU: Intel HD620
OS: Manjaro
vulkan-tools 1.1.101-1: issue is reproducible
vulkan-tools 1.1.106-1: issue is not reproducible

Can somebody check this on Radeon? Otherwise I will try to do it later.

Yes, on Radeon it no longer crashes with vulkan-tools 1.1.106.
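
For anyone repeating the bisection described earlier in this thread, the criterion used there (a segfault counts as "bad", while the VK_ERROR_OUT_OF_HOST_MEMORY failure counts as "good") could be automated with `git bisect run`. The following is only a rough sketch, not what was actually used: the helper names `classify` and `main` are hypothetical, and it assumes that Mesa is rebuilt at every bisection step and that the Vulkan loader picks up the freshly built driver.

```python
import signal
import subprocess
import sys

# Exit codes that `git bisect run` interprets as verdicts.
GOOD, BAD, SKIP = 0, 1, 125


def classify(returncode):
    """Map the return code of the test program to a bisect verdict.

    subprocess reports death by signal as a negative signal number;
    a shell wrapper would report it as 128 + signal number instead.
    """
    if returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV):
        return BAD
    # Any other outcome, including a non-zero exit after an error
    # message, is treated as "good", as in the bisection above.
    return GOOD


def main(argv):
    # Hypothetical default: run plain `vulkaninfo` from PATH.
    cmd = argv[1:] or ["vulkaninfo"]
    try:
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        # The program is missing, so this revision cannot be judged.
        return SKIP
    return classify(proc.returncode)
```

To use it as a script, add a call to `sys.exit(main(sys.argv))` under the usual `__main__` guard and pass the script to `git bisect run` after the build step. Treating every non-segfault result as "good" mirrors the choice made in this thread; a stricter check would be needed if the other failure also mattered.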
Created attachment 143799
GDB session when running vulkaninfo.

When running vulkaninfo, it exits with a segmentation fault.

Running on Arch Linux:
mesa 19.0.1-2
vulkan-radeon 19.0.1-2
libxcb 1.13.1-1
xorg-server 1.20.4-1
vulkan-tools 1.1.101-1

Thread 1 "vulkaninfo" received signal SIGSEGV, Segmentation fault.
0x00007ffff543a0f7 in XGetXCBConnection () from /usr/lib/libX11-xcb.so.1
(gdb) info stack
#0 0x00007ffff543a0f7 in XGetXCBConnection () from /usr/lib/libX11-xcb.so.1
#1 0x00007ffff4ca58d1 in x11_surface_get_connection (icd_surface=0x555555972610) at ../mesa-19.0.1/src/vulkan/wsi/wsi_common_x11.c:404
#2 x11_surface_get_connection (icd_surface=0x555555972610) at ../mesa-19.0.1/src/vulkan/wsi/wsi_common_x11.c:401

I have attached the full GDB log.