Bug 99987

Summary: Mesa 13+ breaks Xvnc (and similar X servers)
Product: Mesa
Reporter: Pierre Ossman <ossman>
Component: Mesa core
Assignee: mesa-dev
Status: RESOLVED NOTOURBUG
QA Contact: mesa-dev
Severity: normal
Priority: medium
CC: devurandom, ngaywood
Version: unspecified
Hardware: Other
OS: All

Description Pierre Ossman 2017-02-27 14:07:43 UTC
Got the Mesa 13 upgrade on my Fedora 25 and all of a sudden Xvnc and ThinLinc stopped having a functional GLX. They can still load their swrast.so and don't complain to the logs, yet GLX fails to work:

> glxinfo
> name of display: :2
> Error: couldn't find RGB GLX visual or fbconfig

I noticed this is where the whole glvnd stuff went active, so that's probably a good first guess as to what is breaking things.
Comment 1 Norman Gaywood 2017-02-28 22:29:38 UTC
Same when running under x2go with Fedora 25. From:
https://bugzilla.redhat.com/show_bug.cgi?id=1427174

x2go also broken with (at least) an xfce4 desktop.

xfce4 session starts, but many programs won't start from the menus.

xterm starts from menus and we get:
> $ glxinfo
> name of display: :50.0
> Error: couldn't find RGB GLX visual or fbconfig

And things like:
> $ xfce4-terminal 
> Segmentation fault (core dumped)

> $ gnome-terminal
> Segmentation fault (core dumped)

For gnome-terminal:

ccpp-2017-02-28-17:16:40-2101 # gdb /usr/bin/gnome-terminal ./coredump
GNU gdb (GDB) Fedora 7.12.1-46.fc25
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/gnome-terminal...Reading symbols from /usr/lib/debug/usr/bin/gnome-terminal.debug...done.
done.
[New LWP 2101]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `gnome-terminal'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fb111ac935f in rawmemchr () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install at-spi2-atk-2.22.0-1.fc25.x86_64 at-spi2-core-2.22.0-1.fc25.x86_64 atk-2.22.0-1.fc25.x86_64 bzip2-libs-1.0.6-21.fc25.x86_64 cairo-1.14.8-1.fc25.x86_64 cairo-gobject-1.14.8-1.fc25.x86_64 dbus-libs-1.11.10-1.fc25.x86_64 dconf-0.26.0-1.fc25.x86_64 expat-2.2.0-1.fc25.x86_64 fontconfig-2.12.1-1.fc25.x86_64 freetype-2.6.5-1.fc25.x86_64 gdk-pixbuf2-2.36.5-1.fc25.x86_64 glib2-2.50.3-1.fc25.x86_64 glibc-2.24-4.fc25.x86_64 gmp-6.1.1-1.fc25.x86_64 gnutls-3.5.9-2.fc25.x86_64 graphite2-1.3.6-1.fc25.x86_64 gtk3-3.22.8-1.fc25.x86_64 harfbuzz-1.3.2-1.fc25.x86_64 libNX_Xinerama-3.5.0.32-4.fc24.x86_64 libX11-1.6.4-4.fc25.x86_64 libXau-1.0.8-6.fc24.x86_64 libXcomposite-0.4.4-8.fc24.x86_64 libXcursor-1.1.14-6.fc24.x86_64 libXdamage-1.1.4-8.fc24.x86_64 libXext-1.3.3-4.fc24.x86_64 libXfixes-5.0.3-1.fc25.x86_64 libXi-1.7.9-1.fc25.x86_64 libXrandr-1.5.1-1.fc25.x86_64 libXrender-0.9.10-1.fc25.x86_64 libblkid-2.28.2-2.fc25.x86_64 libcap-2.25-2.fc25.x86_64 libdatrie-0.2.9-3.fc25.x86_64 libepoxy-1.3.1-3.fc25.x86_64 libffi-3.1-9.fc24.x86_64 libgcc-6.3.1-1.fc25.x86_64 libgcrypt-1.6.6-1.fc25.x86_64 libglvnd-0.2.999-10.gitdc16f8c.fc25.x86_64 libglvnd-egl-0.2.999-10.gitdc16f8c.fc25.x86_64 libglvnd-glx-0.2.999-10.gitdc16f8c.fc25.x86_64 libgpg-error-1.24-1.fc25.x86_64 libidn2-0.16-1.fc25.x86_64 libmount-2.28.2-2.fc25.x86_64 libpng-1.6.27-1.fc25.x86_64 libselinux-2.5-13.fc25.x86_64 libstdc++-6.3.1-1.fc25.x86_64 libtasn1-4.10-1.fc25.x86_64 libthai-0.1.25-1.fc25.x86_64 libunistring-0.9.4-3.fc24.x86_64 libuuid-2.28.2-2.fc25.x86_64 libwayland-client-1.12.0-1.fc25.x86_64 libwayland-cursor-1.12.0-1.fc25.x86_64 libxcb-1.12-1.fc25.x86_64 libxkbcommon-0.7.1-1.fc25.x86_64 lz4-1.7.5-1.fc25.x86_64 mesa-libwayland-egl-13.0.4-1.fc25.x86_64 nettle-3.3-1.fc25.x86_64 p11-kit-0.23.2-2.fc24.x86_64 pango-1.40.3-1.fc25.x86_64 pcre-8.40-4.fc25.x86_64 pcre2-10.23-1.fc25.x86_64 pixman-0.34.0-2.fc24.x86_64 systemd-libs-231-14.fc25.x86_64 vte291-0.46.1-1.fc25.x86_64 
xz-libs-5.2.2-2.fc24.x86_64 zlib-1.2.8-10.fc24.x86_64
(gdb) where
#0  0x00007fb111ac935f in rawmemchr () at /lib64/libc.so.6
#1  0x00007fb111ab1832 in _IO_str_init_static_internal () at /lib64/libc.so.6
#2  0x00007fb111a9ecc7 in __isoc99_vsscanf () at /lib64/libc.so.6
#3  0x00007fb111a9ec67 in __isoc99_sscanf () at /lib64/libc.so.6
#4  0x00007fb10f6388e2 in epoxy_glx_version () at /lib64/libepoxy.so.0
#5  0x00007fb1143321e9 in gdk_x11_screen_init_gl () at /lib64/libgdk-3.so.0
#6  0x00007fb11433259a in _gdk_x11_screen_update_visuals_for_gl () at /lib64/libgdk-3.so.0
#7  0x00007fb11433b1f6 in _gdk_x11_screen_init_visuals () at /lib64/libgdk-3.so.0
#8  0x00007fb114338230 in _gdk_x11_screen_new () at /lib64/libgdk-3.so.0
#9  0x00007fb1143280c8 in _gdk_x11_display_open () at /lib64/libgdk-3.so.0
#10 0x00007fb1142fcb85 in gdk_display_manager_open_display () at /lib64/libgdk-3.so.0
#11 0x00007fb1147e6f26 in post_parse_hook () at /lib64/libgtk-3.so.0
#12 0x00007fb112a422d8 in g_option_context_parse () at /lib64/libglib-2.0.so.0
#13 0x000055943ba35834 in terminal_options_parse (working_directory=<optimized out>, startup_id=<optimized out>, argcp=0x7fff9a4ec93c, argvp=0x7fff9a4ec930, error=0x7fff9a4ec940) at terminal-options.c:868
#14 0x000055943ba3237a in main (argc=<optimized out>, argv=<optimized out>) at terminal.c:375
(gdb)
Comment 2 Emil Velikov 2017-03-01 01:51:38 UTC
I would suggest building mesa w/o the glvnd stuff and ensuring the mesa libGL/friends are picked.

That aside, it seems similar to bug 99027? Admittedly I've got limited experience with !Xorg, so if you think any of my analysis is off please shout.
Comment 3 Pierre Ossman 2017-03-01 12:17:36 UTC
Bug 99027 looks like it's the same thing, but I don't agree with NOTOURBUG. The segfault might be GTK+'s fault, but the fact that Mesa 12 was quite content with Xvfb/Xvnc/x2go and Mesa 13 isn't means more justification is needed as to why it isn't an issue with Mesa.
Comment 4 Emil Velikov 2017-03-01 12:56:17 UTC
(In reply to Pierre Ossman from comment #3)
> Bug 99027 looks like it's the same thing, but I don't agree with NOTOURBUG.
> The segfault might be GTK+'s fault, but the fact that Mesa 12 was quite
> content with Xvfb/Xvnc/x2go and Mesa 13 isn't means more justification is
> needed as to why it isn't an issue with Mesa.

Feel free to follow/repeat the debugging and confirm if things are/were broken forever and nobody noticed or there's something subtle that I've missed. I won't be able to take a look in the next few days at least :-\
Comment 5 Pierre Ossman 2017-03-03 08:34:41 UTC
The analysis on bug 99027 seems to be only about why it is crashing, and not why Mesa has changed its requirements on the X server. There is some talk about 8 bit depth, but the issue occurs on standard bit depths as well.

Digging further is also on my todo list, but unfortunately not near the top.
Comment 6 Norman Gaywood 2017-03-09 06:11:55 UTC
On Fedora under x2go, updated to libepoxy-1.4.1-1.fc25.x86_64 and the segfaults have gone away in xfce4-terminal and gnome-terminal

However, with mesa-libGL-13.0.4-1.fc25.x86_64, glxinfo gives:

$ glxinfo
name of display: :50.0
Error: couldn't find RGB GLX visual or fbconfig

With the previous mesa-libGL-12.0.3-3.fc25.x86_64, glxinfo gives:

name of display: :51.0
display: :51  screen: 0
direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.2
server glx extensions:
    GLX_ARB_multisample, GLX_EXT_import_context, GLX_EXT_visual_info, 
    GLX_EXT_visual_rating, GLX_OML_swap_method, GLX_SGIS_multisample, 
    GLX_SGIX_fbconfig, GLX_SGIX_hyperpipe, GLX_SGIX_swap_barrier, 
    GLX_SGI_make_current_read
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
client glx extensions:
[snip]
Comment 7 Samuel Mannehed 2017-03-09 13:41:09 UTC
I have the same issue as the people above using Fedora 25. Upgrading libepoxy solves the segmentation fault issues, but desktops or applications that need OpenGL still do not work. Examples of desktops that don't work are:

 * GNOME Shell
 * GNOME Classic
 * Cinnamon
 * KDE

Examples of applications that do not work are:

 * totem
 * glxgears
Comment 8 Samuel Mannehed 2017-03-09 13:42:21 UTC
To clarify, I have the above issues while using ThinLinc.
Comment 9 Samuel Mannehed 2017-03-09 13:53:06 UTC
Seeing as Ubuntu 17.04 also will have Mesa 13, I installed their nightly build to test. My tests show that the problem does not exist on Ubuntu 17.04. Applications such as glxgears work fine in ThinLinc, and it does seem like OpenGL is properly available.

Briefly looking at the differences between the Fedora installation and the Ubuntu one shows that GLVND isn't used in the latter. This would further indicate that the problem lies in GLVND?
Comment 10 Hans de Goede 2017-03-13 12:32:48 UTC
I cannot reproduce this. I've tried both of the following.

Xvnc :1 -rfbauth /home/hans/.vnc/passwd

I then started an xterm on DISPLAY=:1, and in that xterm glxgears and glxinfo both run fine using llvmpipe.

I also ran "vncserver" and started an xterm inside the session running there; glxgears and glxinfo run fine in that session too.
Comment 11 Pierre Ossman 2017-03-13 12:42:24 UTC
I can confirm that it works fine with Fedora's Xvnc. However it does not work with my own build of TigerVNC:

> $ ./builddir/x86_64/unix/xserver/hw/vnc/Xvnc -ac -SecurityTypes=None :2
> 
> Xvnc TigerVNC 1.7.80 - built Feb 15 2017 15:37:01
> Copyright (C) 1999-2016 TigerVNC Team and many others (see README.txt)
> See http://www.tigervnc.org for information on TigerVNC.
> Underlying X server release 11400000, The X.Org Foundation
> ...

> $ DISPLAY=:2 glxgears
> Error: couldn't get an RGB, Double-buffered visual

There are two major differences between them:

 a) The version of Mesa they are built against

 b) The version of Xorg they are based on
Comment 12 Pierre Ossman 2017-03-13 13:47:06 UTC
I've made a new build here with xorg-server 1.19.1 and Mesa 9.2.5 (the oldest that xorg-server 1.19 will allow), and that makes glxgears work fine. So it seems GLVND requires something newer than xorg-server 1.14.
Comment 13 Hans de Goede 2017-03-14 08:10:45 UTC
At least x2go, and presumably also ThinLinc, can be fixed by doing:

sudo ln -s /usr/lib64/libGLX_mesa.so.0 /usr/lib64/libGLX_indirect.so.0

That path is for 64-bit Fedora; on 32-bit Fedora or other distros you will need to adjust the lib path.

Pierre, if you can confirm that that fixes things, then I think we can close this bug.
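
Before creating the symlink, it can help to see what, if anything, libGLX_indirect.so.0 currently resolves to (for example, whether another package already owns it). A minimal Python sketch, assuming the /usr/lib64 path from the command above; the function name `fallback_target` is made up for illustration:

```python
import os


def fallback_target(libdir="/usr/lib64", name="libGLX_indirect.so.0"):
    """Return the resolved path of the fallback GLX library, or None if absent.

    libdir defaults to the 64-bit Fedora path used in this thread; adjust it
    for 32-bit systems or other distros.
    """
    path = os.path.join(libdir, name)
    # lexists() is true even for a dangling symlink, so a broken link is
    # still reported (as the path it points at) rather than hidden.
    if not os.path.lexists(path):
        return None
    return os.path.realpath(path)


if __name__ == "__main__":
    print(fallback_target())
```

A result of None means no fallback is installed at all; a path ending in a non-Mesa library means something else already claimed the name.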
Comment 14 Pierre Ossman 2017-03-14 15:46:40 UTC
That does make things work, but I'm hesitant to consider the issue closed. There are some outstanding questions:

a) I'm still getting direct rendering, despite the file being called libGLX_indirect.so.0. Is the naming more of a historical glitch, and should it be read as libGLX_fallback.so.0?

b) If it is indeed a fallback, why is that needed? I.e. why isn't Mesa being chosen? It does not seem robust and future proof to rely on a fallback mechanism. Which leads to...

c) What happens when both Mesa and NVIDIA want to install the fallback symlink? It sounds like things will just stop working, getting us right back to the situation that GLVND was supposed to solve.
Comment 15 Dennis Schridde 2017-03-14 17:38:31 UTC
(In reply to Pierre Ossman from comment #14)
> b) If it is indeed a fallback, why is that needed? I.e. why isn't Mesa being
> chosen? It does not seem robust and future proof to rely on a fallback
> mechanism. Which leads to...

Because libglvnd tries to be vendor neutral (i.e. tied to neither Mesa nor NVIDIA) and there might be other vendors providing a software / indirect rendering implementation. Maybe not now and on current Linux distributions, but e.g. on another OS or in the future.

> c) What happens when both Mesa and NVIDIA wants to install the fallback
> symlink? It sounds like things will just stop working, getting us right back
> to the situation that GLVND was supposed to solve.

I was wondering about the same thing. The env-var solution actually seems superior to having a file collision between different packages. Even a config file might be better than a symlink.

But Hans is already working on this: https://bugzilla.redhat.com/show_bug.cgi?id=1413579#57
> I will prepare an update to add this to the Fedora mesa pkgs, but I
> need to coordinate this with the rpmfusion packages, since those currently add
> such a symlink to the nvidia libGLX.
Comment 16 Pierre Ossman 2017-03-15 11:59:22 UTC
I'm afraid that didn't really make things clearer for me. Should I read your answer as "Yes, libGLX_fallback.so is a fallback mechanism"?

And it didn't explain why it is needed?
Comment 17 Pierre Ossman 2017-03-15 12:08:34 UTC
So I found the code that handles this:

https://github.com/NVIDIA/libglvnd/blob/470fc824a38521a52707c6c0f59d827aa5e0f45a/src/GLX/libglxmapping.c#L519-L600

And "indirect" is indeed a poorly named fallback.

It also reveals that the required magic is not the GLX version, but rather a GLX extension, which unfortunately wasn't added until xorg-server 1.19. So all but the latest xorg-server release fail to support this new mechanism.

In that case, is there any compelling reason why the fallback shouldn't always be mesa?
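
A rough Python sketch of the lookup order, as a reading of the linked libglxmapping.c rather than a copy of it. Assumptions not stated in this thread: the environment variable name `__GLX_VENDOR_LIBRARY_NAME` and the extension name `GLX_EXT_libglvnd` come from libglvnd's own documentation, the server's vendor list is taken to be space-separated, and the function names are invented for illustration:

```python
import os

FALLBACK_VENDOR = "indirect"  # libglvnd's built-in last resort


def candidate_vendors(server_vendor_names, env=None):
    """Return GLX vendor names in the order they would be tried.

    server_vendor_names is the string the X server reports through
    GLX_EXT_libglvnd, or None when the server lacks that extension
    (as is the case for X servers older than xorg-server 1.19).
    """
    env = os.environ if env is None else env
    candidates = []
    override = env.get("__GLX_VENDOR_LIBRARY_NAME")
    if override:
        candidates.append(override)  # explicit user override comes first
    if server_vendor_names:
        candidates.extend(server_vendor_names.split())
    candidates.append(FALLBACK_VENDOR)  # always tried last
    return candidates


def library_name(vendor):
    """Map a vendor name to the shared library libglvnd would load."""
    return f"libGLX_{vendor}.so.0"
```

With an old X server and no override, the only candidate left is "indirect", which is why everything hinges on libGLX_indirect.so.0 existing.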
Comment 18 Pierre Ossman 2017-03-15 12:16:39 UTC
For reference, NVIDIA claims that you need to have libGLX_indirect.so.0 pointing at their driver:

https://devtalk.nvidia.com/default/topic/915640/multiple-glx-client-libraries-in-the-nvidia-linux-driver-installer-package/

So I suspect it will be common to fight for ownership of this symlink.
Comment 19 Dennis Schridde 2017-03-15 16:55:57 UTC
(In reply to Pierre Ossman from comment #16)
> And it didn't explain why it is needed?

If it is not clear from the comments on this bug, please read the Red Hat bug I referenced.
Comment 20 Pierre Ossman 2017-03-16 09:44:18 UTC
Sorry, you're right. I misread this part when it was originally posted:

"Libglvnd relies on a GLX extension to..."

I interpreted it as "Libglvnd relies on <the X11 extension GLX>...", which was obviously present in my test cases.
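
To tell the two cases apart on a given server, one can look for the specific extension in the "server glx extensions" section of glxinfo output. A small Python sketch, assuming the extension is named `GLX_EXT_libglvnd` (the name is not given in this thread) and that the output has the layout shown in comment 6; the helper names are hypothetical. It needs output from a client where glxinfo gets far enough to print the section, e.g. with the symlink workaround in place:

```python
def server_glx_extensions(glxinfo_output):
    """Collect extension names listed under 'server glx extensions:'."""
    extensions = set()
    in_section = False
    for line in glxinfo_output.splitlines():
        if line.startswith("server glx extensions:"):
            in_section = True
            continue
        if in_section:
            # The list is indented; the first unindented line ends it.
            if not line.startswith((" ", "\t")):
                break
            for name in line.split(","):
                name = name.strip()
                if name:
                    extensions.add(name)
    return extensions


def server_supports_glvnd(glxinfo_output):
    """True if the server advertises the extension libglvnd looks for."""
    return "GLX_EXT_libglvnd" in server_glx_extensions(glxinfo_output)
```

Feeding it the Mesa 12 output from comment 6 reports no support, which matches the xorg-server 1.14 based server used there.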
Comment 21 Timothy Arceri 2018-04-24 01:39:27 UTC
Can we close this?
Comment 22 Pierre Ossman 2018-04-26 13:57:10 UTC
Is there a symlink in place by default now? If so then I'd say the issue is resolved.
Comment 23 Emil Velikov 2018-04-26 16:37:47 UTC
Managing the symlink is a matter of distribution policy, the same way distributions handled libGL.so in the past.

Perhaps one day we'll get a truly vendor agnostic libGLX_indirect.so, but until then one has to tweak it based on their needs.

As alluded to before - this isn't really a Mesa bug :-\
