Bug 107990

Summary: Got Dying Light working in Arch by changing Mesa's compile steps, how to get it working Out Of the Box?
Product: Mesa
Reporter: John <john.ettedgui>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact: Default DRI bug account <dri-devel>
Severity: normal
Priority: medium
CC: bugs.freedesktop, henrik.holst2, magist3r
Version: 18.2
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments: LD_PRELOAD shim

Description John 2018-09-19 14:25:21 UTC
Hello,

I've been looking for a while for a way to get Dying Light (DL) running on Arch, and I finally found one.

Now I'd like this to be possible by default, but I don't know if the issue lies in Arch or Mesa and I'm hopeful someone here will be able to help.

Arch's default compiler flags are:

CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt"
CXXFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now"

"-fno-plt" in C/CXXFLAGS and ",-z,now" in LDFLAGS need to be unset when compiling Mesa to get DL to not crash.

The other change required is to not use glvnd, with "-D glvnd=false". I tried building libglvnd with the same flags as Mesa, or none at all, but it didn't help.

Here's the original 18.2.0 PKGBUILD in case it helps: https://git.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/mesa

The behavior is the same with Mesa 18.2.0 or master, with LLVM 6.0 or master.
I haven't tried with autoconf though, only Meson, but I can try if you think it'd be helpful.

I don't have any helpful logs: without these changes the game segfaults, a backtrace in gdb doesn't show me anything useful, and there's nothing in dmesg.

I'm using a 280X but people with 5xx seem to have the same behavior, so I don't think it depends on the model. We're all on amdgpu.

I'd be happy to try / provide whatever would help.

Current versions: Linux 4.18.8, GCC 8.2.1, glibc 2.28, but it failed in previous ones too.

Thank you!
Comment 1 John 2018-09-19 23:13:51 UTC
I had someone else try this, and it worked for him as well with a 580.
Comment 2 Timothy Arceri 2018-09-20 02:50:41 UTC
When you say working I assume you mean it no longer crashes at start up?

Have you reported this to the packagers of Mesa on Arch?

-fno-plt is a GCC optimisation flag that is not applied to Mesa by default; issues with this option should be reported to the distro maintainers.
Comment 3 John 2018-09-20 04:12:41 UTC
> When you say working I assume you mean it no longer crashes at start up?

Correct, sorry about the lack of clarity.
It still sits for quite a while on the black screen with the white line, but it eventually gets past it and goes into the game.

> Have you reported this to the packagers of Mesa on Arch?

> -fno-plt is a GCC optimisation flag that is not applied to Mesa by default; issues with this option should be reported to the distro maintainers.

I have not yet, because I am not convinced about the distribution reverting compile flags when a single proprietary game does not run with them; but if you think I should, I'll do it right now (https://bugs.archlinux.org/task/60130).

That leaves the glvnd issue, which of course might be caused by the same flags being applied to some other package; I don't know which.

Thank you!
Comment 4 Thomas Crider 2018-09-28 01:03:31 UTC
Just an update on this, you don't need to change Arch's compiler flags.
I compiled Mesa with the default flags and glvnd=false, then used LD_PRELOAD for just libGL.so.1.2.0 and libglapi.so.0.0.0, and the game ran. I was able to copy just those two libraries to the Dying Light install directory and LD_PRELOAD them from there. The problem is definitely an issue with glvnd and mesa-libgl.
Comment 5 Timothy Arceri 2018-10-18 23:32:29 UTC
(In reply to Thomas Crider from comment #4)
> Just an update on this, you don't need to change Arch's compiler flags.
> I compiled mesa with the default flags and used glvnd=false, then used
> LD_PRELOAD for just libGL.so.1.2.0 and libglapi.so.0.0.0 and the game ran. I
> was able to copy just those two alone to the Dying Light install directory
> and LD_PRELOAD them from the directory. The problem is definitely an issue
> with glvnd and mesa-libgl

This still seems like some kind of distro issue, as Fedora uses glvnd and I don't have any issues here.
Comment 6 Timothy Arceri 2018-11-29 04:21:28 UTC
*** Bug 106287 has been marked as a duplicate of this bug. ***
Comment 7 Aurryon Schwartz 2019-01-12 22:38:49 UTC
Good evening everyone,

I'm having the same problem as mentioned and disabling glvnd in the build process prevented the segfault during the loading time of DL.

Here are my specs:
- Radeon WX 3100
- Vega 64 (used with DRI_PRIME env)
- Ubuntu 18.10

Glxinfo:

OpenGL renderer string: Radeon RX Vega (VEGA10, DRM 3.27.0, 4.19.4-041904-lowlatency, LLVM 8.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.0.0-devel - padoka PPA
OpenGL core profile shading language version string: 4.50

Default Ubuntu 18.10 compile flags:
staraurryon@Aurryon-Desktop:~$ dpkg-buildflags 
CFLAGS=-g -O2 -fdebug-prefix-map=/home/staraurryon=. -fstack-protector-strong -Wformat -Werror=format-security
CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2
CXXFLAGS=-g -O2 -fdebug-prefix-map=/home/staraurryon=. -fstack-protector-strong -Wformat -Werror=format-security
FCFLAGS=-g -O2 -fdebug-prefix-map=/home/staraurryon=. -fstack-protector-strong
FFLAGS=-g -O2 -fdebug-prefix-map=/home/staraurryon=. -fstack-protector-strong
GCJFLAGS=-g -O2 -fdebug-prefix-map=/home/staraurryon=. -fstack-protector-strong
LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro
OBJCFLAGS=-g -O2 -fdebug-prefix-map=/home/staraurryon=. -fstack-protector-strong -Wformat -Werror=format-security
OBJCXXFLAGS=-g -O2 -fdebug-prefix-map=/home/staraurryon=. -fstack-protector-strong -Wformat -Werror=format-security

Additional build flags in debian/rules:
ifeq (,$(filter $(DEB_HOST_ARCH), armhf sh3 sh4))
buildflags = \
	$(shell DEB_CFLAGS_MAINT_APPEND=-Wall DEB_CXXFLAGS_MAINT_APPEND=-Wall dpkg-buildflags --export=configure)
else
  ifneq (,$(filter $(DEB_HOST_ARCH), armhf))
  # Workaround for a variant of LP: #725126
  buildflags = \
	$(shell DEB_CFLAGS_MAINT_APPEND="-Wall -fno-optimize-sibling-calls" DEB_CXXFLAGS_MAINT_APPEND="-Wall -fno-optimize-sibling-calls" dpkg-buildflags --export=configure)
  else
  # Workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83143
  buildflags = \
	$(shell DEB_CFLAGS_MAINT_APPEND="-Wall -O1" DEB_CXXFLAGS_MAINT_APPEND="-Wall -O1" dpkg-buildflags --export=configure)
  endif
endif

I also tried disabling -Wl,-Bsymbolic-functions, since it is not mentioned in the Red Hat build documentation, but without success.

If you have any comments on this information, please don't hesitate to share them.
I have started to investigate Fedora build flags as @Timothy Arceri mentioned that there was no issue in Fedora.

GL HF

Regards,
Aurryon
Comment 8 John Brooks 2019-01-19 19:29:34 UTC
Created attachment 143165: LD_PRELOAD shim

I have investigated the issue and found the cause of the problem.

First, some background. The game uses __GLEW_EXT_direct_state_access to detect whether the EXT_direct_state_access OpenGL extension is available. In experimental mode (glewExperimental=GL_TRUE), GLEW implements those macros by calling glXGetProcAddress() for every function that an extension is supposed to provide. If the driver returns non-NULL for every function in that set, then the macro evaluates to true. And if the macro evaluates to true, then the game will use EXT_direct_state_access functions such as glMapNamedBufferRangeEXT.
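To make the detection logic above concrete, here is a minimal C sketch. It is not GLEW's actual code: the names `resolver_fn`, `extension_looks_supported`, `resolve_null_for_unknown` and `resolve_always_stub` are hypothetical, the resolvers are mocks standing in for glXGetProcAddress(), and the function list is only a small subset of what the extension defines.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function pointer, standing in for what glXGetProcAddress() returns. */
typedef void (*proc_fn)(void);

/* Hypothetical resolver type: maps a function name to an entry point or NULL. */
typedef proc_fn (*resolver_fn)(const char *name);

/* Illustrative subset of the EXT_direct_state_access entry points. */
static const char *const dsa_ext_functions[] = {
    "glMapNamedBufferRangeEXT",
    "glNamedBufferDataEXT",
    "glTextureParameteriEXT",
};

/* Mimics the experimental-mode check described above: the extension is
 * reported as available only if every entry point resolves to non-NULL. */
static int extension_looks_supported(resolver_fn resolve)
{
    size_t i;
    size_t n = sizeof dsa_ext_functions / sizeof dsa_ext_functions[0];

    assert(resolve != NULL);
    for (i = 0; i < n; i++) {
        if (resolve(dsa_ext_functions[i]) == NULL)
            return 0;
    }
    return 1;
}

/* Placeholder entry point handed out by the "always stub" resolver. */
static void dummy_stub(void) {}

/* Mock resolver that returns NULL for every name, like a driver that
 * does not know these functions. */
static proc_fn resolve_null_for_unknown(const char *name)
{
    (void)name;
    return NULL;
}

/* Mock resolver that returns a stub for every name, like a dispatch layer. */
static proc_fn resolve_always_stub(const char *name)
{
    (void)name;
    return dummy_stub;
}
```

With the first mock resolver the check reports the extension as unavailable; with the second it reports it as available even though nothing behind the stub implements it.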

Mesa does not implement EXT_direct_state_access, and returns NULL when its functions are queried with GetProcAddress(). GLVND, however, provides stubs for all OpenGL functions, and returns the stub in glXGetProcAddress(). The stub will eventually end up calling out to the vendor's (Mesa's) implementation of the function when a context is created and the vendor becomes known. The upshot of this is that glvnd will return a stub for every function in EXT_direct_state_access, leading the GLEW macro to return true, and the game will attempt to use those extension functions.

The trouble begins when the game tries to use glMapNamedBufferRangeEXT(). This function is supposed to return a pointer to a memory area that maps to an OpenGL buffer. Since Mesa does not implement EXT_direct_state_access, the GLVND stub for glMapNamedBufferRangeEXT remains a no-op, and the return value is undefined. The game tries to write to the pointer that is returned (which of course is not valid as the function was a no-op), and it segfaults.
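A rough sketch of the stub behaviour described above, under stated assumptions: this is not GLVND's real dispatch code, and `map_fn`, `vendor_map_named_buffer_range_ext`, `stub_map_named_buffer_range_ext` and `fake_vendor_map` are hypothetical names with a simplified signature. The sketch returns NULL from the no-op path so that it is deterministic; in the real case the return value is undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical signature for a buffer-mapping entry point. */
typedef void *(*map_fn)(unsigned buffer, size_t offset, size_t length);

/* Vendor slot: set once the vendor is known, and left NULL when the
 * vendor does not implement the function. */
static map_fn vendor_map_named_buffer_range_ext = NULL;

/* Dispatch stub: it always has an address, so a resolver can hand it
 * out before any context exists. */
static void *stub_map_named_buffer_range_ext(unsigned buffer, size_t offset,
                                             size_t length)
{
    if (vendor_map_named_buffer_range_ext != NULL)
        return vendor_map_named_buffer_range_ext(buffer, offset, length);

    /* No vendor implementation: the call does nothing. Returning NULL is
     * an assumption of this sketch; the real return value is undefined. */
    return NULL;
}

/* Fake backing store and vendor implementation, for illustration only. */
static unsigned char fake_storage[64];

static void *fake_vendor_map(unsigned buffer, size_t offset, size_t length)
{
    (void)buffer;
    assert(offset + length <= sizeof fake_storage);
    return fake_storage + offset;
}
```

A caller that writes through the returned pointer is only safe once the vendor slot is filled; while the slot is empty, the stub hands back nothing usable.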

I brought this up in #dri-devel, and imirkin pointed me to this page (https://dri.freedesktop.org/wiki/glXGetProcAddressNeverReturnsNULL/), which explains that checking for NULL from glXGetProcAddress is not a reliable way to determine whether a function is supported. GLEW's experimental mode does exactly this and reports extensions as supported when they are not. This happens with GLVND and not with Mesa alone because Mesa alone returns NULL for these functions. With Mesa alone, the game calls glMapNamedBufferRange instead of glMapNamedBufferRangeEXT, which is just fine because Mesa implements the former.
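For comparison, the usual alternative to the NULL check is to look for the extension name in the list the driver advertises. Below is a minimal sketch of such a check; `has_extension` is a hypothetical helper, and it assumes a single space-separated string of the kind glGetString(GL_EXTENSIONS) returns in a compatibility context (a core profile context enumerates names one at a time with glGetStringi instead).

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in
 * `extensions`, and 0 otherwise. A plain strstr() is not enough, because
 * one extension name can be a prefix of another. */
static int has_extension(const char *extensions, const char *name)
{
    size_t name_len;
    const char *p;

    assert(extensions != NULL && name != NULL);
    name_len = strlen(name);
    if (name_len == 0)
        return 0;

    p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space,
         * and must end at a space or at the end of the string. */
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token = (p[name_len] == ' ') || (p[name_len] == '\0');

        if (starts_token && ends_token)
            return 1;
        p += name_len;
    }
    return 0;
}
```

An application would only resolve and call the EXT entry points when this check succeeds for "GL_EXT_direct_state_access", regardless of what glXGetProcAddress returns.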

I'm not sure what to do to fix this at the driver level without being too hacky. I've attached code for an LD_PRELOAD shim; it should be very simple to compile (see the comments in the file). This isn't ideal, as it requires end users to find and compile it, but it's something for now. The game's developers could fix it by setting glewExperimental to GL_FALSE or by improving their logic for deciding when to use EXT_direct_state_access. Or they could stop using it altogether, since I'm not really sure what the point of using that extension is anyway, even when it is available.
Comment 9 Timothy Arceri 2019-01-31 08:26:30 UTC
Thanks for investigating. The brute force fix is to finish implementing EXT_direct_state_access. I have a partial implementation here which is able to run Doom and Wolfenstein with this extension enabled[1].

Implementing the feature in Mesa shouldn't be all that difficult. The hardest part will be writing all the piglit tests needed before this will be accepted in Mesa. Currently I don't see myself working on this anytime in the near future, so anyone is free to try to pick this up.

[1] https://gitlab.freedesktop.org/tarceri/mesa/commits/EXT_direct_state_access
Comment 10 John 2019-01-31 10:45:12 UTC
Hey Timothy,

Is this something fairly easy to get into for someone with no knowledge of OpenGL or Mesa (I had a patch 5 years ago, so pretty much the same as no knowledge)? If so, unless John wants to do it, I would be willing to try.
Comment 11 Timothy Arceri 2019-02-01 12:09:50 UTC
(In reply to John from comment #10)
> Hey Timothy,
> 
> is this something fairly easy to get into for someone with no knowledge of
> OpenGL or Mesa (I've had a patch 5 years ago, so pretty much the same as no
> knowledge)? If so, unless John wants to do it, I would be willing to try.

It will mean a lot of reading/learning, but I think much of the extension could be implemented by someone with no previous OpenGL knowledge. A bunch of the support already exists because ARB_direct_state_access is supported in Mesa; things just need to be adjusted for the differences between the specs. Again, the harder part is probably writing all the tests.

The extension is quite large as it touches a load of different features so it's definitely something that could be shared across multiple developers.
Comment 12 Timothy Arceri 2019-06-29 01:42:12 UTC
A bunch of EXT_direct_state_access functions have now been implemented in Mesa master, so it's likely that Dying Light now works. It would be good if someone could give it a test.
Comment 13 John 2019-07-01 16:35:01 UTC
Yup, that worked, thank you!
