Bug 111339 - [Debug mesa]. Dirt 4 crashes after launching
Summary: [Debug mesa]. Dirt 4 crashes after launching
Status: NEW
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/intel
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Caio Marcelo de Oliveira Filho
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords: bisected, regression
Depends on:
Blocks:
 
Reported: 2019-08-09 11:48 UTC by Denis
Modified: 2019-08-18 18:17 UTC

See Also:
i915 platform:
i915 features:


Description Denis 2019-08-09 11:48:09 UTC
This was first found by @leozinho at https://bugs.freedesktop.org/show_bug.cgi?id=110295

Link to backtrace: https://bugs.freedesktop.org/attachment.cgi?id=144988

Trying to bisect the bad commit now.
Comment 1 Denis 2019-08-09 14:08:46 UTC
hmmm, I am frustrated 8-/

I had a Mesa build from yesterday, at commit 207026d29e, and it crashed. Then I started bisecting and, in the middle of it, switched from "debug" to "release" to speed up the first game launch. The crashes disappeared.
OK, so I built today's latest Mesa, both release and debug, at commit 5e38db0c47ca57c6e904f44d0d0e9ef299d14f3c

and I couldn't reproduce the crash. I tried to bisect between yesterday's and today's commits (only 4 steps) and was pointed to a commit that is 100% wrong (it is related to virgl).

Then I rebuilt Mesa at commit 207026d29e again, and now I can't reproduce the crashes there either.
I removed the Mesa shader cache (~/.cache/mesa...) and still couldn't reproduce it.

So there are two possibilities:
1. The crash was really fixed in today's Mesa version.
2. The crash is flaky, and I need to spend more time to catch it.

I will try to get a gdb backtrace for it.
Comment 2 leozinho29_eu 2019-08-10 17:10:35 UTC
I just finished the git bisect. The result is:

b6d475356846f57a034e662ab9245d11ed0dd4a0 is the first bad commit

    nir/large_constants: De-duplicate constants

The log:

git bisect start
# good: [e4e6a3deaff4f84f0fb99b4dec950dc498d507ed] panfrost: Implement FIXED formats
git bisect good e4e6a3deaff4f84f0fb99b4dec950dc498d507ed
# bad: [5a898e2a652843dbb9b013437b0715c3563cafdb] pan/midgard: Disassemble load/store barrel shift
git bisect bad 5a898e2a652843dbb9b013437b0715c3563cafdb
# bad: [5a898e2a652843dbb9b013437b0715c3563cafdb] pan/midgard: Disassemble load/store barrel shift
git bisect bad 5a898e2a652843dbb9b013437b0715c3563cafdb
# bad: [5a898e2a652843dbb9b013437b0715c3563cafdb] pan/midgard: Disassemble load/store barrel shift
git bisect bad 5a898e2a652843dbb9b013437b0715c3563cafdb
# good: [e4e6a3deaff4f84f0fb99b4dec950dc498d507ed] panfrost: Implement FIXED formats
git bisect good e4e6a3deaff4f84f0fb99b4dec950dc498d507ed
# good: [e4e6a3deaff4f84f0fb99b4dec950dc498d507ed] panfrost: Implement FIXED formats
git bisect good e4e6a3deaff4f84f0fb99b4dec950dc498d507ed
# bad: [5a898e2a652843dbb9b013437b0715c3563cafdb] pan/midgard: Disassemble load/store barrel shift
git bisect bad 5a898e2a652843dbb9b013437b0715c3563cafdb
# bad: [5a898e2a652843dbb9b013437b0715c3563cafdb] pan/midgard: Disassemble load/store barrel shift
git bisect bad 5a898e2a652843dbb9b013437b0715c3563cafdb
# bad: [5a898e2a652843dbb9b013437b0715c3563cafdb] pan/midgard: Disassemble load/store barrel shift
git bisect bad 5a898e2a652843dbb9b013437b0715c3563cafdb
# good: [e8917dcadb376168150b36d2390644186724bc25] radv: do not decompress levels without DCC with the compute path
git bisect good e8917dcadb376168150b36d2390644186724bc25
# good: [637b168470190507c89eca8a7d0479103fe236ae] nir/linker: Initialize UniformDataDefaults when using SPIR-V
git bisect good 637b168470190507c89eca8a7d0479103fe236ae
# bad: [58ee973e8737441a78c3ca49d3f8fe9db29447d0] radv/gfx10: do not use the fast depth or stencil clear bytes path
git bisect bad 58ee973e8737441a78c3ca49d3f8fe9db29447d0
# bad: [bfaca7259ca898b5aaab0e592b76eb20e593e9f9] radeonsi/gfx10: deduplicate code for esvert_lds_size
git bisect bad bfaca7259ca898b5aaab0e592b76eb20e593e9f9
# good: [06e5daf5758ffdc06a5a96ab0fe58552732e35d1] spirv_extensions: add list of extensions and to_string method
git bisect good 06e5daf5758ffdc06a5a96ab0fe58552732e35d1
# good: [40e760960319bc8c9ee943c3d8136e23ef474d59] v3d: Fix assertion failures in debug builds.
git bisect good 40e760960319bc8c9ee943c3d8136e23ef474d59
# bad: [09a8a39940ad02951b62454a5d222af669fef694] util: use standard name for strchrnul()
git bisect bad 09a8a39940ad02951b62454a5d222af669fef694
# bad: [e38b93087638781ef83c9b3cc3bb424e448a5380] nir/lower_clip: add a find_clipvertex_and_position_outputs() helper
git bisect bad e38b93087638781ef83c9b3cc3bb424e448a5380
# bad: [d56f92502e21767b7f755fa7a093502b2d01ed91] panfrost: Shrink tiler heap
git bisect bad d56f92502e21767b7f755fa7a093502b2d01ed91
# good: [61098baf42fc0026900a67b86336ad90fc0966a2] freedreno: Convert load_barycentric_at_offset to the NIR lowering helper.
git bisect good 61098baf42fc0026900a67b86336ad90fc0966a2
# good: [0d8a4c67cf44604d648696e007740bd9fa9faa4c] freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper.
git bisect good 0d8a4c67cf44604d648696e007740bd9fa9faa4c
# bad: [b6d475356846f57a034e662ab9245d11ed0dd4a0] nir/large_constants: De-duplicate constants
git bisect bad b6d475356846f57a034e662ab9245d11ed0dd4a0
# good: [d9b67ad0796612620b82b7ea11a720735ce7df3f] nir/large_constants: Use ralloc for var_infos
git bisect good d9b67ad0796612620b82b7ea11a720735ce7df3f
# first bad commit: [b6d475356846f57a034e662ab9245d11ed0dd4a0] nir/large_constants: De-duplicate constants

Some of the dmesg errors while bisecting:

[51890.577528] traps: F3DWarmer.2[31485] general protection ip:7fe51bab73c0 sp:7fe4877d0490 error:0 in libvulkan_intel.so[7fe51b819000+526000]
[52467.973758] traps: IdxD3D11_1[3725] general protection ip:7f8b1c6a43c0 sp:7f8a82fd74b0 error:0 in libvulkan_intel.so[7f8b1c406000+526000]
[55452.557728] traps: IdxD3D11_1[1552] general protection ip:7fb3f60623c0 sp:7fb35afde4b0 error:0 in libvulkan_intel.so[7fb3f5dc4000+526000]
[55862.007032] traps: IdxD3D11_1[8383] general protection ip:7f8ee8ea5060 sp:7f8e4ffe04b0 error:0 in libvulkan_intel.so[7f8ee8c07000+526000]
[56103.707751] traps: IdxD3D11_1[13176] general protection ip:7fe334a2b060 sp:7fe29afde4b0 error:0 in libvulkan_intel.so[7fe33478d000+526000]
[57049.055105] traps: IdxD3D11_1[28081] general protection ip:7f499bab7060 sp:7f490d7db4b0 error:0 in libvulkan_intel.so[7f499b819000+526000]
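
A note on reading these lines: subtracting the bracketed mapping start from the ip value gives the faulting address relative to that mapping of libvulkan_intel.so. The sketch below does this arithmetic in POSIX sh. The function name trap_offset is made up for illustration, and the result is only a library-relative offset if the mapping shown starts at the library's load base, which is an assumption here.

```shell
#!/bin/sh
# trap_offset LINE: print the faulting address relative to the start of the
# mapping named in a kernel "traps:" line, i.e. ip minus the bracketed base.
# (Hypothetical helper, not part of any existing tool.)
trap_offset() {
    # Hex value after "ip:".
    ip=$(printf '%s\n' "$1" | sed -n 's/.* ip:\([0-9a-f]*\) .*/\1/p')
    # Hex value between the last "[" and the "+" at the end of the line.
    base=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\]$/\1/p')
    printf '0x%x\n' "$((0x$ip - 0x$base))"
}

trap_offset '[51890.577528] traps: F3DWarmer.2[31485] general protection ip:7fe51bab73c0 sp:7fe4877d0490 error:0 in libvulkan_intel.so[7fe51b819000+526000]'
# prints 0x29e3c0
```

Applied to the six lines above, the first three give 0x29e3c0 and the last three give 0x29e060, so within each group the fault is at the same place in the library. The two values may simply come from different builds during the bisect. With the matching binary built with debug info, something like `addr2line -f -e libvulkan_intel.so 0x29e3c0` could then name the function, under the same load-base assumption.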
Comment 3 Denis 2019-08-12 08:30:43 UTC
Sadly, I can't reproduce the crash even on the specified commit.
Leozinho, did you clean the game cache before running the game? Maybe the crash is related to that?

/run/media/manjaro/a244962e-96b2-4c41-a8df-5609424527a0/SteamLibrary/steamapps/shadercache/421020

and ~/.cache/mesa_shader_cache/

I am testing on CFL; in your case, as I remember, it was SKL. But I don't believe it is platform-specific, because I reproduced it as well.
Comment 4 Denis 2019-08-12 12:23:00 UTC
I got 1 crash after about 10 runs with different Mesa versions (even on the bisected commit). So for some reason it has become random for me.
>I will try to get a gdb backtrace for it
It was useless because it showed only ??? instead of function names.
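
When a crash shows up only about once in 10 runs, a single launch per commit can mislabel a bad commit as good. One way to handle this is to run the test several times per commit and treat any failure as bad. The sketch below assumes the launch can be scripted as a command that exits non-zero on a crash. The names count_failures, bisect_verdict and ./launch_dirt4.sh are hypothetical, not existing tools.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]: run CMD N times and print how many runs
# exited with a non-zero status (a process killed by a signal counts too).
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# bisect_verdict N CMD [ARGS...]: return 1 ("bad") if any of the N runs
# failed, 0 ("good") otherwise, matching what `git bisect run` expects.
bisect_verdict() {
    if [ "$(count_failures "$@")" -gt 0 ]; then
        return 1
    fi
    return 0
}

# Hypothetical usage from a script given to `git bisect run`:
#   bisect_verdict 10 ./launch_dirt4.sh
```

`git bisect run` treats exit status 0 as good and 1 to 127 (except 125, which means skip) as bad, so a wrapper script can rebuild Mesa and then end with the verdict. More runs per commit lower the chance of a false "good", at the cost of a longer bisect.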
Comment 5 leozinho29_eu 2019-08-12 23:19:55 UTC
I tried deleting the cache directories ~/.cache/mesa_shader_cache/ and steamapps/shadercache/421020. After deleting them, I tested both commits: d9b67ad0796612620b82b7ea11a720735ce7df3f (the last good commit) and b6d475356846f57a034e662ab9245d11ed0dd4a0 (the first bad commit). I still got the crash with the bad commit, and with the good commit the game worked (but was affected by bug 110295).
Comment 6 Denis 2019-08-15 15:59:49 UTC
:(
I took my SKL machine. I was able to reproduce the issue on mesa-master, but after about 5-10 crashes it stopped crashing and launched normally.

Interestingly, whether or not I swapped the Mesa libs after that, the game no longer took a long time to "load" (as you mentioned, the first launch with "custom" libs usually takes about 5-10 minutes).

I also built a "debug" Mesa and tried to get a gdb backtrace, but it doesn't look good.

@Leozinho, could you please build Mesa with gdb symbols and try to get a gdb backtrace?

Below you can find how I built mesa, and my core-dump output:

export CFLAGS='-O0 -ggdb3 -g'
export CXXFLAGS='-O0 -ggdb3 -g'

meson setup . mbuild_dbg_x64 -Dbuildtype=debug --prefix=/home/ubuntu/mesa_versions/mesa-git-15.08 -Dvalgrind=false -Ddri-drivers=i965 -Dgallium-drivers=iris -Dvulkan-drivers=intel -Dgallium-omx="disabled" -Dplatforms=x11,drm,surfaceless -Dtools=intel -Db_ndebug=true

ninja -C ./mbuild_dbg_x64/ install

coredump:

ubuntu@ubuntu:~/mesa$ gdb '/home/ubuntu/.steam/steam/steamapps/common/DiRT 4/bin/Dirt4' '/home/ubuntu/.steam/steam/steamapps/common/DiRT 4/bin/core' 
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/ubuntu/.steam/steam/steamapps/common/DiRT 4/bin/Dirt4...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 28855]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/ubuntu/.steam/steam/steamapps/common/DiRT 4/bin/Dirt4'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000001dbf10f in ?? ()
(gdb) bt
#0  0x0000000001dbf10f in  ()
#1  0x0000000001daa8eb in  ()
#2  0x0000000001e5a612 in  ()
#3  0x0000000001d9da45 in  ()
#4  0x0000000001ddd94d in  ()
#5  0x0000000001ddd592 in  ()
#6  0x0000000001d9a6eb in  ()
#7  0x0000000001da5c9f in  ()
#8  0x0000000001d9fbc8 in  ()
#9  0x0000000001d9fefb in  ()
#10 0x0000000001a9252a in  ()
#11 0x0000000001951a9c in  ()
#12 0x000000000195339e in  ()
#13 0x0000000002840cef in  ()
#14 0x00007fd6ac6bf6db in start_thread (arg=0x7fd4ebaa1700) at pthread_create.c:463
#15 0x00007fd6a1f1c88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Comment 7 leozinho29_eu 2019-08-17 03:58:08 UTC
I tried building Mesa with the same settings as yours, using both gcc-7 and gcc-8 (and their g++ counterparts), and in both cases the game worked well, with no crashes. What triggers this bug is probably some difference in build settings.

My build commands are based on the build settings used on Ubuntu to build Mesa, so the meson command is really, really long:

env PREFIX="/usr/local/mesa64" \
CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wall" \
CPPFLAGS="-Wdate-time -D_FORTIFY_SOURCE=2" CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wall" \
FCFLAGS="-g -O2 -fstack-protector-strong" FFLAGS="-g -O2  -fstack-protector-strong" \
GCJFLAGS="-g -O2 -fstack-protector-strong" LDFLAGS="-Wl,-Bsymbolic-functions -Wl,-z,relro" \
OBJCFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security" \
OBJCXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security" \
PKG_CONFIG_PATH="/usr/local/mesa64/lib/pkgconfig" CC=/usr/bin/gcc-8 CXX=/usr/bin/g++-8 /usr/local/bin/meson build/ \
-Dprefix="/usr/local/mesa64"  -Dlibdir="/usr/local/mesa64/lib" \
-Dplatforms="x11,drm,surfaceless" -Ddri3=true  \
-Ddri-drivers="i965" -Dgallium-drivers="iris,swrast,virgl" -Dgallium-vdpau=false -Dgallium-xvmc=false \
-Dgallium-omx=disabled -Dgallium-va=false -Dgallium-xa=false -Dgallium-nine=false -Dgallium-opencl=disabled \
-Dvulkan-drivers="intel" -Dshader-cache=true -Dshared-glapi=true -Dgles1=true -Dgles2=true -Dopengl=true -Dgbm=true \
-Dglx=dri -Degl=true -Dglvnd=true -Dasm=true -Dllvm=true -Dlmsensors=true -Dosmesa=gallium -Dosmesa-bits=8 -Dglx-direct=true

I will try disabling certain settings and removing certain environment variables to check which parameter is causing the crash.
Comment 8 leozinho29_eu 2019-08-17 05:24:11 UTC
After removing those environment variables and setting -Dglx-direct=false (I don't know why this was relevant), I got the following backtrace:

(gdb) bt
#0  0x00007fffb9f9f874 in unsafe_free (info=0x7fffc6486a60) at ../src/util/ralloc.c:297
#1  0x00007fffb9f9f847 in unsafe_free (info=0x7fffc5cdf790) at ../src/util/ralloc.c:292
#2  0x00007fffb9f9f847 in unsafe_free (info=0x7fffc43a3050) at ../src/util/ralloc.c:292
#3  0x00007fffb9f9f768 in ralloc_free (ptr=0x7fffc43a3080) at ../src/util/ralloc.c:262
#4  0x00007fffb9cbb2f7 in anv_pipeline_compile_graphics (pipeline=0x7fffc4a1d320, cache=0x7fff9538bd50, info=0x7fffc4a2d4e0) at ../src/intel/vulkan/anv_pipeline.c:1428
#5  0x00007fffb9cbc7a3 in anv_pipeline_init (pipeline=0x7fffc4a1d320, device=0x7fff9538bf80, cache=0x7fff9538bd50, pCreateInfo=0x7fffc4a2d4e0, alloc=0x7fff9538bf88) at ../src/intel/vulkan/anv_pipeline.c:1911
#6  0x00007fffb9d98618 in gen9_graphics_pipeline_create (_device=0x7fff9538bf80, cache=0x7fff9538bd50, pCreateInfo=0x7fffc4a2d4e0, pAllocator=0x0, pPipeline=0x7fff23fee268) at ../src/intel/vulkan/genX_pipeline.c:2115
#7  0x00007fffb9d9900d in gen9_CreateGraphicsPipelines (_device=0x7fff9538bf80, pipelineCache=0x7fff9538bd50, count=1, pCreateInfos=0x7fffc4a2d4e0, pAllocator=0x0, pPipelines=0x7fff23fee268)
    at ../src/intel/vulkan/genX_pipeline.c:2365
#8  0x00007fffb8ba8078 in  () at /home/usuario/.local/share/Steam/ubuntu12_64/libVkLayer_steam_fossilize.so
#9  0x00007fffba5cf492 in vkCreateGraphicsPipelines () at /usr/local/mesa64/lib/libvulkan.so.1
#10 0x0000000001dd2fb1 in  ()
#11 0x0000000001dd3b6a in  ()
#12 0x0000000001dd41eb in  ()
#13 0x0000000001d9f293 in  ()
#14 0x0000000001d9f69c in  ()
#15 0x000000000196c336 in  ()
#16 0x0000000001a9219a in  ()
#17 0x00000000019517a8 in  ()
#18 0x000000000195339e in  ()
#19 0x0000000002840cef in  ()
#20 0x00007ffff6ef76db in start_thread (arg=0x7fff23fef700) at pthread_create.c:463
#21 0x00007fffec75488f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Comment 9 leozinho29_eu 2019-08-17 17:01:47 UTC
With -Dosmesa=gallium, DiRT 4 crashes even before any stage element can be seen. With -Dosmesa=none or -Dosmesa=classic, I am SOMETIMES able to watch the introduction of the stage, see the positions, configure the car, start the race and race a bit before the game crashes. Other times, the game just crashes.

I noticed I can tell before the stage loads whether the game is going to crash early. If messages like the following appear, the game crashes before anything from the stage appears:

SPIR-V WARNING:
    In file ../src/compiler/spirv/spirv_to_nir.c:826
    Decoration not allowed on struct members: SpvDecorationRestrict
    1388 bytes into the SPIR-V binary

If these messages do not appear, I am able to play the race for a few seconds, and then the game crashes later.

Hopefully these backtraces provide useful information.
Comment 10 leozinho29_eu 2019-08-18 18:17:17 UTC
After enough attempts, I discovered that the problem is not a specific build setting or environment variable. The problem is the Vulkan version used to build Mesa.

If I use the default version from Ubuntu 18.04, the game always works, no matter the Mesa git version. However, once I build Mesa git against Vulkan git (the current Vulkan-Headers version is 23b2e8e64bdf3f25b3d73f1593e72977ebfcd39b and the Vulkan-Loader version is fdc5ec43b00e03db432cb8b8bc9bdafc9599c522), the results from the bisect I did are valid.

I used PKG_CONFIG_PATH when using meson and ninja commands to point to the Vulkan git versions when building Mesa git and doing the bisect.
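
For anyone trying to reproduce this, it may help to confirm which Vulkan version pkg-config resolves before configuring Mesa. The sketch below is only an illustration: VULKAN_PREFIX and its default /opt/vulkan-git are hypothetical, and it assumes the git build of Vulkan-Loader installed a vulkan.pc file under lib/pkgconfig in that prefix.

```shell
#!/bin/sh
# prepend_path DIR CURRENT: print DIR in front of a colon-separated list,
# without leaving a trailing colon when CURRENT is empty.
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Hypothetical install prefix of the git builds of Vulkan-Headers and
# Vulkan-Loader; adjust to wherever they were actually installed.
VULKAN_PREFIX=${VULKAN_PREFIX:-/opt/vulkan-git}

# report_vulkan: print the Vulkan version pkg-config finds when the git
# prefix is searched first (assumes a vulkan.pc exists there).
report_vulkan() {
    PKG_CONFIG_PATH=$(prepend_path "$VULKAN_PREFIX/lib/pkgconfig" "$PKG_CONFIG_PATH") \
        pkg-config --modversion vulkan
}
```

Running report_vulkan with and without the prefix in place should print different versions if the two Vulkan installs really differ. The same PKG_CONFIG_PATH value would then be exported for the meson and ninja commands.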

