Bug 105775 - SI reaches the maximum IB size in dwords and fail to submit
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/radeon
Version: 17.3
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium blocker
Assignee: mesa-dev
QA Contact: mesa-dev
 
Reported: 2018-03-28 01:50 UTC by Amarildo
Modified: 2018-04-20 16:15 UTC (History)

Attachments
One of the dumps generated by the game. (39 bytes, text/plain)
2018-03-28 01:50 UTC, Amarildo
New dmp (981.42 KB, application/octet-stream)
2018-03-29 10:35 UTC, Amarildo
dmp when starting a race (1.10 MB, application/octet-stream)
2018-03-29 11:08 UTC, Jacob

Description Amarildo 2018-03-28 01:50:22 UTC
Created attachment 138388 [details]
One of the dumps generated by the game.

F1 2017 starts on GCN 1.0 and I can do "Time Trial" just fine. Performance is really good, seems above 60 FPS on "High" settings.

However, in an actual race the game crashes when leaving the pits. It works fine in the garage and while driving through the pits, but right after leaving the pits it crashes and gives me the following error:

https://www.gamingonlinux.com/uploads/articles/article_media/4340454031509720557gol714.png

Just note that the image above is not from my computer. The error is the same, though.

I'm running:

Debian Jessie x86_64
Kernel 4.15.4-1
Mesa 17.3.7
amdgpu Kernel driver
mesa-vulkan-drivers 17.3.7
vulkan-utils 1.1.70
AMD FX 6300 @4.4 GHz
AMD Sapphire R9 270X
Comment 1 Samuel Pitoiset 2018-03-28 09:14:04 UTC
Hi,

Can you build mesa with debug symbols and attach a backtrace?
Comment 2 Amarildo 2018-03-28 16:21:05 UTC
I'll try, although I'm just a regular user ;-) 

Which exact package do I need to rebuild? I'd think it's not necessary to re-build everything, perhaps just "mesa-vulkan-drivers"?
Comment 3 Samuel Pitoiset 2018-03-28 19:04:14 UTC
Well, the main problem is that I don't have any GCN 1.0 cards and when I tried on Polaris it didn't crash...

You will need to clone mesa from https://cgit.freedesktop.org/mesa/mesa/ and build it. Let me know if you need help.
Comment 4 Amarildo 2018-03-28 21:43:10 UTC
OK, thanks :)

I've looked into this[1] short explanation, but I have no "Make-config" file after cloning mesa.

[1] https://www.mesa3d.org/debugging.html

Is there another file I can put "-DDEBUG" to?
Comment 5 Bas Nieuwenhuizen 2018-03-28 23:20:37 UTC
It might be easier to install packages with the debug symbols from your distro:

https://wiki.debian.org/HowToGetABacktrace

though I don't know offhand which Debian package contains radv.

Otherwise try

CFLAGS="-O0 -g" ./autogen.sh --with-gallium-drivers= --with-dri-drivers= --with-egl-platforms=x11,drm  --enable-debug --with-vulkan-drivers=radeon

and then run the game with "set launch options" set to

VK_ICD_FILENAMES=${MESA_DIR}/src/amd/vulkan/dev_icd.json %command%

with ${MESA_DIR} replaced by the path to the git clone.
Comment 6 Amarildo 2018-03-28 23:34:26 UTC
Thanks Bas. It seems "mesa-vulkan-drivers" has debug symbols enabled for that dbg package[1]. I'll download it then run the game and attach any log files here. If that's not enough (e.g. if radv is not present in mesa-vulkan-drivers) I'll rebuild all the packages and then redo the process.

---------------------------------------------

[1]
[code]amarildo@amarildo:~$ apt-cache search mesa | grep dbg
libglw1-mesa-dbgsym - Debug symbols for libglw1-mesa
libegl-mesa0-dbgsym - debug symbols for libegl-mesa0
libgl1-mesa-dri-dbgsym - debug symbols for libgl1-mesa-dri
libglapi-mesa-dbgsym - debug symbols for libglapi-mesa
libglx-mesa0-dbgsym - debug symbols for libglx-mesa0
libosmesa6-dbgsym - debug symbols for libosmesa6
libwayland-egl1-mesa-dbgsym - debug symbols for libwayland-egl1-mesa
mesa-opencl-icd-dbgsym - debug symbols for mesa-opencl-icd
mesa-va-drivers-dbgsym - debug symbols for mesa-va-drivers
mesa-vdpau-drivers-dbgsym - debug symbols for mesa-vdpau-drivers
mesa-vulkan-drivers-dbgsym - debug symbols for mesa-vulkan-drivers
mesa-utils-dbgsym - debug symbols for mesa-utils
mesa-utils-extra-dbgsym - debug symbols for mesa-utils-extra
[/code]
Comment 7 Dave Airlie 2018-03-29 01:02:14 UTC
Just FYI, 

Tahiti GPU, no crash here, I did one lap of Melbourne and entered the pits and exited again.
Comment 8 Amarildo 2018-03-29 01:08:07 UTC
Meanwhile, could you test with Firejail?

On Arch Linux

pacman -S firejail

On Debian/Ubuntu/Family

apt install firejail

Then edit:

/etc/firejail/steam.profile

and comment the following lines:

#seccomp
#private-dev

The game didn't start at first, I had to comment the 'seccomp' line. I'm afraid it (firejail) has something to do with the crash, but I'm not sure.

I installed "mesa-vulkan-drivers-dbgsym" and ran Steam inside gdb and firejail (via "STEAM_DEBUGGER=gdb firejail --allow-debuggers steam") but it was like I wasn't debugging it at all.

So if you may, please install and run steam within firejail to see if that causes F1 2017 to crash.

To run Steam through firejail, after commenting out the necessary lines above in its firejail profile, do:

firejail steam

Thanks
Comment 9 Amarildo 2018-03-29 01:09:35 UTC
BTW, how did you do that lap? Because if you're alone, e.g. in a time-trial event, the game runs fine. It's when running, e.g., a Race Weekend and coming out of the pits that the game crashes.
Comment 10 Dave Airlie 2018-03-29 01:14:39 UTC
I did a championship lap, there were no other cars on the screen as I'm no good at the game, they were in the lap somewhere.
Comment 11 Amarildo 2018-03-29 01:51:31 UTC
Firejail isn't the issue. Ran Steam outside of it.

Then I tried compiling mesa with the above suggestion. It says "configure: error: --enable-llvm is required when building radv", and when I enable it, it says "configure: error: --enable-llvm selected but llvm-config is not found", and no such file exists.

This is so frustrating.

Gaming on Linux with Pitcairn has never been easy, and AMD's support of my card has always been lacking, delayed, and problematic. Sadly, it's times like these that make me wanna go to Windows, everything "just works there", drivers and games are actually well tested, and regular users never need to debug themselves and compile programs to test stuff.
Comment 12 Dave Airlie 2018-03-29 03:11:20 UTC
-2 looks like out of device memory, you might have the game settings too high, or too high a resolution.
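
For reference, the -2 above is a raw VkResult value. Here is a minimal sketch of decoding it, assuming a hand-copied subset of the core VkResult values from the Vulkan headers. The helper name `vk_result_name` is made up for illustration and is not part of any Vulkan or Mesa API.

```python
# Subset of core VkResult values, copied by hand from the Vulkan headers.
# This table is intentionally incomplete; it only covers a few common codes.
VK_RESULT_NAMES = {
    0: "VK_SUCCESS",
    -1: "VK_ERROR_OUT_OF_HOST_MEMORY",
    -2: "VK_ERROR_OUT_OF_DEVICE_MEMORY",
    -3: "VK_ERROR_INITIALIZATION_FAILED",
    -4: "VK_ERROR_DEVICE_LOST",
}


def vk_result_name(code):
    """Hypothetical helper: map a raw VkResult integer to its enum name."""
    return VK_RESULT_NAMES.get(code, "unknown VkResult ({})".format(code))


print(vk_result_name(-2))  # prints VK_ERROR_OUT_OF_DEVICE_MEMORY
```

Note that the name only says what the driver reported; it does not by itself prove that VRAM was actually exhausted.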
Comment 13 Amarildo 2018-03-29 03:19:05 UTC
Usually I test all my games on max graphical settings, that includes X-Plane, Project Cars 2, GTA V, Far Cry 4, and so on, all on 1080p with no crashes and usually close to 60 FPS. VRAM surely runs high, so I tend to use a combination of High/Very High and Ultra settings to get constant 60 FPS on these games.

While testing F1 2017 on Linux, the game ran fine at "High" settings while driving by myself.

I also tested it with all graphical options on the lowest possible and 720p, but it still crashes.

Maybe Vulkan handles VRAM differently? Surely the game shouldn't use 2 GB of VRAM while all graphics settings are on "Ultra Low" and 720p.
Besides, the R9 270X ran the game fine while using the amdgpu-pro driver, as per Phoronix results. So I personally don't see VRAM getting full, but I'll still try to find a way of monitoring VRAM usage and will try the game again.
Comment 14 Alex Smith 2018-03-29 08:18:36 UTC
The crash is in the game code so I don't think a Mesa backtrace would help. It looks like the dump file hasn't been attached properly - could you re-attach it so I can look at what call is failing?
Comment 15 Amarildo 2018-03-29 10:34:02 UTC
Ah, I thought it was a RADV problem.

I'll attach a new dump below.
Comment 16 Amarildo 2018-03-29 10:35:10 UTC
Created attachment 138415 [details]
New dmp
Comment 17 Alex Smith 2018-03-29 10:42:05 UTC
Thanks.

The failing call is vkEndCommandBuffer. That matches with a few other crashes we have logged on GCN 1.0 cards.

RADV devs, any ideas what might cause that to happen for 1.0 cards specifically? We've never seen it on newer cards. I doubt it would be really running out of VRAM when running on lowest settings at 720p.

Dave, are you able to reproduce it if you run the benchmark at high settings or something like that?
Comment 18 Jacob 2018-03-29 11:08:29 UTC
Created attachment 138416 [details]
dmp when starting a race

Just to chime in, when running everything on low at 1600x900 windowed, the game crashes for me too on a 270X with 4 GB of VRAM, so if this dmp contains the same failing call, it sounds extremely unlikely that it's due to too little VRAM.
Comment 19 Alex Smith 2018-03-29 11:11:20 UTC
Yes, same call failing there too.
Comment 20 Samuel Pitoiset 2018-03-29 18:37:45 UTC
FYI, I can actually reproduce the crash on Polaris, I will investigate tomorrow.

Thanks for all the details.
Comment 21 Samuel Pitoiset 2018-03-30 16:08:07 UTC
That's a critical issue. Only SI is affected but the problem can be reproduced with recent chips and RADV_DEBUG=noibs. I have started to work on this but the fix isn't yet working.
Comment 22 Amarildo 2018-03-30 18:14:39 UTC
Thanks, Samuel. Looking forward to the fix.

If there's anything I can do to speed testing, let me know.
Comment 23 Samuel Pitoiset 2018-04-20 12:23:51 UTC
Here's the fix https://patchwork.freedesktop.org/patch/218066/

I have tested it using RADV_DEBUG=noibs with Dawn Of War III, that seems to work.
Comment 24 Samuel Pitoiset 2018-04-20 16:15:20 UTC
Should be fixed with https://cgit.freedesktop.org/mesa/mesa/commit/?id=fedd0a4215bcd387525000d76b77993ca38916ae

The limitation of 4 IBs per submission is quite annoying, but I will fix that later on in master (don't expect any backports because this would require a new libdrm release).

Thanks!
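
To illustrate the limit this bug is about: when a command stream outgrows the maximum size of a single IB, it has to be split across several IBs, and a submission can only carry a fixed number of them. The sketch below shows only that arithmetic. `MAX_IB_DWORDS` is a placeholder picked for readability and is not the real SI limit, `MAX_IBS_PER_SUBMIT = 4` is taken from the comment above, and `split_into_ibs` is a hypothetical name. This is not the code from the commit.

```python
from typing import List

MAX_IB_DWORDS = 1000      # placeholder value for illustration, NOT the real SI limit
MAX_IBS_PER_SUBMIT = 4    # per-submission limit mentioned in the comment above


def split_into_ibs(total_dwords: int) -> List[int]:
    """Hypothetical helper: split a command stream of `total_dwords` dwords
    into IB sizes no larger than MAX_IB_DWORDS.

    Raises OverflowError if more than MAX_IBS_PER_SUBMIT IBs would be needed.
    """
    if total_dwords < 0:
        raise ValueError("total_dwords must be non-negative")

    sizes = []
    remaining = total_dwords
    while remaining > 0:
        chunk = min(remaining, MAX_IB_DWORDS)  # fill each IB as far as allowed
        sizes.append(chunk)
        remaining -= chunk

    if len(sizes) > MAX_IBS_PER_SUBMIT:
        raise OverflowError(
            "needs {} IBs, but only {} fit in one submission".format(
                len(sizes), MAX_IBS_PER_SUBMIT
            )
        )
    return sizes
```

With these placeholder numbers, `split_into_ibs(2500)` gives three IBs, while anything above 4000 dwords no longer fits in one submission, which is the kind of ceiling the remaining limitation refers to.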

