Bug 101408

Summary: [Gen8+] Xonotic fails to render one of the weapons
Product: Mesa
Reporter: Ian Romanick <idr>
Component: Drivers/DRI/i965
Assignee: Sergii Romantsov <sergii.romantsov>
Status: RESOLVED FIXED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal
Priority: medium
CC: andriy.khulap, cefiar, sergii.romantsov
Version: git
Hardware: Other
OS: All
Attachments: Gun missing on BDW
Gun rendered correctly on BYT
Vertex shader for the frame where mortar doesn't draw on map
Fragment shader for the frame where mortar doesn't draw on map
Issue with some bot's head 1
Issue with some bot's head 2
Issue with some bot's head 3 config
Issue with some bot's head 4 trace

Description Ian Romanick 2017-06-13 18:45:53 UTC
Created attachment 131929 [details]
Gun missing on BDW

One of the guns in Xonotic does not render correctly on BDW.  However, the same gun does render correctly on BYT.  At Ken's suggestion, I tried INTEL_SCALAR_VS=0, but that did not help.

With some additional modifications to the driver, I can add '+r_glsl_skeletal 1' to the Xonotic command line.  This causes Xonotic to use a different animation system that uses UBOs.  In this scenario, the gun is rendered correctly.

I have also tried back as far as Mesa 13.0.0, and the bug still existed there.  I will try older versions.

I have not tried hardware other than BDW and BYT.
Comment 1 Ian Romanick 2017-06-13 18:46:14 UTC
Created attachment 131930 [details]
Gun rendered correctly on BYT
Comment 2 Ian Romanick 2017-06-13 18:49:44 UTC
My Xonotic command line is:

MESA_EXTENSION_OVERRIDE=+GL_EXT_texture_compression_s3tc xonotic-glx \
    +r_texture_dds_load 1 +developer 1 -nohome -benchmarkruns 1 \
    -benchmark demos/the-big-keybench.dem
Comment 3 Kenneth Graunke 2017-06-13 19:16:18 UTC
Confirmed on Skylake as well.

always_flush_batch=true, INTEL_DEBUG=sync, and re-flagging all dirty bits on every draw call don't help.

Another idea is it could be the vertex code - Gen8+ did change a fair bit.

At this point, it's probably best to take an apitrace, dump images for draw calls, see which one went awry, and see what it was doing.
Comment 4 Ian Romanick 2017-06-13 20:14:14 UTC
I have verified that the problem exists as far back as Mesa 10.2.0.  I was unable to get earlier versions to build and run on BDW.
Comment 5 Ian Romanick 2017-06-13 22:12:56 UTC
I feel pretty certain that the problem is not shader related.  I ran the benchmark on the same BDW system while an I/O intensive process was running in the background (rsync to a USB-attached disk), and the gun is correctly rendered /sometimes/.
Comment 6 Stuart Young 2017-09-30 23:31:59 UTC
Present on Kabylake as well.

I've been meaning to log this bug for a while (since ~17.0) but haven't had time or a suitable environment to do a proper investigation (and MESA builds). Technically still don't. :/

I've done a few apitraces on this previously. Have noticed that the first frame always renders, then only the transparent part of the weapon renders (the sights) thereafter.

Just before 0.8.2 was released, some weapon models were updated to a new look, and it's only the new models which seem to have issues. The Mortar model shows the issue most consistently and is therefore the easiest to replicate.

Have confirmed that the exact same model/setup renders on non-intel hardware (Radeon and Nvidia) using free and proprietary drivers.

To reproduce this in Xonotic, you can use the following commands without connecting to a server:

Install Xonotic 0.8.2. It has all the necessary maps and the affected weapon models. I recommend installing this from the default download from xonotic.org as it avoids any possible issues with packaged versions. You probably want to run the game and set the graphics resolution and so on to match your display.

Once set up, run Xonotic with the cmdline options of `+map erbium` (eg: ./xonotic-linux-glx.sh +map erbium ). This launches Xonotic and loads the game map `erbium` (a default map included in 0.8.2), which contains the mortar model.

Once in the game, access the console and enter:

prvm_edictset server 1 fixangle \"1\"; prvm_edictset server 1 angles \"0 0 0\"; prvm_edictset server 1 origin \"1324.64856 -1104.02734 449.284454\"

This will take you to the Mortar model on the map `erbium` (the pickup location of the mortar), so you can see the issue. The model should be spinning right in front of the viewer after this command is entered.

The command will be in the cmd history in future if you need it, so you don't have to type it in every time. There's a method to do this with Xonotic's autoexec.cfg if you want to get the shortest possible apitrace, though you can't skip the loading screens which adds a lot of extra data to the trace.

If you can't see the model, you're experiencing the bug. You can always use 'r_glsl_skeletal 1' to change the rendering to skeletal animation so you can see that it's there. When you exit, make sure that `r_glsl_skeletal 0`, as the state of the r_glsl_skeletal variable is saved on exit.
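
For anyone who wants to jump to a different spot on the map, the console line above can be generated rather than retyped. This is only a sketch: `teleport_command` is a hypothetical helper name, and the backslash-escaped quotes are reproduced exactly as they appear in the command above, on the assumption that this is the form the console expects.

```python
def teleport_command(origin, angles=("0", "0", "0"), edict=1):
    """Build the prvm_edictset console line for a given origin.

    origin and angles are sequences of strings so that the coordinates
    are emitted exactly as supplied, with no float reformatting.
    """
    def quoted(value):
        # Reproduce the \"...\" quoting used in the command above.
        return '\\"' + value + '\\"'

    origin_text = " ".join(origin)
    angles_text = " ".join(angles)
    parts = [
        f"prvm_edictset server {edict} fixangle {quoted('1')}",
        f"prvm_edictset server {edict} angles {quoted(angles_text)}",
        f"prvm_edictset server {edict} origin {quoted(origin_text)}",
    ]
    return "; ".join(parts)


if __name__ == "__main__":
    # Coordinates of the mortar pickup on erbium, as given above.
    print(teleport_command(("1324.64856", "-1104.02734", "449.284454")))
```

Passing the mortar coordinates reproduces the command given above; other coordinates are untested here.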

Models are in the xonotic-data.pk3 file (in the models/weapons/ directory). The pk3 build files are on github at https://github.com/xonotic/xonotic-data.pk3dir if you just want to pull individual files.

Model files for the mortar (aka grenade launcher): g_gl_luma.iqm  g_gl_luma.iqm_0.skin  g_gl_luma.tga  g_gl.md3  g_gl_simple.iqm  g_gl_simple.iqm_0.skin  g_gl_simple.tga  h_gl.iqm  h_gl.iqm.framegroups  v_gl.md3

Note: Xonotic doesn't use skeletal animation by default as the underlying game engine (darkplaces) has known issues with this mode in certain situations. There's some work going on to port Xonotic to another engine, but that's really slow going and not really relevant to the bug.
Comment 7 Andriy Khulap 2018-02-15 11:13:21 UTC
I am able to reproduce this bug on:
- Intel(R) HD Graphics 530 (Skylake GT2)  (0x191b)
- Intel(R) Core(TM) i5-6440HQ CPU @ 2.60GHz
- latest Debian unstable
- Mesa 17.3.3 (from Debian install) and latest git master 18.1.0-devel (git-aad14cf15a)

The mortar weapon is not visible when equipped; only the aim is.
The mortar is not visible on the terrain either. It should float and spin like the other weapons that can be picked up, but it is transparent.
Switching r_glsl_skeletal to 1 makes it visible in both cases.
Comment 8 Andriy Khulap 2018-02-22 15:59:58 UTC
The following in-game options affect the issue:
(settings->video)
- Vertex Buffer Objects (VBOs) = OFF
- Use OpenGL 2.0 shaders (GLSL) = unchecked

Using one or both of these options makes the mortar visible on the map and when held.

This bug is not present on Haswell and very old 4th gen G45.
On Skylake (the system from message above) bug is present even in mesa-11.0.0-rc1

The recorded apitrace is here: https://drive.google.com/file/d/1C74z4bkxHkhMW_b0g8Upwbs1fEE2_57U/view
The last "good" frame is 543.
Comment 9 Mark Janes 2018-02-22 16:59:25 UTC
Thanks for narrowing this down Andriy.

You can confirm that this is a driver bug (not an app bug) by retracing the working & broken trace files on other drivers/hardware:

 - Intel Windows OpenGL
 - AMD Radeon
 - Nvidia 

If they render better than i965 Mesa, then it is probably a driver bug.

Secondly, you can find the problematic shader using FrameRetrace:

  https://github.com/janesma/apitrace/wiki/frameretrace-branch

For a short demo of how to find the shader for the bad render, watch this video starting around the 10min mark:

 https://fosdem.org/2018/schedule/event/apitrace/
Comment 10 Andriy Khulap 2018-02-23 15:19:01 UTC
Hello Mark,
I will investigate these cases later, but for now I can share some info about frameretrace usage.

I'm loading the trace and choosing frame 550, where the mortar should be visible on the ground. Clicking on different "time slots" makes it sometimes visible and sometimes not. Even clicking consecutively between the same 2 slots makes it appear and disappear. This happens with the last time slot also. I found no correlation or dependency here.

In frame 710, where the mortar was picked up, the mortar on the ground has the same behavior (sometimes renders, sometimes not), but the mortar in hands is always invisible.
Comment 11 Mark Janes 2018-02-23 18:15:58 UTC
It sounds like you are looking at the game/driver root cause for this bug.

Use "clear before render", "stop at render" and "highlight selected render" to find the render that draws the gun.

Attach the shaders for that render to this bug.

If you are able to understand the glsl, edit the shaders and see if you can change them in a way that causes the weapon to be rendered reliably.
Comment 12 Andriy Khulap 2018-02-27 15:33:24 UTC
Created attachment 137648 [details]
Vertex shader for the frame where mortar doesn't draw on map
Comment 13 Andriy Khulap 2018-02-27 15:34:35 UTC
Created attachment 137649 [details]
Fragment shader for the frame where mortar doesn't draw on map
Comment 14 Andriy Khulap 2018-02-27 15:45:56 UTC
I've found the renders where mortar is drawn for 3 cases:
- frame 550 where gun is on ground;
- frame 710 where gun is in hands;
- frame 710 where gun is in hands and already re-spawned on ground.

The shaders are identical for all cases and differ only in #define VERTEX and #define FRAGMENT, so I've attached only one pair of them.
I also dumped the shaders used in the whole trace (using export MESA_SHADER_DUMP_PATH) and found that their bodies are the same as well. The game uses the same big shader, which is configured for different cases using defines.

Unfortunately I haven't found the cause of the bug yet. I will continue investigating.
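
The "same body, different defines" check on a shader dump can be automated roughly as follows. This is a minimal sketch, not the method actually used above: the function names are hypothetical, nothing is assumed about how the files under MESA_SHADER_DUMP_PATH are named, and dropping every `#define` line is a simplification that also removes any defines inside the shader body itself.

```python
import hashlib
from pathlib import Path


def shader_body(source):
    """Return the shader text with #define lines and blank lines removed."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#define"):
            continue
        kept.append(stripped)
    return "\n".join(kept)


def group_by_body(shaders):
    """Map a body digest to the sorted names of the shaders sharing that body."""
    groups = {}
    for name, source in shaders.items():
        digest = hashlib.sha256(shader_body(source).encode("utf-8")).hexdigest()
        groups.setdefault(digest, []).append(name)
    return {digest: sorted(names) for digest, names in groups.items()}


def load_dump_dir(path):
    """Read every regular file in a dump directory into a name -> text dict."""
    return {
        p.name: p.read_text(errors="replace")
        for p in Path(path).iterdir()
        if p.is_file()
    }
```

Calling `group_by_body(load_dump_dir(dump_path))` on the dump directory should return a single group if every shader in the trace really is the same source configured by defines.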
Comment 15 Stuart Young 2018-02-27 22:18:05 UTC
Just a note here:

This issue doesn't just affect the Mortar weapon. It affects a bunch of other stuff in game, it's just that the Mortar is 100% reliable in tripping the issue.

The other cases all seem to occur when the object is too close to the player's viewpoint. This applies to things such as the new rocket launcher model and the player model of another player in-game.

I'm pretty sure that all the weapon models that are affected are animated in some way, which may or may not in some way be related.

PS: I'd love to help but my system is not in a stable or suitable state at the moment for producing anything reliable.
Comment 16 Stuart Young 2018-03-06 23:09:59 UTC
Ok, got my system running well enough to help debug this.

Created my own trace using the method I describe above, so the trace is smaller at 261M & 392 frames. The trace only focuses on the mortar model waiting for pickup on the ground, not in the player's hands.

System is Debian stretch with Kernel 4.14 and Mesa 17.3.6.

trace (xz compressed 143M): https://drive.google.com/file/d/1zLOabG3Ob9yzvdjgMT_E4tnDJVJrLniT/view

Frame 379: First appearance of location of mortar. Mortar rendered.
Frame 380: Mortar fails to render from this frame.

Looking at Frame 380 in FrameRetrace (thanks Mark), found a high spike where the Mortar should render in "N primitives entering clipping" and "N primitives leaving clipping".

From looking at the Metrics, there's no Fragment shader operations, so this is either the Vertex shader or something to do with Textures (which afaik FrameRetrace doesn't give much insight into).

Curiously, if I toggle (tick/untick) "Stop at Render" under the Render target tab, sometimes the preview renders the Mortar (sometimes when ticked, sometimes unticked), and sometimes it doesn't (mostly not).

Notes:
 The mortar model was revamped from the previous model. The new model is more complex and the textures are larger. This makes me wonder if the issue is related to large textures not being rendered properly (eg: not being loaded, etc). I do see a bunch of errors in qapitrace, but I personally have no idea whether any of these could be related or not as I don't really understand them. The Rocket launcher model (viewable in frame 378) was also updated but seems to be unaffected most of the time.
 There's a high spike for "N primitives entering clipping" and "N primitives leaving clipping" for the Rocket Launcher (frame 378) too, so this may be a false positive, but it definitely highlights the render frame for the weapon.
 Toggling "Stop at Render" for the Rocket Launcher in frame 378 always renders the rocket launcher.
 I could not get FrameRetrace to render frame 379. It goes through the process, but gives me back a blank report with no values or metrics. It's the first frame rendered at the new location, and where the mortar is initially visible. I can do this in qapitrace (and get a thumbnail), but then I don't get as useful diagnostics, especially so I can compare frame 379 and 380 to see what might have changed.
Comment 17 Stuart Young 2018-03-11 03:17:48 UTC
I've been able to use frameretrace & frameretrace_server to play my previously posted apitrace on a different machine.

The mortar is always rendered, so this to me genuinely seems like a mesa/intel bug, not a game bug.

Hardware/Software I used:
 Device: Toshiba Protege Z930 laptop
 CPU: Intel Core i7-3687U
 Distro: Debian buster
 Kernel: 4.14.0-3-amd64
 Mesa: 17.3.6
 OpenGL Renderer: Mesa DRI Intel Ivybridge Mobile

Note: While people have confirmed that the bug isn't present on older intel hardware (or nvidia/radeon), running an apitrace taken from buggy hardware and replaying it on working hardware doesn't appear to have been previously done.
Comment 18 Andriy Khulap 2018-03-12 07:24:50 UTC
Stuart, I did this at the time of Comment 8; sorry for not mentioning it.
The "buggy" apitrace recorded on Skylake was played without issues on Haswell.

BTW, the game logs with developer mode enabled were identical for these platforms (GLSL versions used... almost everything was the same except the timestamps).
For Gen 4, the game log shows major differences (such as GLSL #version 120 being used), so I didn't try the Skylake apitrace there.
Comment 19 Stuart Young 2018-03-12 11:28:20 UTC
Thanks Andriy. I obviously missed that.

I played back the trace I took on a different machine using non-intel based video (just using qapitrace) and it rendered the mortar as well.

Graphics:

Card: NVIDIA GF108 [GeForce GT 440]
Display Server: X.Org 1.18.4
Drivers: nvidia (unloaded: fbdev,vesa,nouveau)
GLX Renderer: GeForce GT 440/PCIe/SSE2
GLX Version: 4.5.0 NVIDIA 384.111
Comment 20 Stuart Young 2018-03-13 21:41:55 UTC
Further notes:

My trace was taken on the following hardware:

Graphics:
 Card: Intel Device 5912
 Display Server: X.Org 1.19.2 drivers: intel (unloaded: modesetting,fbdev,vesa)
 Resolution: 1920x1080@60.00hz
 GLX Renderer: Mesa DRI Intel HD Graphics 630 (Kaby Lake GT2)
 GLX Version: 3.0 Mesa 17.3.6

I have also replayed this trace on an old ATI Radeon machine running Debian Buster (with the non-free firmware-amd-graphics blob installed) and the mortar is rendered there also.

The ATI hardware:
Graphics:
 Card: Advanced Micro Devices [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series]
 Display Server: wayland (X.Org 1.19.6 )
 drivers: ati,vesa (unloaded: modesetting,fbdev,radeon)
 Resolution: 1920x1080@59.96hz
 OpenGL: renderer: AMD CEDAR (DRM 2.50.0 / 4.14.0-3-amd64, LLVM 5.0.1)
 version: 3.3 Mesa 17.3.6
Comment 21 Stuart Young 2018-03-27 07:02:32 UTC
Patch provided by Sergii Romantsov (posted to mesa-dev at https://lists.freedesktop.org/archives/mesa-dev/2018-March/190147.html ), patched against mesa 17.3.7 resolves all the graphical artefact issues for me.

This covers not only the mortar on the ground, but also the mortar in hand and the other issues with player models that vary depending on how far away they are.
Comment 22 Stuart Young 2018-03-28 22:30:24 UTC
While the Mortar issue is definitely resolved by the patch Sergii provided, I've thought of a few more tests regarding player models that I'm going to run over the next few days.

The player model issues appear to depend on specific distance, line of sight, and viewing angle, so I want to check whether this fixes them or whether they are a different bug.
Comment 23 Mark Janes 2018-03-29 05:21:55 UTC
It would be excellent to have a piglit test that exercises this issue.
Comment 24 Sergii Romantsov 2018-03-29 13:24:31 UTC
Created attachment 138423 [details]
Issue with some bot's head 1
Comment 25 Sergii Romantsov 2018-03-29 13:24:56 UTC
Created attachment 138424 [details]
Issue with some bot's head 2
Comment 26 Sergii Romantsov 2018-03-29 13:25:21 UTC
Created attachment 138425 [details]
Issue with some bot's head 3 config
Comment 27 Sergii Romantsov 2018-03-29 13:28:09 UTC
Created attachment 138426 [details]
Issue with some bot's head 4 trace
Comment 28 Sergii Romantsov 2018-03-29 13:33:22 UTC
Following up on Stuart's comment (https://bugs.freedesktop.org/show_bug.cgi?id=101408#c22), I investigated and observed the following:

0. Mesa has my patch applied: https://lists.freedesktop.org/archives/mesa-dev/2018-March/190147.html
1. Case 1: default configuration with no extra variables defined.
Result 1: almost everything works fine, except for bots that use the 'ignismasked' model. Almost anywhere on the map, a bot viewed from too far away has no 'head' (see Defaults_BotWithoutHead_1.jpg ('Issue with some bot's head 1') and Defaults_BotWithoutHead_2.jpg ('Issue with some bot's head 2')).
2. Case 2: to make the 'head' issue appear more often, do the following:
 2.1 Add these fields to the default configuration (~/.xonotic/data/config.cfg); they give all bots the problematic model/skin:
seta "sv_defaultcharacter" "1"
seta "sv_defaultplayermodel" "models/player/ignismasked.iqm"
 or just replace it with the attached file config.cfg ('Issue with some bot's head 3 config')
 2.2 Before launching Xonotic, export:
export MESA_DEBUG="flush"
 2.3 Run Xonotic (it can be 0.8.2):
xonotic-linux64-glx
Result 2: the issue is easy to catch with almost any bot.
3. Case 3: without my patch.
Result 3: the 'head' is not shown at all.

Also attached is a trace (with the patch): xonotic-linux64-glx_head.trace ('Issue with some bot's head 4 trace')

System:    Kernel: 4.13.0-37-generic x86_64 (64 bit gcc: 5.4.0)
           Desktop: Unity 7.4.0 (Gtk 3.18.9-1ubuntu3.3) Distro: Ubuntu 16.04 xenial
CPU:       Dual core Intel Core i7-7500U (-HT-MCP-) cache: 4096 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 11616
           clock speeds: max: 3500 MHz 1: 2900 MHz 2: 2900 MHz 3: 2900 MHz 4: 2900 MHz
Graphics:  Card-1: Intel Device 5916 bus-ID: 00:02.0
           GLX Renderer: Mesa DRI Intel HD Graphics 620 (Kaby Lake GT2)
           GLX Version: 3.0 Mesa 18.1.0-devel (git-1e36fe5dc4) Direct Rendering: Yes

------
On Skylake only 'Case 1' applies, and with 'Case 3' only some artifacts/color switches are present:
System:    Kernel: 4.15.0-12-generic x86_64 (64 bit gcc: 7.3.0)
           Desktop: Gnome 3.2.8 (Gtk 3.22.28-1ubuntu3.3) Distro: Ubuntu Bionic Beaver (development branch)
CPU:       Dual core Intel Core i3-6006U (-MT-MCP-) cache: 3072 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 7968
Graphics:  Card: Intel HD Graphics 520 bus-ID: 00:02.0
           OpenGL: renderer: Mesa DRI Intel HD Graphics 520 (Skylake GT2)
           GLX Version: 4.5 Mesa 18.1.0-devel (git-1e36fe5dc4) Direct Render: Yes

-----
So I would say that it's a very different issue from the Mortar rendering.
Comment 29 Stuart Young 2018-03-30 03:18:46 UTC
Hi All,

The issue with the ignismasked model in Xonotic appears to be a model bug and nothing at all to do with the other issues described.

Have had it confirmed that the ignismasked model issue (head disappearing) happens on other cards/platforms (Win 10 Pro 64, Nvidia GTX970) by one of the developers of Xonotic, so it's definitely not related to Mesa.
Comment 30 Stuart Young 2018-04-03 22:50:22 UTC
Tested latest version of the patch provided by Sergii here (in place of original patch): https://lists.freedesktop.org/archives/mesa-dev/2018-April/190767.html

This also solves the issue of the original bug (built against 17.3.7, patch moved to line 954 as original patch is against 18.0 tree).
Comment 31 Kenneth Graunke 2018-04-04 05:49:18 UTC
Fixed by the following commit in master:

commit 98b860e3115ff937152dbf4c843e1ecb9244734c
Author: Sergii Romantsov <sergii.romantsov@gmail.com>
Date:   Mon Apr 2 09:59:06 2018 +0300

    i965: Extend the negative 32-bit deltas to 64-bits
    
    Gen8+ use 48-bit address relocations so need to extend the sign
    to 64-bit return value. Without it we have higher bits zeroed
    and missing the negavive values.
    Haswell and older use 32-bit deltas so are unaffected by this issue.
    
    v2:
      used int32_t fucntion parameter instead of explicit type conversion.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
    Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
    Tested-by: Andriy Khulap <andriy.khulap@globallogic.com>
    Tested-by: Stuart Young <cefiar@gmail.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
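
The effect described in the commit message can be illustrated with plain integer arithmetic. The sketch below is not the Mesa code: the function names and the example address are hypothetical, and Python integers stand in for the driver's fixed-width C types by masking to 64 bits. It only shows why a negative 32-bit delta must be sign-extended before it is added to a wider address.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def zero_extend_delta(delta):
    """Widen a 32-bit delta as if it were unsigned: the upper 32 bits stay zero."""
    return delta & MASK32


def sign_extend_delta(delta):
    """Widen a signed 32-bit delta to a 64-bit two's-complement value."""
    low = delta & MASK32
    if low & 0x80000000:
        # Bit 31 is set, so the value is negative: propagate the sign upward.
        low -= 1 << 32
    return low & MASK64


def relocate(target_address, delta64):
    """Add a widened delta to a target address, wrapping at 64 bits."""
    return (target_address + delta64) & MASK64


if __name__ == "__main__":
    target = 0x100001000  # hypothetical address above the 4 GiB boundary
    print(hex(relocate(target, sign_extend_delta(-16))))
    print(hex(relocate(target, zero_extend_delta(-16))))
```

With a delta of -16 against the address 0x100001000, the sign-extended form yields 0x100000ff0 (16 bytes below the target), while the zero-extended form yields 0x200000ff0, which is nearly 4 GiB above it. When addresses and deltas are both 32 bits wide the wrap-around hides the difference, which is consistent with the note that Haswell and older are unaffected.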
