Bug 108355 - Civilization VI - Artifacts in mouse cursor
Summary: Civilization VI - Artifacts in mouse cursor
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/AMDgpu
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: xf86-video-ati maintainers
QA Contact: Xorg Project Team
Reported: 2018-10-13 22:34 UTC by Hadrien Nilsson
Modified: 2018-11-27 11:26 UTC

Attachments
expected and actual cursor texture (57.04 KB, image/jpeg)
2018-10-13 22:34 UTC, Hadrien Nilsson
dmesg after reboot (69.29 KB, text/plain)
2018-10-16 20:26 UTC, Hadrien Nilsson
Xorg log after reboot and started briefly the game in windowed mode (73.46 KB, text/plain)
2018-10-16 20:28 UTC, Hadrien Nilsson
Xorg log with amdgpu_drv 18.1.0 (69.67 KB, text/plain)
2018-10-18 19:07 UTC, Hadrien Nilsson
Detect and fix up non-premultiplied cursor data (4.18 KB, patch)
2018-10-19 09:18 UTC, Michel Dänzer

Description Hadrien Nilsson 2018-10-13 22:34:14 UTC
Created attachment 142017 [details]
expected and actual cursor texture

I'm on Ubuntu 18.04.1 with a RX 480 and Mesa 18.2.2

The mouse cursor texture is not correctly displayed. There are several artifacts, like a left vertical missing line, bright magenta, green and cyan pixels (see the attached image).

uname -a:

Linux c18 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


glxinfo -B:

name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org (0x1002)
    Device: AMD Radeon (TM) RX 480 Graphics (POLARIS10, DRM 3.23.0, 4.15.0-36-generic, LLVM 7.0.0) (0x67df)
    Version: 18.2.2
    Accelerated: yes
    Video memory: 8192MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.5
    Max compat profile version: 4.4
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
    VBO free memory - total: 7781 MB, largest block: 7781 MB
    VBO free aux. memory - total: 8176 MB, largest block: 8176 MB
    Texture free memory - total: 7781 MB, largest block: 7781 MB
    Texture free aux. memory - total: 8176 MB, largest block: 8176 MB
    Renderbuffer free memory - total: 7781 MB, largest block: 7781 MB
    Renderbuffer free aux. memory - total: 8176 MB, largest block: 8176 MB
Memory info (GL_NVX_gpu_memory_info):
    Dedicated video memory: 8192 MB
    Total available memory: 16384 MB
    Currently available dedicated video memory: 7781 MB
OpenGL vendor string: X.Org
OpenGL renderer string: AMD Radeon (TM) RX 480 Graphics (POLARIS10, DRM 3.23.0, 4.15.0-36-generic, LLVM 7.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.2.2 - padoka PPA
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.4 (Compatibility Profile) Mesa 18.2.2 - padoka PPA
OpenGL shading language version string: 4.40
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 18.2.2 - padoka PPA
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
Comment 1 Hadrien Nilsson 2018-10-16 05:56:01 UTC
I am starting to think I'm not in the right Bugzilla section, as OpenGL does not handle mouse cursors. Sorry if that is the case; I would love to know which component of the graphics stack is actually at fault.

I do not know if this is an SDL, X11, drm, amdgpu or hardware problem, or the game itself (some kind of surface corruption). However, the GNOME screenshot program is able to retrieve the cursor image correctly, as shown in the attachment.

I wrote a small SDL program that changes my mouse cursor, as this is what Civ6 seems to use, but everything works fine.

I contacted Aspyr support but their response was a dead end: "AMD and Intel GPUs aren't supported".

I tried to make the game use my own compiled SDL version with no luck. Steam games seem to use some kind of sandbox, LD_PRELOAD seems to be ignored. Or the game may not actually use SDL for the mouse cursor though the related symbols are referenced in the executable.
Comment 2 Michel Dänzer 2018-10-16 10:48:47 UTC
Please attach the corresponding Xorg log file and output of dmesg.
Comment 3 Hadrien Nilsson 2018-10-16 20:26:20 UTC
Created attachment 142058 [details]
dmesg after reboot
Comment 4 Hadrien Nilsson 2018-10-16 20:28:05 UTC
Created attachment 142059 [details]
Xorg log after reboot and started briefly the game in windowed mode
Comment 5 Hadrien Nilsson 2018-10-16 20:36:57 UTC
I tried to run the game under a Wayland session instead of X11 and the mouse cursor is normal.
Comment 6 Michel Dänzer 2018-10-17 07:25:01 UTC
Does it still happen with xf86-video-amdgpu 18.1.0?

Does amdgpu.dc=0 on the kernel command line avoid the problem?
Comment 7 Hadrien Nilsson 2018-10-17 20:02:11 UTC
amdgpu.dc=0 had no effect, but using xf86-video-amdgpu 18.1.0 indeed fixed the problem :) Thank you Michel.

Hopefully that new version will be released somehow for my Linux distribution.

I still do not know if the mouse cursor is displayed as intended (there is a shadow which seems to use additive blending instead of alpha blending) but at least there are no more artifacts.

I guess I should change the Bugzilla Product to "xorg"?
Comment 8 Michel Dänzer 2018-10-18 08:05:17 UTC
(In reply to Hadrien Nilsson from comment #7)
> amdgpu.dc=0 had no effect, but using xf86-video-amdgpu 18.1.0 indeed fixed
> the problem :) Thank you Michel.

Well, that's a surprising result though. Please attach a new Xorg log file from xf86-video-amdgpu 18.1.0.


> I still do not know if the mouse cursor is displayed as intended (there is a
> shadow which seems to use additive blending instead of alpha blending) but
> at least there are no more artifacts.

For comparison, you can disable the HW cursor with

 Option "SWcursor"

in the /etc/X11/xorg.conf Section "Device".
Comment 9 Hadrien Nilsson 2018-10-18 19:07:30 UTC
Created attachment 142083 [details]
Xorg log with amdgpu_drv 18.1.0

I think you're right to be doubtful. I looked at the log and "amdgpu_drv.so" 18.1 was not loaded as it's not ABI compatible with the rest of my Xorg installation. So when the bug disappeared I guess I was running in some kind of fallback mode that does not have the mouse cursor bug :/

Ubuntu 18.10 will be released soon, so maybe I could wait for that version. Xorg code will probably be updated then.
Comment 10 Michel Dänzer 2018-10-19 09:18:26 UTC
Created attachment 142088 [details] [review]
Detect and fix up non-premultiplied cursor data

(In reply to Hadrien Nilsson from comment #9)
> I looked at the log and "amdgpu_drv.so" 18.1 was not loaded as it's not ABI
> compatible with the rest of my Xorg installation.

That's because the driver was compiled for Xorg 1.20, whereas you're using 1.19. Simply recompiling the newer driver against your local xserver-xorg-dev should overcome this.

Anyway, given the symptoms, the fundamental issue here is probably that the game uses incorrect cursor data without premultiplied alpha.

However, I guess we need to be robust against this. Can you try the attached xf86-video-amdgpu patch?
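
For illustration, here is a minimal sketch of how non-premultiplied cursor data could be detected. It is not the attached patch; the function names are hypothetical. It relies only on the fact that in premultiplied ARGB no colour channel can exceed the alpha value, so a pixel that violates this cannot be premultiplied. The check is a heuristic: straight-alpha data can happen to pass it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: in premultiplied ARGB8888, R, G and B are each
 * scaled by alpha, so none of them can be larger than alpha. */
static bool pixel_looks_premultiplied(uint32_t argb)
{
	uint32_t a = argb >> 24;
	uint32_t r = (argb >> 16) & 0xff;
	uint32_t g = (argb >> 8) & 0xff;
	uint32_t b = argb & 0xff;

	return r <= a && g <= a && b <= a;
}

/* Hypothetical helper: returns false as soon as one pixel proves that
 * the image cannot be premultiplied. A true result does not prove the
 * opposite; it only means no contradiction was found. */
static bool image_looks_premultiplied(const uint32_t *image, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (!pixel_looks_premultiplied(image[i]))
			return false;
	}
	return true;
}
```

A driver that finds a contradicting pixel could then multiply the colour channels by alpha itself before handing the image to the hardware.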
Comment 11 Hadrien Nilsson 2018-10-24 19:41:46 UTC
Thank you for the patch, Michel.

I spent the last few days trying to understand the situation about color mouse cursors and I think it is rather confusing.

I extracted the mouse cursor asset from Civ6 and I could reproduce the same problem with my SDL test program. When I multiplied the R, G and B components with the Alpha, the blending was correct. So I think you are right about the symptoms.
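
The conversion described above (multiplying R, G and B by alpha) can be sketched as follows. This is not the code of the test program, which is not attached here; the function names are hypothetical and the pixel layout is assumed to be ARGB8888 with alpha in the top byte. The division truncates rather than rounds, which is good enough for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert one straight-alpha ARGB8888 pixel to
 * premultiplied alpha by scaling each colour channel by alpha/255. */
static uint32_t premultiply_argb(uint32_t argb)
{
	uint32_t a = argb >> 24;
	uint32_t r = ((argb >> 16) & 0xff) * a / 255;
	uint32_t g = ((argb >> 8) & 0xff) * a / 255;
	uint32_t b = (argb & 0xff) * a / 255;

	return a << 24 | r << 16 | g << 8 | b;
}

/* Hypothetical helper: convert a whole cursor image in place before
 * passing it to a cursor API that expects premultiplied data. */
static void premultiply_image(uint32_t *pixels, size_t count)
{
	for (size_t i = 0; i < count; i++)
		pixels[i] = premultiply_argb(pixels[i]);
}
```

Fully opaque pixels are unchanged and fully transparent pixels become zero, so only translucent areas such as the cursor shadow are affected.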

I tried your patch and the blending was correct in both my SDL program and the game.

Now the question is, what should be the correct behavior? How should the RGBA data be provided, premultiplied or straight? I tried to find specifications, alas:

 - the SDL documentation gives no clue about it;
 - neither does the XcursorImage* documentation;
 - nor does wl_pointer_set_cursor.

I even looked at Windows. No clue there either, with the exception of what I think is a low-level function, DrvSetPointerShape, whose documentation finally gives information about the pixel format: the RGB data should indeed be alpha premultiplied.

As the documentation is not very clear, I had to test implementations. Here are my results with different combinations:

 - Xorg + amdgpu             : premultiplied ok, straight wrong
 - Xorg + intel              : premultiplied ok, straight wrong
 - Xorg + proprietary nVidia : premultiplied ok, straight ok
 - wayland + amdgpu          : premultiplied ok, straight wrong
 - wayland + intel           : premultiplied ok, straight wrong
 - Windows + amdgpu          : premultiplied ok, straight ok
 - Windows + intel           : premultiplied ok, straight ok

So it looks like Windows with any graphics card is able to detect a wrong format (straight alpha) and fix it, as you did in your patch. And also nVidia proprietary X11 drivers have the same workaround. I think it could explain why Aspyr only supports nVidia as nVidia drivers will fix wrong data for them.

I'm very happy your patch fixes the Xorg + amdgpu combination, but isn't there a larger problem here? Shouldn't such a workaround also be done in a centralized part of the graphics stack (SDL, Xcursor, Wayland?) so that any combination could benefit from it?
Comment 12 Alex Deucher 2018-10-24 19:55:31 UTC
X uses pre-multiplied, Windows uses non-pre-multiplied (or at least used to, not sure if this changed).  Not all games ported from Windows fix this up, so the driver attempts to detect and fix this.
Comment 13 Michel Dänzer 2018-10-25 08:09:15 UTC
Thanks Hadrien for the thorough testing!

AFAIK Linux KMS only supports premultiplied alpha for cursors.

I'll also submit a corresponding change for the xserver repository, which should cover the remaining Linux cases and reduce the overhead of the workaround even with this driver.
Comment 14 Hadrien Nilsson 2018-10-25 18:57:12 UTC
(In reply to Michel Dänzer from comment #13)
> I'll also submit a corresponding change for the xserver repository, which
> should cover the remaining Linux cases and reduce the overhead of the
> workaround even with this driver.

That would be great, thanks Michel.

Regarding Wayland, I think that's a subject for another day and another bug ticket (maybe not even on freedesktop.org? I still need to understand how Wayland compositors are made).
Comment 15 Michel Dänzer 2018-10-26 09:18:17 UTC
(In reply to Hadrien Nilsson from comment #14)
> That would be great, thanks Michel.

Landed already: https://gitlab.freedesktop.org/xorg/xserver/commit/b45c74f0f2868689e7ed695b33e8c60cd378df0b


> Regarding Wayland, I think that's a subject for another day [...]

The above is active in Xwayland as well, so I think it should cover Wayland. Native Wayland apps have to use premultiplied alpha for cursors.
Comment 16 Hadrien Nilsson 2018-10-26 18:01:49 UTC
(In reply to Michel Dänzer from comment #15)
> Landed already:
> https://gitlab.freedesktop.org/xorg/xserver/commit/
> b45c74f0f2868689e7ed695b33e8c60cd378df0b

Amazing :)

> Native Wayland apps have to use premultiplied alpha for cursors.

Thank you for confirming the information. I did not find it in the documentation, hopefully it can be improved.
Comment 17 Michel Dänzer 2018-11-15 15:48:58 UTC
Unfortunately, I had to revert these changes due to bug 108650. Please test https://gitlab.freedesktop.org/daenzer/xf86-video-amdgpu/commit/d4f250aecabd07a524bf6df3d8eb9e00e589dcf0 instead.
Comment 18 Hadrien Nilsson 2018-11-15 21:04:16 UTC
(In reply to Michel Dänzer from comment #17)
> Unfortunately, I had to revert these changes due to bug 108650. Please test
> https://gitlab.freedesktop.org/daenzer/xf86-video-amdgpu/commit/
> d4f250aecabd07a524bf6df3d8eb9e00e589dcf0 instead.

It's sad the situation is so confusing. Maybe a new mouse cursor API could be created that accepts normal alpha, as this is the common way to store it (in PNG files for example)? With a conversion to what the hardware expects, of course. It looks like a lot of developers are unable to correctly use an API that expects premultiplied alpha, with the additional difficulty that the documentation does not say it explicitly.

Regarding Wayland, I opened a ticket to ask for a documentation update, as an attempt to avoid the same confusion we have now with X. Hopefully the documentation maintainer will be sensitive to the problem.

Anyway, I tested your branch (cursor-unpremultiply-overflow-no-gamma) and unfortunately it freezes my system as soon as the Civilization VI cursor is about to be shown.
Comment 19 Michel Dänzer 2018-11-16 08:51:50 UTC
(In reply to Hadrien Nilsson from comment #18)
> Anyway, I tested your branch (cursor-unpremultiply-overflow-no-gamma) and
> unfortunately it freezes my system as soon as the Civilization VI cursor is
> about to be shown.

Weird. Can you still access the system via SSH? Does Xorg die (if so, please get the Xorg log file and/or its stderr output), or enter an infinite loop (if so, can you attach gdb to it and get a backtrace)?
Comment 20 Hadrien Nilsson 2018-11-17 17:31:51 UTC
(In reply to Michel Dänzer from comment #19)
> (In reply to Hadrien Nilsson from comment #18)
> > Anyway, I tested your branch (cursor-unpremultiply-overflow-no-gamma) and
> > unfortunately it freezes my system as soon as the Civilization VI cursor is
> > about to be shown.
> 
> Weird. Can you still access the system via SSH? Does Xorg die (if so, please
> get the Xorg log file and/or its stderr output), or enter an infinite loop
> (if so, can you attach gdb to it and get a backtrace)?

I did some tests today: I can still access my computer from another one via SSH. Xorg is running at 100% CPU. I could pause the process with gdb. The process is stuck in drmmode_do_load_cursor_argb(), in this loop in particular:

retry:
		for (i = 0; i < cursor_size; i++) {
			argb = image[i];
			if (!drmmode_cursor_pixel(crtc, &argb, &premultiplied,
						  &apply_gamma))
				goto retry;

			ptr[i] = cpu_to_le32(argb);
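
For illustration only, here is a self-contained sketch of the restart pattern used by a loop of this shape. It is not the driver code, the actual mistake is not described in this report, and every name below is hypothetical. The point it shows is general: a "goto retry" loop terminates only if the per-pixel function changes some state before it returns false, so that the same pixel cannot fail again on the next pass.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-pixel check. It returns false when the current
 * assumption has to be dropped and the image processed again. */
static bool check_pixel(uint32_t argb, bool *assume_premultiplied)
{
	uint32_t a = argb >> 24;
	uint32_t r = (argb >> 16) & 0xff;
	uint32_t g = (argb >> 8) & 0xff;
	uint32_t b = argb & 0xff;

	if (*assume_premultiplied && (r > a || g > a || b > a)) {
		/* State change: this branch cannot be taken again, which
		 * is what bounds the number of restarts. */
		*assume_premultiplied = false;
		return false;
	}
	return true;
}

/* Hypothetical conversion loop: copies the image and returns how many
 * passes were needed (1 or 2 with the check above). */
static unsigned copy_cursor(const uint32_t *image, uint32_t *out,
			    size_t count)
{
	bool assume_premultiplied = true;
	unsigned passes = 0;
	size_t i;

retry:
	passes++;
	for (i = 0; i < count; i++) {
		if (!check_pixel(image[i], &assume_premultiplied))
			goto retry;
		out[i] = image[i];
	}
	return passes;
}
```

If the check returned false without clearing the flag, every pass would fail on the same pixel and the loop would spin forever at 100% CPU, which matches the symptom but is only one possible cause.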
Comment 21 Michel Dänzer 2018-11-20 09:23:03 UTC
Ah, I made a silly mistake there. :) https://gitlab.freedesktop.org/daenzer/xf86-video-amdgpu/commits/cursor-unpremultiply-overflow-no-gamma updated, please test again.
Comment 22 Hadrien Nilsson 2018-11-20 21:19:08 UTC
It does not freeze anymore :)

I do not get the strange saturated RGB artifacts, however the blending is back to the wrong one, and Gnome night mode does not apply anymore. With the previous patch (the one that unfortunately led to regression in some games), the blending was perfect and Gnome night mode was correctly applied.

I wonder how the nVidia proprietary driver does in these cases.
Comment 23 Michel Dänzer 2018-11-21 09:28:14 UTC
(In reply to Hadrien Nilsson from comment #22)
> I do not get the strange saturated RGB artifacts, however the blending is
> back to the wrong one, and Gnome night mode does not apply anymore.

Yep. The only purpose of this patch is to prevent the artifacts (which occurred due to reading beyond the end of the gamma LUT).
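
As a sketch of why such data can read beyond the end of a gamma LUT: applying gamma to a premultiplied channel means dividing by alpha first, looking up the result, and multiplying by alpha again. If a channel is larger than alpha (that is, the data was not really premultiplied), the division yields an index above 255. The code below is a simplified illustration with hypothetical names and an assumed 256-entry, 16-bit LUT; it guards per channel, whereas the actual fix skips gamma correction of the cursor data.

```c
#include <stdint.h>

#define LUT_SIZE 256

/* Hypothetical helper: undo premultiplication of one channel. The
 * result is larger than 255 whenever channel > alpha. The caller must
 * ensure alpha != 0. */
static uint32_t unpremultiply_channel(uint32_t channel, uint32_t alpha)
{
	return channel * 255 / alpha;
}

/* Hypothetical helper: gamma-correct one premultiplied channel using a
 * 16-bit LUT, leaving it untouched when the LUT index would be out of
 * range. */
static uint32_t gamma_correct_channel(const uint16_t lut[LUT_SIZE],
				      uint32_t channel, uint32_t alpha)
{
	uint32_t idx;

	if (alpha == 0)
		return 0;
	if (channel > alpha)
		return channel; /* index would be >= LUT_SIZE: skip */

	idx = unpremultiply_channel(channel, alpha);
	/* Use the upper 8 bits of the LUT entry, then premultiply again. */
	return (lut[idx] >> 8) * alpha / 255;
}
```

For example, a channel value of 255 with alpha 128 gives an index of 508, far beyond a 256-entry table, so whatever memory follows the LUT would be read as colour data.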

> With the previous patch (the one that unfortunately led to regression in some
> games), the blending was perfect and Gnome night mode was correctly applied.
> 
> I wonder how the nVidia proprietary driver does in these cases.

Does the Civilization VI cursor work correctly with it? If so, maybe it has a quirk database for broken applications. I don't know any other way to work around Civilization VI without breaking other apps at this point.
Comment 24 Michel Dänzer 2018-11-21 09:38:52 UTC
Another possibility might be that the nvidia driver incorrectly interprets all cursor data as non-premultiplied alpha. This would result in translucent parts of most cursor images being displayed slightly incorrectly (more translucent than intended), but this might be less noticeable than non-premultiplied alpha being interpreted as premultiplied alpha (which results in over-bright colours).
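
The two kinds of misinterpretation can be compared with a small worked example. The functions below are a sketch with hypothetical names; they blend a single 8-bit channel over a background and deliberately do not clamp, so that an over-bright result is visible as a value above 255.

```c
#include <stdint.h>

/* Blend assuming src is premultiplied: out = src + dst * (1 - alpha). */
static uint32_t blend_premultiplied(uint32_t src, uint32_t alpha,
				    uint32_t dst)
{
	return src + dst * (255 - alpha) / 255;
}

/* Blend assuming src is straight: out = src * alpha + dst * (1 - alpha). */
static uint32_t blend_straight(uint32_t src, uint32_t alpha, uint32_t dst)
{
	return (src * alpha + dst * (255 - alpha)) / 255;
}
```

Take a half-transparent white pixel (alpha 128) over a background value of 200. Stored as straight alpha the channel is 255; blended correctly it gives 227, but blended as if premultiplied it gives 354, which overflows the 8-bit range (over-bright). Stored as premultiplied alpha the channel is 128; blended correctly it also gives 227, but blended as if straight it gives 163, which is too dark but still in range, and therefore less noticeable.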
Comment 25 Hadrien Nilsson 2018-11-21 18:51:04 UTC
(In reply to Michel Dänzer from comment #23)
> Does the Civilization VI cursor work correctly with it?

The SDL application I wrote gave the same result (good blending) on the nVidia hardware I tested on Linux, regardless of the alpha mode.

> If so, maybe it has a quirk database for broken applications.
> I don't know any other way to work around Civilization VI without
> breaking other apps at this point.

Maybe. I have heard that this is something nVidia does: having specific code paths for specific applications.

It looks like the only thing that can be done at this point is to ask for better documentation in X, Wayland and SDL to warn future developers about these aspects.

I have little hope regarding Aspyr if I ask them to fix their code (they will probably say again that Intel and AMD are not supported).
Comment 26 Michel Dänzer 2018-11-27 11:26:40 UTC
Thanks for the report and testing. The artifacts are fixed in Git master:

commit 13c94a373b4858a2d2aa14c22b5f98d53c84c0d9
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Thu Nov 15 16:40:46 2018 +0100

    Skip gamma correction of cursor data if premultiplied R/G/B > alpha

