Bug 109330

Summary: GL_ALPHA_BITS set to non-zero with EGL_PLATFORM_GBM_MESA
Product: Mesa
Component: Drivers/DRI/nouveau
Version: unspecified
Hardware: Other
OS: All
Status: CLOSED NOTOURBUG
Severity: normal
Priority: medium
Reporter: emersion <contact>
Assignee: Nouveau Project <nouveau>
QA Contact: Nouveau Project <nouveau>
CC: bigras.bruno, wes

Description emersion 2019-01-12 13:59:45 UTC
When creating an EGL context with EGL_PLATFORM_GBM_MESA and EGL_ALPHA_SIZE=1, Nouveau produces completely transparent images when reading them back with glReadPixels. Note that my code works with other drivers (i915, amdgpu); only Nouveau is affected.

How to reproduce:

1. Create an EGL context with these attribs:

static const EGLint config_attribs[] = {
	EGL_RED_SIZE, 1,
	EGL_GREEN_SIZE, 1,
	EGL_BLUE_SIZE, 1,
	EGL_ALPHA_SIZE, 1,
	EGL_NONE,
};

Set platform to EGL_PLATFORM_GBM_MESA and visual to GBM_FORMAT_ARGB8888.
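
A minimal sketch of that setup, assuming an already-open DRM fd drm_fd (the variable names here are mine, not the reproducer's):

#include <gbm.h>
#include <EGL/egl.h>
#include <EGL/eglext.h>

struct gbm_device *gbm = gbm_create_device(drm_fd);

PFNEGLGETPLATFORMDISPLAYEXTPROC get_platform_display =
	(PFNEGLGETPLATFORMDISPLAYEXTPROC)eglGetProcAddress(
		"eglGetPlatformDisplayEXT");
EGLDisplay display = get_platform_display(EGL_PLATFORM_GBM_MESA, gbm, NULL);
eglInitialize(display, NULL, NULL);

/* Pick a config whose native visual matches GBM_FORMAT_ARGB8888. */
EGLConfig configs[64];
EGLint count = 0;
eglChooseConfig(display, config_attribs, configs, 64, &count);
EGLConfig config = NULL;
for (EGLint i = 0; i < count; ++i) {
	EGLint visual = 0;
	eglGetConfigAttrib(display, configs[i], EGL_NATIVE_VISUAL_ID, &visual);
	if (visual == GBM_FORMAT_ARGB8888) {
		config = configs[i];
		break;
	}
}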

2. Try to get the alpha size:

EGLint alpha_size = 0;
eglGetConfigAttrib(display, config, EGL_ALPHA_SIZE, &alpha_size);
GLint alpha_bits = 0;
glGetIntegerv(GL_ALPHA_BITS, &alpha_bits);

Both return 8, meaning an alpha channel is present.

3. Try to read pixels with glReadPixels and GL_BGRA_EXT. The resulting image's alpha channel is all zeroes, so every pixel is completely transparent. Opening the image in GIMP and inverting the alpha channel makes the image correct.
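
The readback is essentially the following (a sketch, not grim's actual code; width and height come from the surface):

/* GL_BGRA_EXT is provided by GL_EXT_read_format_bgra on GLES2. */
unsigned char *data = malloc((size_t)width * height * 4);
glReadPixels(0, 0, width, height, GL_BGRA_EXT, GL_UNSIGNED_BYTE, data);
/* On nouveau, every fourth byte (alpha) comes back as 0. */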

Software is wlroots [1] with grim [2]. To reproduce, compile both, start rootston, run grim. Let me know if you need more info.

[1]: https://github.com/swaywm/wlroots
[2]: https://github.com/emersion/grim
Comment 1 Ilia Mirkin 2019-01-13 00:29:28 UTC
A few questions:

1. Which hardware was this tested on? (lspci -nn -d 10de: should produce the requisite info)
2. Which Mesa version?
3. Is glReadPixels reading into a PBO or a client-side buffer?
Comment 2 emersion 2019-01-13 02:30:43 UTC
The downstream bug is at [1].

1. So far I've received bug reports for these cards:

NVIDIA Quadro FX 580
NVIDIA GT218M NVS 3100M
NVIDIA GT216M GeForce GT 320M

2. Mesa 18.2.6

3. glReadPixels is reading into a client-side buffer

[1]: https://github.com/swaywm/wlroots/issues/1438
Comment 3 Wesley Moore 2019-01-13 03:14:43 UTC
I'm seeing this issue on the following hardware too:

NVIDIA Corporation GP107 [GeForce GTX 1050] [10de:1c81] (rev a1)
Comment 4 Ilia Mirkin 2019-01-13 16:17:12 UTC
OK, so the 3 GPUs listed by emersion are all nv50, while the GP107 would be covered by the nvc0 driver backend. Unfortunately I have an nv42 plugged in ATM, but I'll have a look when I'm back on something a bit more modern.

If someone with C/C++ development experience and the requisite hardware is interested, feel free to join in #nouveau on irc.freenode.net and I can provide advice for things to investigate.

Lastly, if someone can provide an apitrace of the application doing glReadPixels and getting unexpected results, that would avoid the need to reproduce the software setup. I believe it's possible to look at glReadPixels results in qapitrace ... would have to double-check though.
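
For anyone unfamiliar with apitrace, capturing and inspecting a trace is roughly this (assuming grim is the program in question; the output filename is just an example):

apitrace trace grim screenshot.png
qapitrace grim.trace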

One guess is that we're copying from an RGBX surface to an RGBA surface and not whacking alpha to 1. However, that should generally work, so one would have to identify the precise code path where it does not.

Another guess is that the rendering really does produce an alpha of 0, either due to an earlier bug (unrelated to glReadPixels), or due to something specific to the drm formats exposed by nouveau and/or the order of GL configs, which causes something funny to happen.
Comment 5 Ilia Mirkin 2019-01-21 00:30:26 UTC
As per my comments on IRC, I think this is a wlroots bug: drm_connector_set_mode calls init_drm_plane_surfaces (which in turn creates a gbm surface) with GBM_FORMAT_XRGB8888. Flipping that to ARGB8888 resolves the issue.
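
In gbm terms, the fix boils down to something like this (a sketch of the described change, not the actual wlroots patch; gbm, width, and height come from context):

struct gbm_surface *surf = gbm_surface_create(gbm, width, height,
	GBM_FORMAT_ARGB8888, /* previously GBM_FORMAT_XRGB8888 */
	GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);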

As per my understanding, the config has to match the surface format, otherwise you get stuff like this. (At the extreme, imagine one of them said RGB10A2.)

The more detailed issue is that the winsys fb calls dri2_drm_image_get_buffers (via the getBuffers loader API call), which in turn uses the surface's format to create a backing bo. That format is XRGB8888, which becomes PIPE_FORMAT_BGRX8888_UNORM. The pipe_surface also has this format. However, the GL believes that the format should be MESA_FORMAT_BGRA8888 (since that's what the config said), and since that matches the glReadPixels format, it decides it can just do a memcpy.

Perhaps the pipe_surface should ignore the resource's format in this case, but that's not how the current logic flows. In st_framebuffer_validate, it calls u_surface_default_template which copies the format out of the given resource. AFAIK having mismatches between surface format and config format is really bad though, and not really supported. But I'm not a gbm API expert.
Comment 6 Daniel Stone 2019-01-22 13:15:46 UTC
Yeah, I think it's reasonable. The render target is XRGB8888, and X really does mean undefined, including 'throw content away on render' and 'invent fake content on sample'.

I'd suggest allocating the gbm_surface as GBM_FORMAT_ARGB8888, then taking those BOs and passing DRM_FORMAT_XRGB8888 to AddFB2. This is totally safe to do, and will ensure that drivers preserve the alpha channel when you write out.
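
Something along these lines (a sketch assuming a single-plane, zero-offset buffer; the variable names are illustrative):

#include <xf86drmMode.h>
#include <drm_fourcc.h>

/* Render into an ARGB8888 surface so the driver preserves alpha... */
struct gbm_bo *bo = gbm_surface_lock_front_buffer(surf);
uint32_t handles[4] = { gbm_bo_get_handle(bo).u32 };
uint32_t pitches[4] = { gbm_bo_get_stride(bo) };
uint32_t offsets[4] = { 0 };
uint32_t fb_id = 0;
/* ...but tell KMS the framebuffer is XRGB8888 so scanout ignores it. */
drmModeAddFB2(drm_fd, width, height, DRM_FORMAT_XRGB8888,
	handles, pitches, offsets, &fb_id, 0);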

(I was going to mention matching EGL_NATIVE_VISUAL_ID as well, given your short example, but you already do that properly - nice!)
Comment 7 emersion 2019-01-22 13:19:13 UTC
Thanks for the explanation, this makes sense. We'll fix this in wlroots.

Again, sorry for filing an invalid bug!
