Bug 110783

Summary: Mesa 19.1 rc crashing MPV with VAAPI
Product: Mesa
Reporter: AngryPenguin <angrypenguinpoland>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact: Default DRI bug account <dri-devel>
Severity: major
Priority: medium
CC: alexander, bero, egorov_egor, gw.fossdev, mail, orzel
Version: 19.1
Hardware: x86-64 (AMD64)
OS: Linux (All)
URL: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1084
Whiteboard:
i915 platform:
i915 features:
Attachments: xorg, inxi

Description AngryPenguin 2019-05-28 13:11:16 UTC
Created attachment 144362: xorg

Hi.

After upgrading Mesa from stable 19.0.x to 19.1rc, playing a video in MPV with VAAPI crashes MPV with this error:

Playing: /VID_20150204_135614.m4v
 (+) Video --vid=1 (*) (mpeg4 17.570fps)
 (+) Audio --aid=1 --alang=eng (*) (amr_nb 1ch 8000Hz)
AO: [pulse] 8000Hz mono 1ch float
VO: [vdpau] 640x480 yuv420p
[vo/vdpau] Compositing window manager detected. Assuming timing info is inaccurate.
EE ../src/gallium/drivers/r600/r600_shader.c:4290 tgsi_unsupported - DIV tgsi opcode unsupported
EE ../src/gallium/drivers/r600/r600_shader.c:185 r600_pipe_shader_create - translation from TGSI failed !
EE ../src/gallium/drivers/r600/r600_state_common.c:879 r600_shader_select - Failed to build shader variant (type=5) -22
EE ../src/gallium/drivers/r600/r600_shader.c:4290 tgsi_unsupported - DIV tgsi opcode unsupported
EE ../src/gallium/drivers/r600/r600_shader.c:185 r600_pipe_shader_create - translation from TGSI failed !
EE ../src/gallium/drivers/r600/r600_state_common.c:879 r600_shader_select - Failed to build shader variant (type=5) -22

Segmentation fault (core dumped) <- (last line translated from the localized message)

It worked fine on Mesa 19.0.x; it began crashing after the upgrade to Mesa 19.1rc2 and 19.1rc3.
I tried libva 2.4.0 and libva 2.4.1, with the same results.

MPV crashes with the default launch options, and also with --vo=vaapi or --vo=vdpau.
It does not crash with --vo=gpu or --vo=xv.

It is worth adding that other video players, such as VLC, do not crash with VAAPI. Only MPV does.

Details:
Linux OpenMandriva Lx 4 x86_64
Kernel 5.1.5
Mesa 19.1rc2/3
libva 2.4.0/2.4.1

Tested on two computers with R600: a PC with a Radeon HD5850 and a notebook with a Radeon HD5650m.

The Xorg log is attached.
Comment 1 AngryPenguin 2019-05-28 13:11:55 UTC
Created attachment 144363: inxi
Comment 2 Matt Turner 2019-06-10 21:47:15 UTC
Looks like this was reported as a Gentoo bug as well (https://bugs.gentoo.org/686252)

Have you tried bisecting?
Comment 3 Gert Wollny 2019-06-11 07:06:58 UTC
R600 doesn't implement TGSI_OPCODE_DIV, and in the Gentoo bug this is the one reported as being triggered. I think for GLSL this is lowered, so maybe there is some compiler option missing in the vdpau state tracker.
Comment 4 Gert Wollny 2019-06-12 14:24:08 UTC
@AngryPenguin A closer look shows that the bicubic filter in gallium/auxiliary/vl issues TGSI code that contains a DIV operation.

Could you try this tree: 

https://gitlab.freedesktop.org/gerddie/mesa/tree/vl-fix-DIV

You can also just apply the last commit, thanks.
Comment 5 Gert Wollny 2019-06-12 15:01:02 UTC
No, this doesn't fix the bug; there are other instances where a DIV is introduced.
Comment 6 Gert Wollny 2019-06-12 15:16:54 UTC
The TGSI shaders with DIV were introduced with
  f6ac0b5d7187
   gallium/auxiliary/vl: Add compute shader to support video compositor render

and the use of the shaders was enabled with 
  9364d66cb7f7
    gallium/auxiliary/vl: Add video compositor compute shader render

The simplest approach is probably to add the lowering to RCP + MUL in the GLSL-TO-TGSI stage.
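
To illustrate the lowering mentioned here: "DIV dst, a, b" can be rewritten as an RCP on b followed by a MUL with a. The following is only a minimal C sketch of that arithmetic on scalar floats. The names op_rcp, op_mul and lowered_div are hypothetical and exist only for this illustration; they are not Mesa functions.

```c
#include <assert.h>

/* Hypothetical stand-in for the RCP opcode: reciprocal of x. */
static float op_rcp(float x)
{
    return 1.0f / x;
}

/* Hypothetical stand-in for the MUL opcode. */
static float op_mul(float a, float b)
{
    return a * b;
}

/* "DIV dst, a, b" rewritten as "RCP tmp, b" followed by "MUL dst, a, tmp". */
static float lowered_div(float a, float b)
{
    float tmp = op_rcp(b);
    return op_mul(a, tmp);
}

/* Small self-check: the lowered form matches a / b up to rounding. */
static void check_lowered_div(void)
{
    float diff = lowered_div(6.0f, 3.0f) - 6.0f / 3.0f;

    assert(diff > -1e-6f && diff < 1e-6f);
    assert(lowered_div(1.0f, 4.0f) == 0.25f);
}
```

The lowered result can differ from a true division in the last bit, because the reciprocal is rounded before the multiply.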
Comment 7 Gert Wollny 2019-06-12 17:25:15 UTC
This is a very deep rabbit hole: not only does r600 not support DIV, it also doesn't support TEX_LZ, which is used by these compute shaders, and Evergreen-class hardware doesn't support more than one target swizzle for the destinations with RCP, so the shader is even more broken. I think the best option for now is to disable this shader for this hardware.
Comment 8 Ilia Mirkin 2019-06-12 17:31:24 UTC
(In reply to Gert Wollny from comment #7)
> This is a very deep rabbit hole: Not only does r600 not support DIV, it also
> doesn't support TEX_LZ that is used by these compute shaders and Evergreen
> class hardware doesn't support more than one target swizzle for the
> destinations with RCP  so that the shader is even more broken. I think the
> best option now will be to disable this shader for now for this hardware.

The state tracker has to respect PIPE_CAPs. If the driver doesn't say it has DIV or TEX_LZ, then those shouldn't be used.
Comment 9 Gert Wollny 2019-06-12 18:42:31 UTC
Indeed, currently the code only tests whether compute shaders are supported, and DIV and TEX_LZ don't have any caps yet. I guess I'll take it upon myself to add these caps.
Comment 10 Christian König 2019-06-12 19:31:29 UTC
(In reply to Gert Wollny from comment #9)
> Indeed, currently the code only tests whether compute shaders are supported,
> and DIV and TEX_LZ don't have any caps yet. I guess I'll take it upon
> myself to add these caps.

Well, the key point is probably rather that we should not use compute shaders on that hw generation in the first place.

How did that get enabled?
Comment 11 Ilia Mirkin 2019-06-13 02:22:52 UTC
(In reply to Christian König from comment #10)
> (In reply to Gert Wollny from comment #9)
> > Indeed, currently the code only tests whether compute shaders are supported,
> > and DIV and TEX_LZ don't have any caps yet. I guess I'll take it upon
> > myself to add these caps.
> 
> Well the key point is probably rather that we should not use compute shaders
> on that hw generation in the first place.
> 
> How did that get enabled?

It just checks for PIPE_CAP_COMPUTE (which *is* supported by r600). The change was targeted at radeonsi. I had to disable PIPE_CAP_COMPUTE on nv50 to work around this as well, but there it wasn't _really_ supported very well so it was OK.
Comment 12 Gert Wollny 2019-06-13 07:35:35 UTC
I might add that the DIV is lowered in glsl to RCP+MUL before it is translated to TGSI, so no need for it there. 

When I look at the bicubic shader with the offending opcodes, I have to say that using DIV there is a bit lazy, because the DIVs act on constants or uniforms or combinations of these, and it would probably be better to pass their reciprocal values in (I just don't know (yet) where the constants are actually passed in to change this).

Anyway, since there is already a CAP for TEX_LZ I was able to create a simple fix: 
   https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1084
Comment 13 Ilia Mirkin 2019-06-13 15:46:21 UTC
(In reply to Gert Wollny from comment #12)
> Anyway, since there is already a CAP for TEX_LZ I was able to create a
> simple fix: 
>    https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1084

TEX_LZ == TXL with LOD = 0. You could rather easily make the TGSI code that generates TEX_LZ optionally use TXL with LOD = 0 (which has to go into coord.w).
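
As a rough illustration of that substitution, the sketch below prepares the coordinate operand for a TXL-style lookup so that it samples level zero, with the LOD carried in the .w component. The struct and function names are hypothetical and are not Mesa types or helpers.

```c
#include <assert.h>

/* Hypothetical 4-component coordinate; not a Mesa type. */
struct sketch_vec4 {
    float x, y, z, w;
};

/*
 * Build the source operand for a TXL-style lookup that behaves like TEX_LZ:
 * keep the texture coordinate and force the LOD, carried in .w, to zero.
 */
static struct sketch_vec4 txl_coord_for_level_zero(struct sketch_vec4 coord)
{
    coord.w = 0.0f;
    return coord;
}

/* Small self-check: x, y, z are preserved and the LOD slot is zeroed. */
static void check_txl_coord(void)
{
    struct sketch_vec4 in = { 0.25f, 0.75f, 0.0f, 3.0f };
    struct sketch_vec4 out = txl_coord_for_level_zero(in);

    assert(out.x == 0.25f && out.y == 0.75f && out.z == 0.0f);
    assert(out.w == 0.0f);
}
```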
Comment 14 Gert Wollny 2019-06-13 16:25:28 UTC
Well, I already pointed out that the fix is by no means complete, because even if I provide TGSI that doesn't use TEX_LZ, I still have to take care of the DIV, which is a bit more tedious and for which there is currently no CAP.
Comment 15 AngryPenguin 2019-06-14 14:28:50 UTC
Hi.

I can confirm this fixes my issue with VAAPI and VDPAU.

Thanks.
Comment 16 Gert Wollny 2019-06-14 15:15:34 UTC
I've updated the MR to add a CAP for TGSI_OPCODE_DIV support and to check for it as well.
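
The idea behind such a check can be sketched as follows: the compute path is used only when the driver reports every capability the shaders rely on, not just compute support. This is a self-contained illustration under stated assumptions. The enum, struct and function names are hypothetical stand-ins, not the real Mesa caps or the code in the MR.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability identifiers; stand-ins for the caps discussed here. */
enum sketch_cap {
    SKETCH_CAP_COMPUTE,
    SKETCH_CAP_TEX_LZ,
    SKETCH_CAP_DIV,
    SKETCH_CAP_COUNT
};

/* Hypothetical driver description: one flag per capability. */
struct sketch_screen {
    bool caps[SKETCH_CAP_COUNT];
};

static bool sketch_get_cap(const struct sketch_screen *screen, enum sketch_cap cap)
{
    return screen->caps[cap];
}

/* Use the compute compositor only if every opcode it emits is supported. */
static bool use_compute_compositor(const struct sketch_screen *screen)
{
    return sketch_get_cap(screen, SKETCH_CAP_COMPUTE) &&
           sketch_get_cap(screen, SKETCH_CAP_TEX_LZ) &&
           sketch_get_cap(screen, SKETCH_CAP_DIV);
}

/* Small self-check with two made-up drivers. */
static void check_cap_gating(void)
{
    /* Models a driver as described in this bug: compute yes, TEX_LZ and DIV no. */
    struct sketch_screen r600_like = { .caps = { [SKETCH_CAP_COMPUTE] = true } };
    struct sketch_screen full = { .caps = { true, true, true } };

    assert(!use_compute_compositor(&r600_like));
    assert(use_compute_compositor(&full));
}
```

With only the compute check, the r600-like driver in this sketch would have been routed to the compute shaders, which is the failure mode reported above.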
Comment 17 i.kalvachev 2019-06-26 17:58:10 UTC
My distribution just upgraded to Mesa-19.1.1 release and I hit this bug.

I can't believe the fix has been forgotten.

Please, push it ASAP.
Comment 18 Thomas Capricelli 2019-06-28 13:39:57 UTC
Just updated to 19.1.1 here, and the bug is still present: mpv can't play with the vdpau driver.
Comment 19 ITwrx 2019-07-08 16:10:24 UTC
This bug has halted the development of a project I am working on; it crashes the program.

arch linux 5.1.16.a-1-hardened
libva 2.4.1-1
libva-mesa-driver 19.1.1-1
libva-utils 2.4.1-1
mesa 19.1.1-1

i have a Radeon HD 5830

thanks
Comment 20 Juan A. Suarez 2019-07-09 09:38:41 UTC
The fix has landed in the 19.1.2 release.

Can you try it again?
Comment 21 ITwrx 2019-07-09 13:14:36 UTC
@juan sure, but i'll probably wait long enough for arch to package the new version. Thanks for letting me know.
Comment 22 AngryPenguin 2019-07-11 12:58:24 UTC
I can confirm. 19.1.2 fixed my issue on OpenMandriva.
Thanks
Comment 23 i.kalvachev 2019-07-11 21:25:23 UTC
I can confirm that mesa-19.1.2 works for me too.
Comment 24 ITwrx 2019-07-13 06:17:59 UTC
19.1.2 did the trick. thanks
