Bug 86837 - kodi segfault since auxiliary/vl: rework the build of the VL code
Summary: kodi segfault since auxiliary/vl: rework the build of the VL code
Status: CLOSED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: mesa-dev
QA Contact:
URL:
Whiteboard:
Keywords: bisected, have-backtrace, regression
: 87054 87769 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-11-28 23:43 UTC by Andy Furniss
Modified: 2016-07-09 18:11 UTC (History)
6 users (show)

See Also:
i915 platform:
i915 features:


Attachments
vdpau trace showing errors (9.80 KB, text/plain)
2014-11-29 00:06 UTC, Andy Furniss
Details
build.log (1.58 MB, text/plain)
2015-02-10 20:09 UTC, darkbasic
Details

Description Andy Furniss 2014-11-28 23:43:04 UTC
radeonsi, I am getting a segfault with kodi (xbmc) since -

commit c642e87d9f423c78bf631410e858f675292ba0c4
Author: Emil Velikov <emil.l.velikov@gmail.com>
Date:   Mon Nov 10 18:59:34 2014 +0000

    auxiliary/vl: rework the build of the VL code
    
    Rather than shoving all the VL code for non-VL targets, increasing
    their size, just split it out and use it when needed. This gives us
    the side effect of building vl_winsys_dri.c once, dropping a few
    automake warnings, and reducing the size of the dri modules as below
    
       text    data     bss     dec     hex filename
    5850573  187549 1977928 8016050  7a50b2 before/nouveau_dri.so
    5508486  187100  391240 6086826  5ce0aa after/nouveau_dri.so
    
    The above data is for a nouveau + swrast + kms_swrast 'megadriver'.
    
    v2: Do not include the vl sources in the auxiliary library.
    v3: Rebase. Add nine.



Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbbfff700 (LWP 4123)]
vlVdpVideoMixerRender (mixer=0, background_surface=4294967295, background_source_rect=0x0, current_picture_structure=VDP_VIDEO_MIXER_PICTURE_STRUCTURE_BOTTOM_FIELD, video_surface_past_count=4, video_surface_past=0x7fffbbffec90, 
    video_surface_current=2, video_surface_future_count=2, video_surface_future=0x18, video_source_rect=0x7fffbbffeca0, destination_surface=5, destination_rect=0x7fffbbffecb0, destination_video_rect=0x7fffbbffecb0, layer_count=0, 
    layers=0x0) at mixer.c:254
254        if (vmixer->video_width > video_buffer->width ||
(gdb) bt 
#0  vlVdpVideoMixerRender (mixer=0, background_surface=4294967295, background_source_rect=0x0, current_picture_structure=VDP_VIDEO_MIXER_PICTURE_STRUCTURE_BOTTOM_FIELD, video_surface_past_count=4, video_surface_past=0x7fffbbffec90, 
    video_surface_current=2, video_surface_future_count=2, video_surface_future=0x18, video_source_rect=0x7fffbbffeca0, destination_surface=5, destination_rect=0x7fffbbffecb0, destination_video_rect=0x7fffbbffecb0, layer_count=0, 
    layers=0x0) at mixer.c:254
#1  0x000000000095bca7 in VDPAU::CMixer::ProcessPicture (this=0x0, this@entry=0x7fffbc03f000) at VDPAU.cpp:2523
#2  0x000000000095e583 in VDPAU::CMixer::StateMachine (this=this@entry=0x7fffbc03f000, signal=2, port=port@entry=0x0, msg=0x7fffc00b2110) at VDPAU.cpp:1509
#3  0x000000000095ea9d in VDPAU::CMixer::Process (this=0x7fffbc03f000) at VDPAU.cpp:1615
#4  0x0000000001babadf in CThread::Action (this=0x7fffbc03f000) at Thread.cpp:221
#5  0x0000000001babd89 in CThread::staticThread (data=0x7fffbc03f000) at Thread.cpp:131
#6  0x00007ffff6029d7a in start_thread (arg=0x7fffbbfff700) at pthread_create.c:308
#7  0x00007fffef3a1a9d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Comment 1 Andy Furniss 2014-11-29 00:06:21 UTC
Created attachment 110209 [details]
vdpau trace showing errors

A trace shows that 'No backend implementation could be loaded.'

mplayer is working though.
Comment 2 Emil Velikov 2014-11-30 09:22:51 UTC
Sigh... sorry for the break Andy. Seems that I should have tested with nv_vdpau_interop apps, which was never really worked with nouveau :(
I'll take a look at this over the next few days.
Comment 3 Christian König 2014-11-30 13:09:55 UTC
Most likely we end up with the implementation with the dummy symbols loaded first (e.g. kodi loads OGL first the VDPAU). This results in a couple of problems when VDPAu tries to query the video capabilities from the screen.

Would it be possible to define the dummy symbols as weak and pointing to zero instead? That might allow overwriting them after loading VDPAU.
Comment 4 Kertesz Laszlo 2014-12-06 22:19:26 UTC
*** Bug 87054 has been marked as a duplicate of this bug. ***
Comment 5 Emil Velikov 2014-12-19 12:58:54 UTC
(In reply to Christian König from comment #3)
> Most likely we end up with the implementation with the dummy symbols loaded
> first (e.g. kodi loads OGL first the VDPAU). This results in a couple of
> problems when VDPAu tries to query the video capabilities from the screen.
> 
> Would it be possible to define the dummy symbols as weak and pointing to
> zero instead? That might allow overwriting them after loading VDPAU.
>
There are two catches with using weak symbols:

1. Both modules must be opened with RTLD_GLOBAL (presently holds true).
2. The VDPAU must be opened/used prior to GL. Otherwise we end up with one of two issues:
 A: DRI module doesn't define the functions -> segfault, as we call function @0
 B: DRI module defines stubs -> the stubs end up used, rather than the ones in VDPAU module.

So based (and the note below) on those I'll revert the previous approach and shove everything but vl_winsys_dri.c back into the DRI modules :'(

On a closer look it seems that we have a _very_ nasty ABI between the DRI and VDPAU modules, which lacks any checks and can explode at any moment.

Would be interesting to see if the nvidia driver uses similar approach or copies the objects back and forth between the two contexts, keeping them in sync. An interesting TODO for rainy days :)
Comment 6 Emil Velikov 2014-12-19 13:03:53 UTC
Christian,

Are you concerned about the hidden and unchecked ABI mentioned - can you see an easy way around it ?
Comment 7 Emil Velikov 2014-12-19 14:05:15 UTC
Guys can you confirm that the patch posted brings back gl-vdpau interop :)
Comment 8 Christian König 2014-12-19 14:10:43 UTC
(In reply to Emil Velikov from comment #6)
> Christian,
> 
> Are you concerned about the hidden and unchecked ABI mentioned - can you see
> an easy way around it ?

Not really. To be honest I don't have much experience with such kind of linker hacks.

But as far as I know if we use weak symbols it shouldn't matter if VDPAU or OGL is loaded first. The strong symbols in the VDPAU module should overwrite the weak ones in the DRI module.

As for letting the two interfaces work together that obviously only works if you use the same version of the gallium interface for both.
Comment 9 Emil Velikov 2014-12-20 11:25:09 UTC
(In reply to Christian König from comment #8)
> (In reply to Emil Velikov from comment #6)
> > Christian,
> > 
> > Are you concerned about the hidden and unchecked ABI mentioned - can you see
> > an easy way around it ?
> 
> Not really. To be honest I don't have much experience with such kind of
> linker hacks.
> 
> But as far as I know if we use weak symbols it shouldn't matter if VDPAU or
> OGL is loaded first. The strong symbols in the VDPAU module should overwrite
> the weak ones in the DRI module.
> 
Unfortunately it does not seem to work. Check out this example project [1], and toggle the USE_FOO_DRI_FIRST at [2].

> As for letting the two interfaces work together that obviously only works if
> you use the same version of the gallium interface for both.
>
True, but we currently do not check if there is a miss-match. Thus my "hidden and unchecked" comment earlier.

[1] https://github.com/evelikov/weaksymbols
[2] https://github.com/evelikov/weaksymbols/blob/master/Makefile.am#L6
Comment 10 bgunteriv 2014-12-27 16:15:33 UTC
please let me know what logs you need to test this error.
I'm willing to help to get this fixed and back on track.
Comment 11 smoki 2014-12-27 18:15:10 UTC
*** Bug 87769 has been marked as a duplicate of this bug. ***
Comment 12 Emil Velikov 2014-12-29 22:23:01 UTC
Seems like Christian dropped the link with the tentative fix.
http://patchwork.freedesktop.org/patch/39400/

Guys can you test this please ?

Christian, kindly tell me your opinion to the proposal in the patch thread.
Comment 13 Andy Furniss 2014-12-30 14:52:37 UTC
(In reply to Emil Velikov from comment #12)
> Seems like Christian dropped the link with the tentative fix.
> http://patchwork.freedesktop.org/patch/39400/
> 
> Guys can you test this please ?

Works OK for me testing with Kodi.
Comment 14 smoki 2014-12-30 21:20:26 UTC
(In reply to Emil Velikov from comment #12)
> Seems like Christian dropped the link with the tentative fix.
> http://patchwork.freedesktop.org/patch/39400/
> 
> Guys can you test this please ?
>

 Works OK for me too with mpv. Bug 87769
Comment 15 John 2015-01-02 21:33:08 UTC
I can confirm this patch allows me to be back on git as well.
Since November or so I've been unable to use vdpau with mesa-git, but with this patch it works.
Comment 16 bgunteriv 2015-01-04 19:33:51 UTC
I can also confirm that this patch works.

Running OpenGL version:2.1 Mesa 10.4.0(git-fb3f7c0)

great work!
Comment 17 bgunteriv 2015-01-04 20:05:13 UTC
(In reply to Andy Furniss from comment #13)
> (In reply to Emil Velikov from comment #12)
> > Seems like Christian dropped the link with the tentative fix.
> > http://patchwork.freedesktop.org/patch/39400/
> > 
> > Guys can you test this please ?
> 
> Works OK for me testing with Kodi.

I spoke too soon.
When trying to build for Kodi, my display works, but my videos do not play.

@Andy Furniss, what is your command line for building mesa?
I'm using this -- ./autogen.sh --prefix=/opt/xorg --with-gallium-drivers=r600 --with-dri-drivers=radeon --enable-glx-tls


am I missing something?
Comment 18 bgunteriv 2015-01-05 01:35:39 UTC
(In reply to bgunteriv from comment #17)
> (In reply to Andy Furniss from comment #13)
> > (In reply to Emil Velikov from comment #12)
> > > Seems like Christian dropped the link with the tentative fix.
> > > http://patchwork.freedesktop.org/patch/39400/
> > > 
> > > Guys can you test this please ?
> > 
> > Works OK for me testing with Kodi.
> 
> I spoke too soon.
> When trying to build for Kodi, my display works, but my videos do not play.
> 
> @Andy Furniss, what is your command line for building mesa?
> I'm using this -- ./autogen.sh --prefix=/opt/xorg
> --with-gallium-drivers=r600 --with-dri-drivers=radeon --enable-glx-tls
> 
> 
> am I missing something?

okay, realizing that maybe I don't have the patches included previous to this for Kodi.
Comment 19 Andy Furniss 2015-01-05 11:41:58 UTC
(In reply to bgunteriv from comment #17)
> (In reply to Andy Furniss from comment #13)
> > (In reply to Emil Velikov from comment #12)
> > > Seems like Christian dropped the link with the tentative fix.
> > > http://patchwork.freedesktop.org/patch/39400/
> > > 
> > > Guys can you test this please ?
> > 
> > Works OK for me testing with Kodi.
> 
> I spoke too soon.
> When trying to build for Kodi, my display works, but my videos do not play.
> 
> @Andy Furniss, what is your command line for building mesa?
> I'm using this -- ./autogen.sh --prefix=/opt/xorg
> --with-gallium-drivers=r600 --with-dri-drivers=radeon --enable-glx-tls
> 
> 
> am I missing something?

I see it may be kodi related  - I was testing with git.

My build is for radeonsi so looks different anyway.

I don't know if your prefix will mean any extra setup needed for vdpau, possibly not - it's a long time since I installed/prefix other than /usr. 

Haven't retested recently but one thing that I need(ed) on si is --sysconfdir=/etc as for si there are some gpu lockup saving app specific lines in /etc/drirc and it doesn't (or didn't) get read from $prefix/etc which is where it will go by default. 

I also leave --with-dri-drivers=
just like above to build nothing.
Comment 20 Andy Furniss 2015-01-05 11:46:21 UTC
(In reply to Andy Furniss from comment #19)
> (In reply to bgunteriv from comment #17)

> > @Andy Furniss, what is your command line for building mesa?

Forgot to put I also have 

--enable-texture-float

Not that it's likely to affect this issue anyway.
Comment 21 Aaron Watry 2015-01-05 15:33:54 UTC
(In reply to bgunteriv from comment #17)
> (In reply to Andy Furniss from comment #13)
> > (In reply to Emil Velikov from comment #12)
> > > Seems like Christian dropped the link with the tentative fix.
> > > http://patchwork.freedesktop.org/patch/39400/
> > > 
> > > Guys can you test this please ?
> > 
> > Works OK for me testing with Kodi.
> 
> I spoke too soon.
> When trying to build for Kodi, my display works, but my videos do not play.
> 
> @Andy Furniss, what is your command line for building mesa?
> I'm using this -- ./autogen.sh --prefix=/opt/xorg
> --with-gallium-drivers=r600 --with-dri-drivers=radeon --enable-glx-tls
> 
> 
> am I missing something?

Where is your libvdpau.so installed?  The vdpau drivers need to be in a subdirectory named vdpau under the same directory that the libvdpau.so is installed in.

If libvdpau.so is in /usr/lib/x86_64-linux-gnu, and you've got your drivers installed to /opt/xorg, then copy /opt/xorg/lib/vdpau/* to /usr/lib/x86_64-linux-gnu/vdpau/.

You may want to use something like smplayer to make sure that vdpau playback works before running something full-screen like kodi...

LIBGL_DEBUG=verbose smplayer

Then make sure that you're configured to use vdpau for playback, and try to play a file.  Then view the mplayer log (in the menus), and see if there's any errors about not being able to find a vdpau driver.
Comment 22 bgunteriv 2015-01-05 18:05:50 UTC
(In reply to Aaron Watry from comment #21)
> (In reply to bgunteriv from comment #17)
> > (In reply to Andy Furniss from comment #13)
> > > (In reply to Emil Velikov from comment #12)
> > > > Seems like Christian dropped the link with the tentative fix.
> > > > http://patchwork.freedesktop.org/patch/39400/
> > > > 
> > > > Guys can you test this please ?
> > > 
> > > Works OK for me testing with Kodi.
> > 
> > I spoke too soon.
> > When trying to build for Kodi, my display works, but my videos do not play.
> > 
> > @Andy Furniss, what is your command line for building mesa?
> > I'm using this -- ./autogen.sh --prefix=/opt/xorg
> > --with-gallium-drivers=r600 --with-dri-drivers=radeon --enable-glx-tls
> > 
> > 
> > am I missing something?
> 
> Where is your libvdpau.so installed?  The vdpau drivers need to be in a
> subdirectory named vdpau under the same directory that the libvdpau.so is
> installed in.
> 
> If libvdpau.so is in /usr/lib/x86_64-linux-gnu, and you've got your drivers
> installed to /opt/xorg, then copy /opt/xorg/lib/vdpau/* to
> /usr/lib/x86_64-linux-gnu/vdpau/.
> 

that did it!
working properly now, with the patch.

thanks @Aaron Watry.
Comment 23 Jose P. 2015-01-14 00:14:01 UTC
Any news on this?
Comment 24 Jeff 2015-02-05 08:50:14 UTC
Will this be making it into a future Mesa release?
Comment 25 Emil Velikov 2015-02-10 16:30:15 UTC
Hi guys, sorry about the delay.

Christian had some comments on the original patch which I've addressed here - http://patchwork.freedesktop.org/patch/42201/ Can you give it a spin ?
Comment 26 Andy Furniss 2015-02-10 18:07:18 UTC
(In reply to Emil Velikov from comment #25)
> Hi guys, sorry about the delay.
> 
> Christian had some comments on the original patch which I've addressed here
> - http://patchwork.freedesktop.org/patch/42201/ Can you give it a spin ?

Works for me.
Comment 27 darkbasic 2015-02-10 20:09:52 UTC
Created attachment 113327 [details]
build.log

It doesn't build anymore with your latest patch (Mesa-dev-4-4-auxiliary-vl-bring-back-the-VL-code-for-the-dri-targets.patch).
With the old "Mesa-dev-auxiliary-vl-move-most-sources-back-into-the-aux-library.patch" it still compiles and work flawlessly.
Comment 28 John 2015-02-10 21:45:17 UTC
(In reply to Emil Velikov from comment #25)
> Hi guys, sorry about the delay.
> 
> Christian had some comments on the original patch which I've addressed here
> - http://patchwork.freedesktop.org/patch/42201/ Can you give it a spin ?

I just tried building mesa-git with it.
It breaks building openmax:

../../../../src/gallium/state_trackers/omx/.libs/libomxtracker.a(entrypoint.o): In function `omx_get_screen':
entrypoint.c:(.text+0x92): undefined reference to `vl_screen_create'
../../../../src/gallium/state_trackers/omx/.libs/libomxtracker.a(entrypoint.o): In function `omx_put_screen':
entrypoint.c:(.text+0xf1): undefined reference to `vl_screen_destroy'

but if I disable that state tracker, then I have no issue building and vdpau works fine as with the previous patch.

So... almost there I guess.
Comment 29 Emil Velikov 2015-02-10 22:28:54 UTC
Forgot to update the omx target, silly me - v2 of the patch should handle that
http://patchwork.freedesktop.org/patch/42260/
Comment 30 John 2015-02-10 23:15:24 UTC
(In reply to Emil Velikov from comment #29)
> Forgot to update the omx target, silly me - v2 of the patch should handle
> that
> http://patchwork.freedesktop.org/patch/42260/

That works!
Thanks for the quick v2.
Comment 31 darkbasic 2015-02-10 23:28:49 UTC
Works fine here, thanks.
Comment 32 Emil Velikov 2015-02-12 20:45:54 UTC
commit c39dbfdd0f764b1aaa7319b4694e7335692993dd
Author: Emil Velikov <emil.l.velikov@gmail.com>
Date:   Tue Feb 10 15:11:09 2015 +0000

    auxiliary/vl: bring back the VL code for the dri targets

The patch has landed in master + 10.5(rc1). Thanks guys for testing and sorry again for breaking it.
Comment 33 Christian König 2016-07-09 18:11:24 UTC
Already fixed for quite a while.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.