Bug 73200

Summary: vdpau-GL interop fails due to different screen objects
Product: Mesa Reporter: Matthew Waters <ystreet00>
Component: Drivers/DRI/nouveauAssignee: Maarten Lankhorst <bugs>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium    
Version: 10.0   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: gdb log
cache dri fd
[PATCH] winsys/nouveau: set nouveau_drm_screen_create as public
ld_preload nouveau_drm_screen_create
cache screen, try 2

Description Matthew Waters 2014-01-01 05:41:46 UTC
Basically I'm trying out the GL_NV_vdpau_interop extension and it's failing with GL_INVALID_OPERATION inside mesa/state_tracker/st_vdpau.c:121 st_vdpau_map_surface() because the GL screen and the vdpau screen are different.

Things I've tried:
 - Straight vdpau/x11 (no GL) - works nicely
 - #if 0 the screen equal check - Fails with 'Kernel rejected pushbuf'.  From this I made the assumption that the screen represents some gpu context and thus the object space that is addressable.  Also what I gathered from the libdrm code.
 - Copying the fd hash table from the radeon_drm_winsys_create() into nouveau_drm_screen_create().  That fails to work because vl_screen_create() and dri2CreateScreen() both create seperate drm fds resulting in different entries in the hash table (that's not the same - see next point)
 - static screen singleton (ignoring subsequent drm fds) however nouveau_drm_screen_create is duplicated in both /usr/lib/vdpau/libvdpau_nouveau.so and /usr/lib/xorg/modules/dri/nouveau_dri.so and thus have different locations.

So all of my attempts to get the screens the same have so far failed and I am not all that familiar with mesa internals to suggest a solution :)

Versions:
$ uname -a
Linux matt-arch 3.12.6-1-ARCH #1 SMP PREEMPT Fri Dec 20 19:39:00 CET 2013 x86_64 GNU/Linux
$ pacman -Si mesa | grep Version
Version        : 10.0.1-1
$ pacman -Si libdrm | grep Version
Version        : 2.4.50-1

Some logs follow.
Comment 1 Matthew Waters 2014-01-01 05:43:03 UTC
Created attachment 91381 [details]
gdb log
Comment 2 Ilia Mirkin 2014-01-01 06:34:55 UTC
It would help if you attached a sample app that shows the failure. I don't think there are any NV_vdpau_interop piglit tests.

Something needs to be done in the src/gallium/targets/*-nouveau/* logic. Radeon (somehow, I haven't traced it all through yet) seems to use auxiliary/vl/vl_winsys_dri.c. Nouveau has a drm_configuration for dri-nouveau but not vdpau-nouveau -- perhaps that needs to be copied over as well? I have no idea what it does, but it has something about sharing fd's in the code -- could be relevant to the problem at hand.

[Frankly I'm not entirely sure how this is supposed to work given megadrivers, but I'm sure it's just a lack of understanding on my part rather than a shortcoming of the system.]
Comment 3 Matthew Waters 2014-01-01 06:56:16 UTC
The only 'app' I have is a gstreamer launch line with gstreamer-vaapi (git master for some serious perf improvements) which looks like this:

LIBVA_DRIVER_NAME=vdpau gst-launch-1.0 playbin uri=file:///path/to/file.avi video-sink='vaapisink use-glx=1'

The use-glx=1 flag is the thing that uses the NV_vdpau_interop path.

or with gst-plugins-gl (still requires gstreamer-vaapi for decoding) from git (fd.o):

LIBVA_DRIVER_NAME=vdpau gst-launch-1.0 playbin uri=file:///path/to/file.avi video-sink='glimagesink'
Comment 4 Emil Velikov 2014-01-25 03:35:52 UTC
AFAIK the whole issue is that nouveau does have 1:1 relation between screen and a context.

For a number of reasons we would like support for multiple contexts per screen, which afaics will allow us (or is a decent starting point) to allow a context to be shared between the vdpau and the gl state trackers.

I may be a bit off with the above assumptions, so don't quote me on that :)
Comment 5 Maarten Lankhorst 2014-02-06 10:56:11 UTC
The screen is hashed based on inode major/minor, if we copy that logic we should be able to do the same.
Comment 6 Maarten Lankhorst 2014-02-06 11:30:57 UTC
Created attachment 93520 [details] [review]
cache dri fd

Based on the radeon version, cache inode major/minor to only create a single screen.

Does this fix things?
Comment 7 Emil Velikov 2014-02-06 16:36:06 UTC
(In reply to comment #6)
> Created attachment 93520 [details] [review] [review]
> cache dri fd
> 
> Based on the radeon version, cache inode major/minor to only create a single
> screen.
> 
> Does this fix things?

Haven't tested it, but noticed a couple of interesting things

* util_hash_table_create can fail
	fd_tab = util_hash_table_create(hash_fd, compare_fd);
+	assert(fd_tab);


* if screen fails, we are at the mercy of the compiler if we null ptr deref or not

+	if (screen)
+		util_hash_table_set(fd_tab, intptr_to_pointer(fd), screen);
+	return &screen->base;

+	if (!screen)
+		return NULL;
+	util_hash_table_set(fd_tab, intptr_to_pointer(fd), screen);
+	return &screen->base;

* nouveau_drm_screen_unref might need a prototype to build.

* drop the fd_tab check in nouveau_drm_screen_unref. it must be non NULL at this point

* fold nouveau_drm_screen_unref uses into nouveau_screen_fini()
Comment 8 Matthew Waters 2014-02-06 20:10:18 UTC
(In reply to comment #6)
> Created attachment 93520 [details] [review] [review]
> cache dri fd
> 
> Based on the radeon version, cache inode major/minor to only create a single
> screen.
> 
> Does this fix things?

No :).

Tracing through, both times that nouveau_drm_screen_create is called, fd_tab is NULL.
Comment 9 Emil Velikov 2014-02-08 05:24:51 UTC
Created attachment 93639 [details] [review]
[PATCH] winsys/nouveau: set nouveau_drm_screen_create as public

(In reply to comment #8)
[snip]
> Tracing through, both times that nouveau_drm_screen_create is called, fd_tab
> is NULL.

Is the nouveau_drm_screen_create func public ? If not give this patch at try on top of Maarten's work.
Comment 10 Matthew Waters 2014-02-08 09:12:24 UTC
(In reply to comment #9)
> Created attachment 93639 [details] [review] [review]
> [PATCH] winsys/nouveau: set nouveau_drm_screen_create as public
> 
> (In reply to comment #8)
> [snip]
> > Tracing through, both times that nouveau_drm_screen_create is called, fd_tab
> > is NULL.
> 
> Is the nouveau_drm_screen_create func public ? If not give this patch at try
> on top of Maarten's work.

Same result, fd_tab is NULL and has different locations the two times it is called.
Comment 11 Matthew Waters 2014-02-08 22:19:28 UTC
Created attachment 93678 [details]
ld_preload nouveau_drm_screen_create

Even LD_PRELOAD ing nouveau_drm_screen_create doesn't work so something else is missing.
Comment 12 Maarten Lankhorst 2014-02-11 10:43:36 UTC
The patch from comment 8 applied on top of my patch should work in theory, if the newly compiled nouveau_dri.so is also used. This is the whole reason for nouveau_drm_screen_create being exported in VDPAU_EXPORTS from src/gallium/targets/vdpau-nouveau. Both libs need to export nouveau_drm_screen_create for this to work.

Looks like some debugging fun is required. :/ Is the correct nouveau_dri.so picked up? You can check with LIBGL_DEBUG=verbose.
Comment 13 Maarten Lankhorst 2014-02-11 11:04:48 UTC
For what it's worth, I performed the same testing and it worked locally with both patches plugged in.
Comment 14 Maarten Lankhorst 2014-02-11 12:43:01 UTC
Created attachment 93859 [details]
cache screen, try 2

This time with --dynamic-list= in nouveau_dri.so to fixup the bug. Can you verify that this  version works?

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.