Created attachment 114987 [details] build: enable va with nouveau driver When running vainfo: libva info: VA-API version 0.37.1 libva info: va_getDriverName() returns 0 libva info: User requested driver 'gallium' libva info: Trying to open /usr/local/lib/dri/gallium_drv_video.so libva info: Found init function __vaDriverInit_0_37 libva error: /usr/local/lib/dri/gallium_drv_video.so init failed libva info: va_openDriver() returns 2 vaInitialize failed with error code 2 (resource allocation failed),exit "vl_screen_create" calls "dd_create_screen" which returns NULL because the block #if defined(GALLIUM_NOUVEAU) if (strcmp(driver_name, "nouveau") == 0) return pipe_nouveau_create_screen(fd); else #endif is not compiled for va target. Indeed GALLIUM_NOUVEAU is not defined at this point. See following patch.
Does it actually work though? (i.e. decode videos, display them, etc)
It didn't last time I've checked. st/va requires chunked decoding which nouveau does not support atm.
basic libva unit tests passes but it fails to decode 2 videos mpeg12 (nouveau_vp3_fill_picparm_mpeg12_bsp) and h264 (nouveau_vp3_fill_picparm_h264_bsp). nouveau_vp3_fill_picparm_mpeg12_bsp (map=0x7ffff7e3c000 <error: Cannot access memory at address 0x7ffff7e3c000>, desc=0x0, dec=0x7fffe004a000) at nouveau_vp3_video_bsp.c:121 121 pic_bsp->picture_structure = desc->picture_structure; (gdb) bt #0 nouveau_vp3_fill_picparm_mpeg12_bsp (map=0x7ffff7e3c000 <error: Cannot access memory at address 0x7ffff7e3c000>, desc=0x0, dec=0x7fffe004a000) at nouveau_vp3_video_bsp.c:121 #1 nouveau_vp3_bsp (dec=dec@entry=0x7fffe004a000, desc=desc@entry=..., target=target@entry=0x7fffe001fe00, comm_seq=comm_seq@entry=2, num_buffers=num_buffers@entry=1, data=data@entry=0x7fffe7ffdcd0, num_bytes=num_bytes@entry=0x7fffe7ffdcc0) at nouveau_vp3_video_bsp.c:257 #2 0x00007ffff072118e in nvc0_decoder_bsp (dec=dec@entry=0x7fffe004a000, desc=desc@entry=..., target=target@entry=0x7fffe001fe00, comm_seq=comm_seq@entry=2, num_buffers=1, data=0x7fffe7ffdcd0, num_bytes=0x7fffe7ffdcc0, vp_caps=vp_caps@entry=0x7fffe7ffdbd8, is_ref=is_ref@entry=0x7fffe7ffdbdc, refs=refs@entry=0x7fffe7ffdbe0) at nvc0/nvc0_video_bsp.c:117 #3 0x00007ffff0721650 in nvc0_decoder_decode_bitstream (decoder=0x7fffe004a000, video_target=0x7fffe001fe00, picture=0x0, num_buffers=<optimised out>, data=<optimised out>, num_bytes=<optimised out>) at nvc0/nvc0_video.c:48 Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffe7fff700 (LWP 854)] nouveau_vp3_bsp (dec=dec@entry=0x7fffe0068040, desc=desc@entry=..., target=target@entry=0x7fffe0063110, comm_seq=comm_seq@entry=2, num_buffers=num_buffers@entry=2, data=data@entry=0x7fffe7ffe150, num_bytes=num_bytes@entry=0x7fffe7ffe140) at nouveau_vp3_video_bsp.c:270 270 caps = nouveau_vp3_fill_picparm_h264_bsp(dec, desc.h264, bsp); (gdb) bt #0 nouveau_vp3_bsp (dec=dec@entry=0x7fffe0068040, desc=desc@entry=..., target=target@entry=0x7fffe0063110, comm_seq=comm_seq@entry=2, num_buffers=num_buffers@entry=2, data=data@entry=0x7fffe7ffe150, num_bytes=num_bytes@entry=0x7fffe7ffe140) at nouveau_vp3_video_bsp.c:270 #1 0x00007ffff02a918e in nvc0_decoder_bsp (dec=dec@entry=0x7fffe0068040, desc=desc@entry=..., target=target@entry=0x7fffe0063110, comm_seq=comm_seq@entry=2, num_buffers=2, data=0x7fffe7ffe150, num_bytes=0x7fffe7ffe140, vp_caps=vp_caps@entry=0x7fffe7ffe058, is_ref=is_ref@entry=0x7fffe7ffe05c, refs=refs@entry=0x7fffe7ffe060) at nvc0/nvc0_video_bsp.c:117 #2 0x00007ffff02a9650 in nvc0_decoder_decode_bitstream (decoder=0x7fffe0068040, video_target=0x7fffe0063110, picture=0x0, num_buffers=<optimised out>, data=<optimised out>, num_bytes=<optimised out>) at nvc0/nvc0_video.c:48 Worth to mention that gallium/vdpau succeeds to decode these 2 videos. Also libva succeeds to decode video using libva's vdpau-driver on top of gallium/vdpau.
Seems (almost) identical to the ones I've seen. Iirc the debug build does assert out a bit earlier. If interested you can take a look at adding the missing functionality (to drivers/nouveau) but as-is I'm not sure what nouveau can work with st/va :-\
I am interested but I would need some infos to start :) Also I do not understand why a functionality is missing in drivers/nouveau since st/vdpau works (which means nouveau_vp3_fill_picparm_mpeg12_bsp and nouveau_vp3_fill_picparm_h264_bsp work). So I was suspecting something wrong in st/va/picture.c Using the above bt, I noticed that for st/vpdau http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/vdpau/decode.c#n557 the third param is well defined. Whereas for st/va http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/va/picture.c#n520 the third param is NULL which leads to the crash. Few minutes ago I tried naively to replace this NULL param with "&context->desc.base" but then I can see a wrong image in my application when rendering decoded images.
Unlike public API's, gallium API's are fluid and not extremely well-defined. They're primarily defined by the existing users, and secondarily by documentation when those users disagree. Video API's are especially poorly defined, as there are only 2 real driver implementations (radeon + nouveau), and until recently, only st/xvmc and st/vdpau (and everyone except me hates st/xvmc). I believe that st/va had to make some changes to how functions were called and what parameters they were allowed to have, but the updates were only made to the radeon drivers. Unfortunately I don't know what those updates were. Emil mentioned chunked decoding, but I don't really know what that is. Perhaps you can look at the st/va enablement patches for radeon and see what all they had to change.
Chunked decoding means that between begin picture and end picture the state tracker calls the decode function multiple times for the same picture. E.g. you don't get the bitstream all at once, but rather in small chunks. Nouveau can't deal with that cause Maarten didn't wanted that interface, radeon on the other hand never used anything else. You just need to fix nouveau to implement bitstream buffer resizing in the same way radeon does it (or move that stuff into the state tracker like I always suggested).
Created attachment 117214 [details] [review] WIP - nouveau: add support for chunk decoding A: resize buffer in nvc0_decoder_bsp: http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_video_bsp.c#n73 B: memcpy of the data in nouveau_vp3_bsp: http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c#n294 For radeon driver it both happens in ruvd_decode_bitstream. To compare radeon to nouveau: struct ruvd_decoder.bs_buffers is an array of struct rvid_buffer that hold a ref on struct r600_resource, aka pipe_resource. struct nouveau_vp3_decoder.bsp_bo is an array of struct nouveau_bo (which comes from libdrm/nouveau library) So if I am right, in the function nvc0_decoder_decode_bitstream it should resize the nouveau_bo buffers and store current data, i.e. A and B. So I made a patch that first splits nvc0_decoder_bsp to extract A and B into a new function nvc0_decoder_bsp_next_chunk. The remaining part of nvc0_decoder_bsp go to nvc0_decoder_bsp_end_frame. I just realize the patch is missing the preserving part while resizing buffers (equivalent of the memcpy in rvid_resize_buffer http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/radeon/radeon_video.c#n100) that could explain why it is still not working. But I wanted to know if I am on the right track or not ?
Created attachment 117215 [details] [review] WIP - nouveau: add support for chunk decoding Actual wip patch.
Wouldn't it be simpler to just accumulate user buffers and then do all the work in end_frame instead? And like Christian said, it'd be easy enough to get st/va to do this.
(In reply to Ilia Mirkin from comment #10) > Wouldn't it be simpler to just accumulate user buffers and then do all the > work in end_frame instead? And like Christian said, it'd be easy enough to > get st/va to do this. Accumulating all the user buffers in the state tracker is what I initially suggested as well and I wouldn't mind if it goes into this direction in the long term. You would need to add a new type for the bitstream buffer (we currently already abuse vertext buffers in the OMX encoder for this). For the short term your patch looks like it's going into the right direction.
Created attachment 117358 [details] [review] WIP - nouveau: add support for chunk decoding I updated patch since I miss understood some parts last time. Preserving data while resizing is commented out for testing since the initial buffer is big enough for the stream I tried. The patch keeps st/vdpau working but st/va still does not work. Indeed it shows a green frame. I do not see any shape but I see some pixels that let me think it does something. How can I be sure it is not a problem from st/va ? I assume it works with radeon (only radeon for now ? according to src/gallium/targets/va/Makefile.am I was expecting intel too) For nouveau, maybe it requires to do more work on st/va side. I can try to go for the other solution to compare. Which internal or external unit tests can I use ? Also if you have an idea how I can debug this ? Maybe I can compare what contains the nouveau_bo buffer between one begin/end, between vdpau and va ? Not sure if they should contain the exact same thing at the time _end is called. Thx
You can compare what's going on in the two cases by using valgrind-mmt (http://nouveau.freedesktop.org/wiki/Valgrind-mmt/) in combination with the demmt tool in envytools. However it will be quite tough to tell what's going on since the interesting stuff passed around in buffers rather than sending commands in the fifo. You should still see the buffers being written though.
Created attachment 117399 [details] 2 valgrind outputs to compare failure case with working reference case Failure: LIBVA_DRIVER_NAME=gallium valgrind --tool=mmt --mmt-trace-nouveau-ioctls --log-file=va_mesa_bin.log gst-launch-1.0 filesrc location=/home/SERILOCAL/j.isorce/Downloads/centaur_2.mpg num-buffers=6 ! mpegpsdemux ! mpegvideoparse ! vaapideco^C ! tee ! vdieoconvert ! xvimagesink Success: LIBVA_DRIVER_NAME=vdpau valgrind --tool=mmt --mmt-trace-nouveau-ioctls --log-file=va_over_vdpau_mesa_bin.log gst-launch-1.0 filesrc location=/home/SERILOCAL/j.isorce/Downloads/centaur_2.mpg num-buffers=6 ! mpegpsdemux ! mpegvideoparse ! vaapidecode ! tee ! videoconvert ! xvimagesink Where should I look at in these 2 logs to compare what contain the nouveau_bo buffer between the first st/va being/end frame ? (using demmt -l)
Created attachment 117910 [details] [review] WIP - nouveau: add support for chunk decoding It now works on mpeg12 using vlc+vaapi+mesa/st/va/nouveau. The problem from the last patch was that the "desc" was written to the nouveau_bo before being actually updated. vdpau backend still works, and vaapi backend using vdpau works too. But for now the video is not showing correctly using gstreamer-vaapi so I need to investigate this.
Comment on attachment 117910 [details] [review] WIP - nouveau: add support for chunk decoding Actually it works with GStreamer too using vaapisink. So I am going to submit patches to mailing list. I splited it in 8 patches to make it more clear.
Comment on attachment 114987 [details] build: enable va with nouveau driver Will be submitted to mailing list
Created attachment 117928 [details] [review] [PATCH 0/8] nouveau: add support for vaapi
Created attachment 117929 [details] [review] [PATCH 1/8] nouveau: extract memcpy loop from nouveau_vp3_bsp
Created attachment 117930 [details] [review] [PATCH 2/8] nouveau: remove nouveau_vp3_bsp to use begin/next/end
Please post patches to list so that they may be reviewed.
Created attachment 117931 [details] [review] [PATCH 3/8] nouveau: split nvc0_decoder_bsp in begin/next/end
Created attachment 117932 [details] [review] [PATCH 4/8] nouveau: preserve content buffer when calling nvc0_decoder_bsp_next
Created attachment 117934 [details] [review] [PATCH 5/8] nouveau: remove nvc0_decoder_bsp and use begin/next/end instead
Created attachment 117936 [details] [review] [PATCH 6/8] nvc0: implement pipe_video_codec::begin_frame/end_frame
Created attachment 117937 [details] [review] [PATCH 7/8] nouveau: fix chunk decoding by updating number of slices
Created attachment 117938 [details] [review] [PATCH 8/8] build: enable st/va with nouveau driver
(In reply to Ilia Mirkin from comment #21) > Please post patches to list so that they may be reviewed. I cannot send them using git send-email. I am behind a proxy so I think that's why I cannot. That's why I attached them here. And home computer is broken for now.
Created attachment 117939 [details] Contains files generated with git send-email
Patches have been sent to mailing list for review. vc1, mpeg12, mpeg4 are working but not h264 (visible but lot of blockiness) Need someone to try existing st/va with a radeon card and using: LIBVA_DRIVER_NAME=gallium mpv --hwdec=vaapi anyH264Video.mp4
Fixed in current master.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.