Bug 99029 - VCE VAAPI segfault using ffmpeg
Summary: VCE VAAPI segfault using ffmpeg
Status: NEW
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600
Version: 17.1
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
Reported: 2016-12-08 14:37 UTC by Martin Bednar
Modified: 2017-05-11 16:32 UTC

Description Martin Bednar 2016-12-08 14:37:06 UTC
lspci:00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Trinity [Radeon HD 7480D]

libva info: Trying to open /usr/lib64/va/drivers/r600_drv_video.so
libva info: Found init function __vaDriverInit_0_39
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.39 (libva 1.7.3)
vainfo: Driver version: mesa gallium vaapi
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileNone                   : VAEntrypointVideoProc

commandline:
ffmpeg -vaapi_device /dev/dri/renderD128 -i Elephants_Dream_HD.avi -vf format=rgba,hwupload -bf 0 -c:v h264_vaapi test.mkv
run on a console over ssh.

backtrace:
#0  0x00007f4004a58e3f in create (enc=0x19ed530) at /var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/drivers/radeon/radeon_vce_40_2_2.c:98
#1  0x00007f4004a5cd3f in rvce_begin_frame (encoder=0x19ed530, source=0x16e1a70, picture=0x19131d8) at /var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/drivers/radeon/radeon_vce.c:288
#2  0x00007f400490cdec in vlVaEndPicture (ctx=<optimized out>, context_id=23) at /var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/va/picture.c:572
#3  0x00007f401cfcf08f in vaEndPicture (dpy=0x1691c20, context=23) at /var/tmp/portage/x11-libs/libva-1.7.3/work/libva-1.7.3/va/va.c:1232
#4  0x00007f401e202a9b in vaapi_encode_issue (avctx=avctx@entry=0x16e5060, pic=pic@entry=0x1745500) at src/libavcodec/vaapi_encode.c:387
#5  0x00007f401e202c16 in vaapi_encode_step (avctx=avctx@entry=0x16e5060, target=target@entry=0x1745500) at src/libavcodec/vaapi_encode.c:587
#6  0x00007f401e202fe7 in ff_vaapi_encode2 (avctx=0x16e5060, pkt=0x190fa60, input_image=<optimized out>, got_packet=0x7ffc264290d4) at src/libavcodec/vaapi_encode.c:867
#7  0x00007f401e1f6d15 in avcodec_encode_video2 (avctx=avctx@entry=0x16e5060, avpkt=0x190fa60, frame=frame@entry=0x19ad200, got_packet_ptr=got_packet_ptr@entry=0x7ffc264290d4) at src/libavcodec/utils.c:1994
#8  0x00007f401e1f6fda in do_encode (avctx=0x16e5060, frame=0x19ad200, got_packet=0x7ffc264290d4) at src/libavcodec/utils.c:2939
#9  0x00007f401e1fc167 in avcodec_send_frame (avctx=avctx@entry=0x16e5060, frame=0x19ad200) at src/libavcodec/utils.c:2988
#10 0x0000000000420fdd in do_video_out (of=of@entry=0x1745be0, ost=ost@entry=0x16e4e00, next_picture=next_picture@entry=0x19ad200, sync_ipts=<optimized out>, sync_ipts@entry=-7.62939453125e-06) at src/ffmpeg.c:1251
#11 0x000000000042266f in reap_filters (flush=flush@entry=0) at src/ffmpeg.c:1451
#12 0x0000000000409316 in transcode_step () at src/ffmpeg.c:4343
#13 transcode () at src/ffmpeg.c:4387
#14 main (argc=<optimized out>, argv=<optimized out>) at src/ffmpeg.c:4592

Testing patches is no problem.
Comment 1 Martin Bednar 2016-12-08 14:54:56 UTC
Also dmesg |grep VCE: 

[drm] Found VCE firmware/feedback version 50.0.1 / 17!
[drm] VCE initialized successfully.
Comment 2 Martin Bednar 2017-05-09 14:15:22 UTC
After fixing https://bugs.freedesktop.org/show_bug.cgi?id=100972 , 
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i Elephants_Dream_HD.avi -vf format=yuv420p,hwupload -threads 2 -acodec copy  -vaapi_device /dev/dri/renderD128  -bf 0 -c:v h264_vaapi -c:v h264_vaapi -profile:v 77 ~/test.mkv

seems to work (ffmpeg from git master).
Performance is atrocious, though: I get an encoding rate of only about 2 FPS.
Comment 3 Andy Furniss 2017-05-09 15:40:56 UTC
Hmm, I haven't tested yet, but if you hit the division by zero, I guess there is something special about that file (or ffmpeg/something changed since I last looked). Before reading your other bug, the only way I thought would trigger that one was to explicitly add -g 0 to the command line.

On speed - I would get rid of the yuv420p, maybe add -g 48 as for some reason things don't seem quite right with the gop, or you shouldn't hit the division by zero.

Your fix seems to change the intent of that code a little bit - I don't know if it makes any difference. That code got added to correct some corner case rate control cbr issue - ffmpeg switched to using vbr by default so may not show it anyway. I don't recall what rate (maybe it's cqp) you'll get if you don't ask for anything like your example. There may be a better place to check if gop is 0.
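
To illustrate the kind of zero check discussed here, a minimal C sketch follows. The helper name and the round-up-to-a-GOP-multiple use case are assumptions made for illustration; this is not the actual Mesa code, nor the fix from bug 100972.

```c
/* Hypothetical helper, not taken from Mesa: round `value` up to the next
 * multiple of `gop_size`. A GOP size of 0 (e.g. from "-g 0") is treated
 * as "no rounding" so that the division below can never divide by zero. */
static unsigned round_up_to_gop(unsigned value, unsigned gop_size)
{
    if (gop_size == 0)
        return value;               /* guard: skip the division entirely */

    return ((value + gop_size - 1) / gop_size) * gop_size;
}
```

With this sketch, a GOP size of 48 rounds 1024 up to 1056, while a GOP size of 0 leaves the value unchanged instead of crashing. Whether "leave unchanged" is the right fallback for rate control is a separate design question.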

-profile:v 77 doesn't quite work properly IIRC (well it doesn't change anything encoding wise but the file will not show as main).

You may get more perf if you force your GPU to high.

I guess this is just a test, but for real-world use, using your h/w to transcode without b-frames is not really an optimum solution, and it's up to the firmware team whether b-frames ever work (they don't work on Windows either, AFAICT).
Comment 4 Martin Bednar 2017-05-11 13:40:42 UTC
The file is the OSS Elephant's dream movie : https://orange.blender.org/
Purposely tested with this for easy sharing.
Contained streams (ffmpeg -i ): 
    Stream #0:0: Video: msmpeg4v2 (MP42 / 0x3234504D), yuv420p, 1920x1080, 10002 kb/s, 24 fps, 24 tbr, 24 tbn, 24 tbc
    Stream #0:1: Audio: ac3 ([0] [0][0] / 0x2000), 48000 Hz, 5.1(side), fltp, 448 kb/s

-profile:v 77  : I tested values and in the end found one in ffmpeg sources. Are these values documented anywhere?

B-Frames : so basically VCE is useless in its B-Frame-less state?
I was hoping to create a tvheadend streaming server with live hw-accelerated transcoding, is this at all possible with AMD VCE cards?
Comment 5 Alex Deucher 2017-05-11 14:03:41 UTC
I'm not too familiar with VAAPI, but for transcoding, you really need efficient pipelining between the decode and the encode.  If there are CPU copies in the middle, performance won't be great.  We generally recommend using gstreamer with OpenMax using tunneling so that there are no extra copies between the decode and encode stages of the transcode.  I'm not sure if VAAPI supports something like this.
Comment 6 Andy Furniss 2017-05-11 16:32:17 UTC
(In reply to Martin Bednar from comment #4)
> The file is the OSS Elephant's dream movie : https://orange.blender.org/
> Purposely tested with this for easy sharing.
> Contained streams (ffmpeg -i ): 
>     Stream #0:0: Video: msmpeg4v2 (MP42 / 0x3234504D), yuv420p, 1920x1080,
> 10002 kb/s, 24 fps, 24 tbr, 24 tbn, 24 tbc
>     Stream #0:1: Audio: ac3 ([0] [0][0] / 0x2000), 48000 Hz, 5.1(side),
> fltp, 448 kb/s

I think ffmpeg will silently fall back to s/w decode with this file as it's not normal h264.

If you don't specify a bitrate it seems ffmpeg will use cqp = 20 (which will come out quite high bitrate on some content).

> -profile:v 77  : I tested values and in the end found one in ffmpeg sources.
> Are these values documented anywhere?

Not sure about ffmpeg, but they are standard numbers in the world of h264.
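
For reference, these numbers are the profile_idc values from the H.264 specification: 66 is Baseline, 77 is Main, 88 is Extended and 100 is High. A minimal C sketch of a name-to-number lookup follows; the enum and function names are hypothetical and are not ffmpeg or Mesa identifiers.

```c
#include <stddef.h>
#include <string.h>

/* profile_idc values as defined by the H.264 specification. */
enum h264_profile_idc {
    H264_PROFILE_BASELINE = 66,
    H264_PROFILE_MAIN     = 77,
    H264_PROFILE_EXTENDED = 88,
    H264_PROFILE_HIGH     = 100
};

/* Hypothetical helper: return the profile_idc for a lowercase profile
 * name, or -1 if the name is not in the table. */
static int h264_profile_idc_from_name(const char *name)
{
    static const struct { const char *name; int idc; } table[] = {
        { "baseline", H264_PROFILE_BASELINE },
        { "main",     H264_PROFILE_MAIN },
        { "extended", H264_PROFILE_EXTENDED },
        { "high",     H264_PROFILE_HIGH },
    };

    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(name, table[i].name) == 0)
            return table[i].idc;
    }
    return -1;
}
```

Under this mapping, the 77 passed to -profile:v in the command above corresponds to Main.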

> B-Frames : so basically VCE is useless in its B-Frame-less state?

I wouldn't go that far: for realtime encoding it is useful, and libx264 with realtime settings wouldn't use b-frames either (well, that depends on how much CPU you have available in practice).
For example, my card can do 2160p60 realtime in an artificial test, though in practice for, say, game/screen recording I would need hardware CSC, which Windows may have but Linux doesn't.

> I was hoping to create a tvheadend streaming server with live hw-accelerated
> transcoding, is this at all possible with AMD VCE cards?

If the input is progressive h.264 maybe - VCE doesn't encode interlaced which could be an issue depending on what your local broadcasters use.

TV tends to be quite low bitrate anyway - if you are not reducing size then re-encoding may not be the best way to go.

gstreamer can, for my dual instance VCE card, be faster than ffmpeg; on your APU I don't know whether it would be.

gstreamer can use vaapi (I don't think a current mesa regression affects transcoding). OMX can be used, but it's cqp only; with vaapi you can target bitrates.

