Bug 91281 - Tonga VCE 2160p encode fails with BO to small for addr
Summary: Tonga VCE 2160p encode fails with BO to small for addr
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2015-07-09 10:38 UTC by Andy Furniss
Modified: 2016-12-06 09:57 UTC

Description Andy Furniss 2015-07-09 10:38:48 UTC
I don't know what the VCE is supposed to do on this R9 285 card, but testing with 2160p gives:

amdgpu: The CS has been rejected, see dmesg for more information.

[drm:amdgpu_vce_cs_reloc [amdgpu]] *ERROR* BO to small for addr 0x0002450000 14 13

The same test works with 1080p (a bit slow, but I haven't tested/understood the PM situation yet).
Comment 1 Andy Furniss 2016-07-29 23:08:37 UTC
Now that vaapi is enabled the situation with this is more mixed.

omx still just always fails.

ffmpeg vaapi always works.

gstreamer vaapi fails or works depending on framerate, which is a bit confusing.

For 2160p as long as I say framerate <=30 it works. For 4096x2160 it has to be <=26.

Adding size to the kernel error message shows that the failing size is 2x the raw frame size:

[drm:amdgpu_vce_cs_reloc [amdgpu]] *ERROR* BO to small for addr 0x0108fde000 14 13 24883200

Adding debugging to vlVaCreateBuffer doesn't show any difference between working and failing cases - the size requested being one raw frame.
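
The "2x raw framesize" figure can be checked by hand. The sketch below assumes an 8-bit 4:2:0 layout such as NV12 (1.5 bytes per pixel); the report does not name the pixel format, and `raw_frame_bytes` is a hypothetical helper, not a Mesa or kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one raw 8-bit 4:2:0 frame (NV12 assumed): a full-size luma
 * plane plus chroma at quarter resolution, i.e. 1.5 bytes per pixel.
 * Hypothetical helper for illustration only. */
static uint64_t raw_frame_bytes(uint32_t width, uint32_t height)
{
    /* 4:2:0 subsampling needs even dimensions. */
    assert(width % 2 == 0 && height % 2 == 0);
    return (uint64_t)width * height * 3 / 2;
}
```

Under that assumption one 3840x2160 frame is 12441600 bytes, and twice that is the 24883200 reported in the error message.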
Comment 2 Andy Furniss 2016-08-01 10:19:14 UTC
More debugging shows that size 24883200 works fine in the working case; it's

mapping->it.last

that varies between working and failing.

                    size     mapping->it.last

0x0108fde000 14 13 24883200 1101026 = OK

0x0108fde000 14 13 24883200 1088786 = fail
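
The two rows above are consistent with a simple bounds check. This is a sketch of that check, not the kernel code: it assumes 4 KiB GPU pages, that `mapping->it.last` is the index of the last page covered by the mapping (printed in decimal), and that the test is "addr + size must not run past the end of that page". `reloc_fits` and `GPU_PAGE_SIZE` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GPU_PAGE_SIZE 4096ULL /* assumed 4 KiB GPU page */

/* True when [addr, addr + size) ends at or before the end of the last
 * page covered by the mapping. mapping_last is a page index. */
static bool reloc_fits(uint64_t addr, uint64_t size, uint64_t mapping_last)
{
    assert(size > 0);
    return addr + size <= (mapping_last + 1) * GPU_PAGE_SIZE;
}
```

With addr 0x0108fde000 and size 24883200, the range ends at page 1091481, which lies below 1101026 (OK) but above 1088786 (fail).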
Comment 3 Andy Furniss 2016-08-01 23:49:38 UTC
I know what causes this now.

It's the get_cpb_num case statement in radeon_vce.c.

It only handles up to 51 and defaults to 42.

ffmpeg lucked into working because it passes level 51 when it should really be 52.

gstreamer vaapi passes the correct level, so only > 2160p30 fails, as 52 gets the default.

gstreamer omx always fails because it seems to max out at 42.
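
To see why falling through to the 42 default matters, here is a simplified sketch, not the actual radeon_vce.c code. It assumes the function divides a per-level MaxDpbMbs limit (values taken from the H.264 level table) by the frame size in 16x16 macroblocks and caps the result at 16; only three levels are shown, and `max_dpb_mbs` and `cpb_num_sketch` are hypothetical names.

```c
#include <assert.h>

/* MaxDpbMbs for a few H.264 levels. Anything not listed, including
 * 52, falls through to the level 4.2 value, mirroring the behaviour
 * described above. */
static unsigned max_dpb_mbs(unsigned level)
{
    switch (level) {
    case 50: return 110400;
    case 51: return 184320;
    case 42:
    default: return 34816;
    }
}

/* Number of frame-sized buffers that fit the level's DPB limit. */
static unsigned cpb_num_sketch(unsigned width, unsigned height, unsigned level)
{
    unsigned w = (width + 15) / 16;  /* width in macroblocks */
    unsigned h = (height + 15) / 16; /* height in macroblocks */
    unsigned n;

    assert(w * h > 0);
    n = max_dpb_mbs(level) / (w * h);
    return n < 16 ? n : 16;
}
```

Under these assumptions a 3840x2160 stream gets 5 buffers at level 51 but only 1 at level 52 or 42, which would fit the pattern of ffmpeg working and the other two failing.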
Comment 4 Christian König 2016-08-03 12:24:32 UTC
Nice catch. Brave enough to provide a patch or should Leo and I take a look?

I'm just back from vacation and so busy that I probably won't get to it before the end of the month.
Comment 5 Andy Furniss 2016-08-03 23:13:53 UTC
I can do a patch, but I'll need time to think/look/test more.

First thought: easy, just add 52 and maybe make the default vary with chip type as done elsewhere. That will fix gst-vaapi, but it's not going to help OMX.

From initial testing/looking, it seems gst-omx doesn't go up to 5.2, plus gstreamer (or me + gstreamer) doesn't seem able to pass the level anyway, so it picks it up from the state tracker default, 4.2. Changing this to 5.1 gets me 2160p encoding, but I guess I can't do anything platform specific in there.

Maybe, if fps is available in addition to width/height in radeon_vce.c, ignore whatever level the player sends (it may be wrong or not set anyway) and work it out instead?

Of course I don't currently know if this is possible, what implications there may be and what to do about different hardware capabilities.
Comment 6 Andy Furniss 2016-08-07 23:32:46 UTC
Maybe if enc->base.level is unreliable I shouldn't trust enc->base.max_references either, but it seems that the h/w gets set up using it:

radeon_vce_40_2_2.c:
RVCE_CS(MAX2(enc->base.max_references, 1) - 1); // encBPicPattern
RVCE_CS(MIN2(enc->base.max_references, 2)); // encNumberOfReferenceFrames
RVCE_CS(enc->base.max_references + 1); // encMaxNumRefFrames

radeon_vce_52.c:
enc->enc_pic.pc.enc_b_pic_pattern = MAX2(enc->base.max_references, 1) - 1;
enc->enc_pic.pc.enc_number_of_reference_frames = MIN2(enc->base.max_references, 2);
enc->enc_pic.pc.enc_max_num_ref_frames = enc->base.max_references + 1;

So unless I'm missing something, the whole function body of get_cpb_num could be replaced with just:

return MIN2(enc->base.max_references + 1, 16);

In testing it works with level 5.2 and doesn't need the level given by the encoder/omx state tracker to be correct. It also saves allocating buffers that aren't really used.
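
As a standalone illustration of that proposal, the sketch below takes max_references as a plain argument instead of reading enc->base, and defines MIN2 locally. `cpb_num_from_refs` is a hypothetical name; this is not the patch that was eventually applied.

```c
#include <assert.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Buffer count derived from the reference count alone: one slot per
 * reference frame plus one for the current frame, capped at 16. */
static unsigned cpb_num_from_refs(unsigned max_references)
{
    /* Guard against wrap-around on the + 1 below. */
    assert(max_references < 0xffffffffu);
    return MIN2(max_references + 1, 16);
}
```

With a single reference frame this yields 2 buffers regardless of the level the player passes, and large reference counts are clamped to 16.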