[Libva] How to detect the type of memory returned...
Jean-Yves Avenard
jyavenard at gmail.com
Sun Jun 22 16:03:49 PDT 2014
On 22 June 2014 20:16, Peter Frühberger <peter.fruehberger at gmail.com> wrote:
> Hi
>
> 2014-06-22 11:08 GMT+02:00 Jean-Yves Avenard <jyavenard at gmail.com>:
>> On 22 June 2014 19:06, Jean-Yves Avenard <jyavenard at gmail.com> wrote:
> I tested the SSE4 copy algorithm vs the OpenGL approach we discussed
> lately. In my testing I used a 1080p24 sample with H.264 High profile,
> Level 4.1. The average copy time with SSE4 was around 4 ms. I benchmarked
> similarly to your tests; see the patch here:
> http://paste.ubuntu.com/7684464/
>
> On the other hand I benchmarked the OpenGL approach. This approach
> won by more than a factor of 5, with around 0.8 ms per frame.
You are not testing what I intended to test.
Here you are testing an NV12 frame with vaDeriveImage.
What I intended to show was that, via vaGetImage, not using USWC
memory is *much* faster, and that speed-wise you are much better off
using vaGetImage instead of vaDeriveImage. Obviously that advantage
would shrink a lot if Haihao's patch is applied.
Whatever speed gain you are noticing with vaDeriveImage SSE vs OpenGL
would be even greater had the memory not been USWC.
I should point out that AMD's VAAPI doesn't support vaDeriveImage, so you
must implement both methods regardless.
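
To make the "implement both methods" point concrete, here is a minimal
sketch of the selection logic only. Everything named here (readback_fn,
read_surface, the two stubs) is hypothetical and is not libva API. In a
real player the two hooks would wrap vaDeriveImage and
vaCreateImage + vaGetImage, each followed by vaMapBuffer; the stubs just
simulate a driver that refuses one path so the control flow can be
compiled without a driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical readback hook: copies one frame into dst and returns
 * true on success, false if the driver does not support this path. */
typedef bool (*readback_fn)(void *ctx, unsigned char *dst, size_t size);

typedef enum {
    READBACK_FAILED,
    READBACK_PREFERRED,
    READBACK_FALLBACK
} readback_result;

/* Try the preferred path first and use the other one only if the
 * preferred path fails. The caller decides which path is preferred. */
static readback_result read_surface(readback_fn preferred, readback_fn fallback,
                                    void *ctx, unsigned char *dst, size_t size)
{
    assert(dst != NULL);
    if (preferred && preferred(ctx, dst, size))
        return READBACK_PREFERRED;
    if (fallback && fallback(ctx, dst, size))
        return READBACK_FALLBACK;
    return READBACK_FAILED;
}

/* Stub simulating a driver that does not implement a given path. */
static bool stub_unsupported(void *ctx, unsigned char *dst, size_t size)
{
    (void)ctx; (void)dst; (void)size;
    return false;
}

/* Stub simulating a working path: fills dst with the byte ctx points to. */
static bool stub_fill(void *ctx, unsigned char *dst, size_t size)
{
    memset(dst, *(const unsigned char *)ctx, size);
    return true;
}
```

Calling read_surface(stub_unsupported, stub_fill, ...) models a driver
without the preferred path: the call falls through to the second hook and
reports READBACK_FALLBACK.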
>
> Note: I also measured vaSyncSurface, as you did the same, but it has
> nothing to do with the time the copy needs, though querying whether the
> surface is no longer in use is not really doable for all VAAPI
> implementations.
Seeing that the timing for both instances is for exactly the same
instructions, I'm not sure how that would help prove anything.
I would still see vaGetImage with normal memory being much faster than
vaGetImage with USWC memory.
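
For anyone wanting to time the copy step in isolation, a rough harness
could look like the sketch below. The helper names are made up for
illustration, the NV12 size assumes no pitch padding, and the buffers are
whatever the caller passes in. With ordinary malloc'd memory this only
measures the cached case; the USWC numbers can only be obtained by
pointing src at a mapped VA image.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Bytes in an unpadded NV12 frame: a full-resolution Y plane plus one
 * interleaved UV plane subsampled by two in each dimension. */
static size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height + 2 * ((width + 1) / 2) * ((height + 1) / 2);
}

/* Average wall-clock time in milliseconds of one memcpy of `size`
 * bytes, measured over `iterations` back-to-back copies. */
static double average_copy_ms(unsigned char *dst, const unsigned char *src,
                              size_t size, int iterations)
{
    struct timespec start, end;
    volatile unsigned char sink = 0;

    assert(iterations > 0 && size > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        memcpy(dst, src, size);
        sink ^= dst[size - 1]; /* read back so the copies are not optimized away */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)sink;

    double elapsed_ms = (double)(end.tv_sec - start.tv_sec) * 1e3
                      + (double)(end.tv_nsec - start.tv_nsec) / 1e6;
    return elapsed_ms / iterations;
}
```

A 1920x1080 frame comes out at 3110400 bytes with this formula; drivers
usually pad the pitch and height, so use the pitches and offsets from the
VAImage when copying a real surface.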