[Libva] How to detect the type of memory returned...

Gwenole Beauchesne gb.devel at gmail.com
Tue Jun 17 05:44:44 PDT 2014


2014-06-17 13:12 GMT+02:00 Peter Frühberger <peter.fruehberger at gmail.com>:
> Hi,
>
> 2014-06-17 13:10 GMT+02:00 Jean-Yves Avenard <jyavenard at gmail.com>:
>> On 17 June 2014 19:51, Gwenole Beauchesne <gb.devel at gmail.com> wrote:
>>
>>> Note: I am not in the business of doing things "good enough", I want
>>> 100% correctness. :)
>>
>> That's good to know!
>>
>>>
>>> So, for same-size transfers (uploads / downloads) with the same
>>> sub-sampling requirements (i.e. YUV 4:2:0), there shall be a way to
>>> produce and guarantee that what we get on the other side exactly
>>> matches the source. Otherwise, you are risking propagation of errors
>>> for subsequent frames (postprocessing in terms of downloads, quality
>>> of encoding in terms of uploads), thus reducing the overall quality.
>>
>> Well, that whole business of vaGetImage/vaDeriveImage in VLC is only
>> used to display the frame. The decoded frame isn't used as a reference
>> frame, etc., obviously (that's all done within libva).
>>
>> In VLC, using vaGetImage + YV12->YV12, I get 11% CPU usage to play a
>> 1080/24p H.264 video.
>> With vaDeriveImage + NV12->YV12, that jumps to 17%.
>>
>> For MythTV, I only use that method for PiP. For the main playback
>> routine, we use OpenGL and vaCopySurfaceGLX to draw directly into the
>> OpenGL surface.
>> For PiP, obviously, I don't really care how accurate the conversion
>> from NV12->YV12 is...
>
> Does that make it faster:
> https://github.com/FernetMenta/xbmc/commit/ffc92539204d6c2044158fbc05d9292ff839f429
> ?

On any platform that natively supports output to an X pixmap, this
will be an obvious gain. In the distant past, I measured up to a +30%
improvement on a very old Poulsbo, i.e. smoother rendering. In
reality, I only needed VA/GLX for XvBA. Since the OSS drivers have
matured a lot, I don't see the point in sticking to proprietary
drivers. Anyway, someone from the VLC team maintains the older
xvba-driver nowadays, though.

If we focus on Intel hardware, in GLX, using vaPutSurface() + TFP is
not only the optimal path (for GLX, again) but also the most correct
one: the VA intel-driver does not implement the VA/GLX API, so it
falls back on the libva default implementation, which may underperform
and, moreover, lacks certain features such as the ability to downscale
along the way. The latter use case would be a win, for example, when
decoding a large stream but displaying it smaller (e.g. 1080p decode,
720p render).
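
For reference, here is a rough, untested sketch of that GLX path,
assuming an X11 display, a GLXFBConfig selected with
GLX_BIND_TO_TEXTURE_RGB_EXT, support for GLX_EXT_texture_from_pixmap,
and an already decoded VASurfaceID; the helper name and the fixed
1080p-to-720p sizes are only illustrative:

/* Sketch only: render a decoded VA surface into an X pixmap with
 * vaPutSurface(), then bind that pixmap as a GL texture through
 * GLX_EXT_texture_from_pixmap. FBConfig selection, error checking and
 * the decode loop are omitted; va_dpy, surface and fbconfig are
 * assumed to be set up by the caller. */
#include <va/va.h>
#include <va/va_x11.h>
#include <X11/Xlib.h>
#include <GL/glx.h>
#include <GL/glxext.h>

static GLXPixmap
put_surface_as_texture(Display *x_dpy, VADisplay va_dpy,
                       VASurfaceID surface, GLXFBConfig fbconfig)
{
    /* Render target smaller than the decoded size: vaPutSurface()
     * scales on the way out (1080p decode, 720p render). */
    const unsigned int src_w = 1920, src_h = 1080;
    const unsigned int dst_w = 1280, dst_h = 720;

    /* Depth must match the chosen FBConfig. */
    Pixmap pixmap = XCreatePixmap(x_dpy, DefaultRootWindow(x_dpy),
                                  dst_w, dst_h, 24);

    /* The driver color-converts and downscales into the pixmap. */
    vaPutSurface(va_dpy, surface, pixmap,
                 0, 0, src_w, src_h,   /* source rectangle */
                 0, 0, dst_w, dst_h,   /* destination rectangle */
                 NULL, 0, VA_FRAME_PICTURE);
    vaSyncSurface(va_dpy, surface);

    /* Texture-from-pixmap: wrap the X pixmap and bind it to the
     * currently bound GL texture. */
    const int attribs[] = {
        GLX_TEXTURE_TARGET_EXT, GLX_TEXTURE_2D_EXT,
        GLX_TEXTURE_FORMAT_EXT, GLX_TEXTURE_FORMAT_RGB_EXT,
        None
    };
    GLXPixmap glx_pixmap =
        glXCreatePixmap(x_dpy, fbconfig, pixmap, attribs);

    PFNGLXBINDTEXIMAGEEXTPROC bind_tex_image =
        (PFNGLXBINDTEXIMAGEEXTPROC)
        glXGetProcAddress((const GLubyte *)"glXBindTexImageEXT");
    bind_tex_image(x_dpy, glx_pixmap, GLX_FRONT_LEFT_EXT, NULL);

    return glx_pixmap;
}

The caller is expected to have a GL texture bound beforehand
(glBindTexture) and to call glXReleaseTexImageEXT before the next
vaPutSurface() into the same pixmap.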

The future lies with EGL. I don't see any more spec changes or
additions being made to the older GLX APIs. I am not talking at the VA
level here, I am talking about the OpenGL standards.
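
How the decoded surface gets exported is out of scope for this mail,
but assuming an NV12 surface is already available as a dma-buf file
descriptor (with known offsets and pitches), the EGL-side import would
look roughly like the untested sketch below, based on
EGL_EXT_image_dma_buf_import and OES_EGL_image_external; all parameter
names are placeholders:

/* Sketch only: wrap an externally obtained NV12 dma-buf as an EGLImage
 * and bind it to a GL_TEXTURE_EXTERNAL_OES texture. How the fd,
 * offsets and pitches come out of the decoder is deliberately left
 * out. A real application would keep the EGLImageKHR around and
 * destroy it later with eglDestroyImageKHR. */
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#include <drm_fourcc.h>

static EGLImageKHR
import_nv12_dmabuf(EGLDisplay egl_dpy, int fd, int width, int height,
                   int y_offset, int y_pitch,
                   int uv_offset, int uv_pitch, GLuint texture)
{
    const EGLint attribs[] = {
        EGL_WIDTH,                     width,
        EGL_HEIGHT,                    height,
        EGL_LINUX_DRM_FOURCC_EXT,      DRM_FORMAT_NV12,
        EGL_DMA_BUF_PLANE0_FD_EXT,     fd,        /* Y plane */
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, y_offset,
        EGL_DMA_BUF_PLANE0_PITCH_EXT,  y_pitch,
        EGL_DMA_BUF_PLANE1_FD_EXT,     fd,        /* UV plane */
        EGL_DMA_BUF_PLANE1_OFFSET_EXT, uv_offset,
        EGL_DMA_BUF_PLANE1_PITCH_EXT,  uv_pitch,
        EGL_NONE
    };

    PFNEGLCREATEIMAGEKHRPROC create_image =
        (PFNEGLCREATEIMAGEKHRPROC)
        eglGetProcAddress("eglCreateImageKHR");
    PFNGLEGLIMAGETARGETTEXTURE2DOESPROC image_target_texture =
        (PFNGLEGLIMAGETARGETTEXTURE2DOESPROC)
        eglGetProcAddress("glEGLImageTargetTexture2DOES");

    /* For EGL_LINUX_DMA_BUF_EXT, the context must be EGL_NO_CONTEXT
     * and the client buffer NULL. */
    EGLImageKHR image = create_image(egl_dpy, EGL_NO_CONTEXT,
                                     EGL_LINUX_DMA_BUF_EXT, NULL,
                                     attribs);

    glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture);
    image_target_texture(GL_TEXTURE_EXTERNAL_OES, image);
    return image;
}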

Regards,
Gwenole.

