[VDPAU] [PATCH 2/2] [PATCH] Enable lossless video playback via VDPAU

Fri Oct 31 11:11:38 PDT 2014

Hello,

On 2014/10/29, 1:32 , "Rémi Denis-Courmont" <remi at remlab.net> wrote:

>Le 2014-10-29 06:56, ksreenivasan at nvidia.com a écrit :
>> From: Karthikeyan Sreenivasan <ksreenivasan at nvidia.com>
>>
>> Exposes the hardware flags via VDPAU to enable lossess video playback
>
>Nit: "lossless"
>
>> for x264 and JM encoded bit streams.
>>
>> This revision adds a new struct VdpPictureInfoH264Extended which
>> contains a struct_version and the transform_bypass which can take
>> the follwing values,
>
>Nit: "following"
>
>> 0 - No lossless
>> 1 - x264 encoded streams
>> 2 - JM encoded streams

Why are there different values for these stream encoders? It would be good
to document these.

We discussed this internally some. The NVIDIA hardware itself contains a
workaround for an x264 bug around intra8x8 modes while encoding. This was
recently fixed upstream in x264. For some background discussion, please
see:

https://mailman.videolan.org/pipermail/x264-devel/2013-June/010152.html

There are two bits that a decoder needs to set with NVIDIA hardware:
- whether or not lossless decoding is enabled, and
- whether or not we need a workaround for old x264's broken handling of
intra8x8 modes.

I am not certain if other VDPAU implementations would require a special
case such as this.

>> The transform_bypass field will be used to populate lossless fields
>> when passing values to NVVA based on the type of lossless encoding
>> and then passed on to hardware.
>>
>> Added H264_HIGH_444PP profile which was not present before.
>
>Since all defined VDPAU video surface chroma types so far have 8-bits
>depth, I assume this only supports 8-bits... Probably that should be
>documented.

Perhaps. The VDPAU API could conceivably provide enumerations for 10-bit,
12-bit, etc. decoding; it is up to an implementation to support them. My
understanding is that lossless encoding is a only defined in the H.264
High 444PP profile, which also defines higher bit depth encodings.

>>This revision to vdpau.h would require an API change.
>> ---
>>  include/vdpau/vdpau.h |   29 ++++++++++++++++++++++++-----
>>  1 file changed, 24 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/vdpau/vdpau.h b/include/vdpau/vdpau.h
>> index 22497e2..4d43bb0 100644
>> --- a/include/vdpau/vdpau.h
>> +++ b/include/vdpau/vdpau.h
>> @@ -2437,6 +2437,8 @@ typedef uint32_t VdpDecoderProfile;
>>  #define VDP_DECODER_PROFILE_DIVX5_HOME_THEATER
>> (VdpDecoderProfile)20
>>  /** \hideinitializer */
>>  #define VDP_DECODER_PROFILE_DIVX5_HD_1080P
>> (VdpDecoderProfile)21
>> +/** \hideinitializer */
>> +#define VDP_DECODER_PROFILE_H264_HIGH_444PP
>> (VdpDecoderProfile)22
>>
>>  /** \hideinitializer */
>>  #define VDP_DECODER_LEVEL_MPEG1_NA 0
>> @@ -2813,11 +2815,7 @@ typedef struct {
>>      uint8_t  deblocking_filter_control_present_flag;
>>      /** Copy of the H.264 bitstream field. */
>>      uint8_t  redundant_pic_cnt_present_flag;
>> -    /** Copy of the H.264 bitstream field. */
>> -    uint8_t lossless_ipred8x8_filter_enable;
>> -    /** Copy of the H.264 bitstream field. */
>> -    uint8_t qpprime_y_zero_transform_bypass_flag;
>> -
>> +
>
>Removing members of a public structure breaks the binary compatibility.
>I think that's a bad idea.

Is this perhaps a reverse patch by accident? Or perhaps this reflects a
local working copy? Fields such as "lossless_ipred8x8_filter_enable" are
not present in what is the most recent version of vdpau.h as of this
writing:

http://cgit.freedesktop.org/~aplattner/libvdpau/tree/include/vdpau/vdpau.h

>I'd leave the members as they are and note that they're not actually
>used (or that they must be zero, FWIW).
>
>>      /** Copy of the H.264 bitstream field, converted to raster
>> order. */
>>      uint8_t scaling_lists_4x4[6][16];
>>      /** Copy of the H.264 bitstream field, converted to raster
>> order. */
>> @@ -2827,6 +2825,27 @@ typedef struct {
>>      VdpReferenceFrameH264 referenceFrames[16];
>>  } VdpPictureInfoH264;
>>
>> +#define VDP_H264_EXTENDED_VERSION 0
>> +
>> +/**
>> + * \brief Extended picture parameter information for an H.264
>> picture.
>> + *
>> + * Note: Contains field for lossless decode.
>> + */
>> +typedef struct {
>> +    /**
>> +     * This field must be filled with VDP_H264_EXTENDED_VERSION
>> +     */
>> +    uint32_t struct_version;
>> +    /** Performs lossless decode
>> +     * 0: no lossless decode
>> +     * 1: x264 lossless decode
>> +     * 2: JM lossless decode
>> +     * >2: unsupported lossless decode stream
>> +     */
>> +    uint8_t transform_bypass;
>> +} VdpPictureInfoH264Extended;
>> +
>
>It's not clear where/when that structure should be used by application.
>I think the documentation should probably explain that.

I believe this is a workaround for the fact that VdpPictureInfoH264 is not
versioned. Ideally we would add this new transform_bypass structure to the
end of VdpPictureInfoH264, were it versioned.

There are a number of workable solutions, and I am curious to know what
others on this list believe would be a good solution. My proposals include:

- break backwards compatibility by making VdpPictureInfoH264 (and all
other picture info structures) versioned, then adding this new field to
the newly versioned structure;

- create a new versioned structure (in the same way that e.g. VdpProcamp
is versioned) to be used in addition to VdpPictureInfoH264 for supplying
data to a VdpDecoder for rendering (this is what this patch proposes, I
think) - this would also include changing the VdpDecoderRender interface;

- create a new versioned structure (similar to the immediately previous
proposal) that contains an old VdpPictureInfoH264 /and/ new attributes
such as transform_bypass, to be used exclusively with new decoder profiles
such as the aforementioned VDP_DECODER_PROFILE_H264_HIGH_444PP;

- add new attributes such as transform_bypass to the end of
VdpPictureInfoH264, and use a backwards-compatible hack to make
VdpPictureInfoH264 a versioned structure. For example: the first field of
all actually versioned structures in the VDPAU API is struct_version, a
uint32_t. The first field of VdpPictureInfoH264 is slice_count, a
uint32_t. Empirically it does not make sense to have more slices in a
picture than there are pixels (samples) in that same picture. Could we
perhaps place a restriction on the size of slice_count, and use some high
bit(s) of slice_count to encode a version number that an implementation
should know to check, and a decoder application should know to set? Again,
this is a hack.

I'm curious to know, which of the proposals (regarding both versioning,
and the intra8x8 workaround) are most favorable to a VDPAU implementation
implementor? To a decoder application maintainer?

Thanks,
--José

>>  /**
>>   * \brief Picture parameter information for a VC1 picture.
>>   *
>
>-- 
>Rémi Denis-Courmont

--
nvpublic