[VDPAU] [PATCH] Extend the VDPAU API to support H.265/HEVC Decoding

José Hiram Soltren jsoltren at nvidia.com
Thu Jan 29 13:30:21 PST 2015


I will send a follow-on patch for review that addresses the items below. Please
see, in particular, my inline responses to Aaron's points.

Aaron, in offline conversation you raised two good points.

1. The ordering of fields here should agree with the ordering of fields as
presented in the VPS, PPS and SPS descriptions in Clause 7 of the Specification.

2. You were worried that the API here may not be sufficiently generic for
non-NVIDIA hardware implementations. I maintain that this API provides enough
data for any implementation of the decoding process specified in clauses 8.4
through 8.7 to proceed.

Please read on below.

On 01/29/2015 02:42 PM, Aaron Plattner wrote:
> On 01/26/2015 10:47 AM, José Hiram Soltren wrote:
>> This patch adds an API for player applications to utilize VDPAU for
>> hardware-accelerated playback of H.265/HEVC streams.
>>
>> The goals of this API are:
>> - enable hardware accelerated decoding of H.265/HEVC content under VDPAU;
>> - provide a reference implementation for H.265/HEVC hardware decoding that
>>    is vendor agnostic;
>> - provide enough data for H.265/HEVC hardware acceleration implementations
>>    from multiple vendors to be able to use the same API;
>>
>> This patch is written against "version one" of the H.265/HEVC Specification,
>> Rec. ITU-T H.265 (04/2013), available at:
>>
>>      http://www.itu.int/rec/T-REC-H.265-201304-I
> 
> This link seems to be broken.  In an offline conversation, you suggested this
> instead:
> 
>   http://www.itu.int/ITU-T/recommendations/rec.aspx?rec=11885
> 
> That seems iffy from a http://www.w3.org/Provider/Style/URI standpoint, but
> meh.  I'll substitute that URL for the one in the patch if nothing else needs
> to change.

Yes. In that case http://handle.itu.int/11.1002/1000/12296 is likely a better
choice. This is the "Citation" URI offered by the ITU.

>> A future patch against this header may address bug fixes, and may support
>> the new features described in "version two" of the H.265/HEVC Specification,
>> Rec. ITU-T H.265 v2 (10/2014).
>>
>> Note that the API does need to be self documenting with Doxygen markup,
>> which we (NVIDIA) will generate and post as an update to our public VDPAU
>> documentation.
>>
>> This is version four of this patch, clarifying some documentation
>> related to H.265/HEVC support.
>> ---
>>   include/vdpau/vdpau.h |  308 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 307 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/vdpau/vdpau.h b/include/vdpau/vdpau.h
>> index b386825..ea9d927 100644
>> --- a/include/vdpau/vdpau.h
>> +++ b/include/vdpau/vdpau.h
>> @@ -6,7 +6,7 @@
>>   /*
>>    * This copyright notice applies to this header file:
>>    *
>> - * Copyright (c) 2008-2010 NVIDIA Corporation
>> + * Copyright (c) 2008-2015 NVIDIA Corporation
>>    *
>>    * Permission is hereby granted, free of charge, to any person
>>    * obtaining a copy of this software and associated documentation
>> @@ -432,6 +432,26 @@
>>    * Include all slices beginning with start codes 0x000001B6. The slice start
>>    * code must be included for all slices.
>>    *
>> + * \subsection bitstream_hevc H.265/HEVC - High Efficiency Video Codec
>> + *
>> + * Include all video coding layer (VCL) NAL units, with nal_unit_type values
>> + * of 0 (TRAIL_N) through 31 (RSV_VCL31) inclusive. In addition to parsing
>> + * and providing NAL units, an H.265/HEVC decoder application using VDPAU
>> + * for decoding must parse certain values of the first slice segment header
>> + * in a VCL NAL unit and provide it through VdpPictureInfoHEVC. Please see
>> + * the documentation for VdpPictureInfoHEVC below for further details.
>> + *
>> + * The complete slice start code (including the 0x000001 prefix) must be
>> + * included for all slices, even when the prefix is not included in the
>> + * bitstream.
>> + *
>> + * Note that if desired:
>> + *
>> + * - The slice start code prefix may be included in a separate bitstream
>> + *   buffer array entry to the actual slice data extracted from the bitstream.
>> + * - Multiple bitstream buffer array entries (e.g. one per slice) may point at
>> + *   the same physical data storage for the slice start code prefix.
>> + *
>>    * \section video_mixer_usage Video Mixer Usage
>>    *
>>    * \subsection video_surface_content VdpVideoSurface Content
>> @@ -2463,6 +2483,17 @@ typedef uint32_t VdpDecoderProfile;
>>   /** \hideinitializer */
>>   /** \brief Support for 8 bit depth only */
>>   #define VDP_DECODER_PROFILE_H264_HIGH_444_PREDICTIVE   ((VdpDecoderProfile)26)
>> +/** \hideinitializer */
>> +/** \brief MPEG-H Part 2 == H.265 == HEVC */
>> +#define VDP_DECODER_PROFILE_HEVC_MAIN                   ((VdpDecoderProfile)50)
>> +/** \hideinitializer */
>> +#define VDP_DECODER_PROFILE_HEVC_MAIN_10                ((VdpDecoderProfile)51)
>> +/** \hideinitializer */
>> +#define VDP_DECODER_PROFILE_HEVC_MAIN_STILL             ((VdpDecoderProfile)52)
>> +/** \hideinitializer */
>> +#define VDP_DECODER_PROFILE_HEVC_MAIN_12                ((VdpDecoderProfile)53)
>> +/** \hideinitializer */
>> +#define VDP_DECODER_PROFILE_HEVC_MAIN_444               ((VdpDecoderProfile)54)
>>
>>   /** \hideinitializer */
>>   #define VDP_DECODER_LEVEL_MPEG1_NA 0
>> @@ -2558,6 +2589,38 @@ typedef uint32_t VdpDecoderProfile;
>>   #define VDP_DECODER_LEVEL_DIVX_NA 0
>>
>>   /**
>> + * The VDPAU H.265/HEVC decoder levels correspond to the values of
>> + *  general_level_idc as described in the H.265 Specification, Annex A,
> 
> Indentation got a little wonky here.

That is easy enough to fix, yes.
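
While we are on this comment: it states that the level constants equal thirty
times the level number, which is easy to check mechanically. A minimal sketch
follows, assuming level "major.minor" maps to 30*major + 3*minor (this matches
the values in the patch, e.g. 2.1 -> 63). The macro and helper names are
hypothetical and are not part of vdpau.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of vdpau.h:
 * level "major.minor" -> general_level_idc (thirty times the level number). */
#define EXAMPLE_HEVC_LEVEL_IDC(major, minor) (30u * (major) + 3u * (minor))

/* Compile-time checks against constants taken from the patch. */
static_assert(EXAMPLE_HEVC_LEVEL_IDC(1, 0) == 30,  "VDP_DECODER_LEVEL_HEVC_1");
static_assert(EXAMPLE_HEVC_LEVEL_IDC(2, 1) == 63,  "VDP_DECODER_LEVEL_HEVC_2_1");
static_assert(EXAMPLE_HEVC_LEVEL_IDC(5, 2) == 156, "VDP_DECODER_LEVEL_HEVC_5_2");
static_assert(EXAMPLE_HEVC_LEVEL_IDC(6, 2) == 186, "VDP_DECODER_LEVEL_HEVC_6_2");

/* Inverse direction: split a general_level_idc value back into its parts. */
static uint32_t example_hevc_level_major(uint32_t level_idc)
{
    return level_idc / 30u;
}

static uint32_t example_hevc_level_minor(uint32_t level_idc)
{
    return (level_idc % 30u) / 3u;
}
```

A player could use something like this to turn a parsed general_level_idc
directly into the value it passes as the decoder level, since the two are
numerically identical under this API.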

>> + *  Table A.1. The enumeration values are equal to thirty times the level
>> + *  number.
>> + */
>> +#define VDP_DECODER_LEVEL_HEVC_1         30
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_2         60
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_2_1       63
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_3         90
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_3_1       93
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_4        120
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_4_1      123
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_5        150
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_5_1      153
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_5_2      156
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_6        180
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_6_1      183
>> +/** \hideinitializer */
>> +#define VDP_DECODER_LEVEL_HEVC_6_2      186
>> +
>> +/**
>>    * \brief Query the implementation's VdpDecoder capabilities.
>>    * \param[in] device The device to query.
>>    * \param[in] profile The decoder profile for which information is requested.
>> @@ -3054,6 +3117,249 @@ typedef VdpPictureInfoMPEG4Part2 VdpPictureInfoDivX4;
>>   typedef VdpPictureInfoMPEG4Part2 VdpPictureInfoDivX5;
>>
>>   /**
>> + * \brief Picture parameter information for an H.265/HEVC picture.
>> + *
>> + * Note: References to "copy of ... bitstream field" in the field
>> + * descriptions may refer to data literally parsed from the bitstream,
>> + * or derived from the bitstream using a mechanism described in
>> + * Rec. ITU-T H.265 (04/2013), hereafter referred to as
>> + * "the H.265/HEVC Specification".
>> + *
>> + * Client applications must supply every field in this struct.
>> + */
>> +typedef struct {
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint32_t pic_width_in_luma_samples;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint32_t pic_height_in_luma_samples;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t log2_min_luma_coding_block_size_minus3;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t log2_diff_max_min_luma_coding_block_size;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t log2_min_transform_block_size_minus2;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t log2_diff_max_min_transform_block_size;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t pcm_enabled_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Only needs to be set if pcm_enabled_flag is set. */
>> +    uint8_t log2_min_pcm_luma_coding_block_size_minus3;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Only needs to be set if pcm_enabled_flag is set. */
>> +    uint8_t log2_diff_max_min_pcm_luma_coding_block_size;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t bit_depth_luma_minus8;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t bit_depth_chroma_minus8;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Only needs to be set if pcm_enabled_flag is set. */
>> +    uint8_t pcm_sample_bit_depth_luma_minus1;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Only needs to be set if pcm_enabled_flag is set. */
>> +    uint8_t pcm_sample_bit_depth_chroma_minus1;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Only needs to be set if pcm_enabled_flag is set. */
>> +    uint8_t pcm_loop_filter_disabled_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t strong_intra_smoothing_enabled_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t max_transform_hierarchy_depth_intra;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t max_transform_hierarchy_depth_inter;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t amp_enabled_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Only valid if chroma_format_idc == 3. */
>> +    uint8_t separate_colour_plane_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t log2_max_pic_order_cnt_lsb_minus4;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Per spec, when zero, assume short_term_ref_pic_set_sps_flag
>> +        is also zero. */
>> +    uint8_t num_short_term_ref_pic_sets;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t long_term_ref_pics_present_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field.
>> +        Only needed if long_term_ref_pics_present_flag is set. */
>> +    uint8_t num_long_term_ref_pics_sps;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t sps_temporal_mvp_enabled_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t sample_adaptive_offset_enabled_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t scaling_list_enabled_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t chroma_format_idc;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t sps_sub_layer_ordering_info_present_flag;
>> +    /** Copy of the Sequence Parameter Set bitstream field. */
>> +    uint8_t sps_max_dec_pic_buffering_minus1;
>> +
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t dependent_slice_segments_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t slice_segment_header_extension_present_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t sign_data_hiding_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t cu_qp_delta_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only needed if cu_qp_delta_enabled_flag is set. */
>> +    uint8_t diff_cu_qp_delta_depth;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    int8_t init_qp_minus26;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    int8_t pps_cb_qp_offset;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    int8_t pps_cr_qp_offset;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t constrained_intra_pred_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t weighted_pred_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t weighted_bipred_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t transform_skip_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t transquant_bypass_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t entropy_coding_sync_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t log2_parallel_merge_level_minus2;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t num_extra_slice_header_bits;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only needed if tiles_enabled_flag is set. */
>> +    uint8_t loop_filter_across_tiles_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t pps_loop_filter_across_slices_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t output_flag_present_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t num_ref_idx_l0_default_active_minus1;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t num_ref_idx_l1_default_active_minus1;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t lists_modification_present_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t cabac_init_present_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t pps_slice_chroma_qp_offsets_present_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t deblocking_filter_control_present_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only valid if deblocking_filter_control_present_flag is set. */
>> +    uint8_t deblocking_filter_override_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only valid if deblocking_filter_control_present_flag is set. */
>> +    uint8_t pps_deblocking_filter_disabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only valid if deblocking_filter_control_present_flag is set. */
>> +    int8_t pps_beta_offset_div2;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only valid if deblocking_filter_control_present_flag is set. */
>> +    int8_t pps_tc_offset_div2;
>> +    /** Copy of the Picture Parameter Set bitstream field. */
>> +    uint8_t tiles_enabled_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only valid if tiles_enabled_flag is set. */
>> +    uint8_t uniform_spacing_flag;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only valid if tiles_enabled_flag is set. */
>> +    uint8_t num_tile_columns_minus1;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only valid if tiles_enabled_flag is set. */
>> +    uint8_t num_tile_rows_minus1;
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only need to set 0..num_tile_columns_minus1. The struct
>> +        definition reserves up to the maximum of 22. */
>> +    uint16_t column_width_minus1[22];
>> +    /** Copy of the Picture Parameter Set bitstream field.
>> +        Only need to set 0..num_tile_rows_minus1. The struct
>> +        definition reserves up to the maximum of 20. */
>> +    uint16_t row_height_minus1[20];
> 
> I know it's not the current convention, but it would make vdpau.h easier to
> read if these used Doxygen's member group syntax.  E.g., something like this:
> 
>     /** \name Sequence Parameter Set
>      *
>      * These fields are copies of the data from the Sequence Parameter Set
>      * bitstream fields.
>      */
>     /** @{ */
>     uint32_t pic_width_in_luma_samples;
>     uint32_t pic_height_in_luma_samples;
>     uint8_t log2_min_luma_coding_block_size_minus3;
>     uint8_t log2_diff_max_min_luma_coding_block_size;
>     uint8_t log2_min_transform_block_size_minus2;
>     uint8_t log2_diff_max_min_transform_block_size;
>     uint8_t pcm_enabled_flag;
>     /** Only needs to be set if pcm_enabled_flag is set. */
>     uint8_t log2_min_pcm_luma_coding_block_size_minus3;
>     ...
>     /** @} */
> 
>     /** \name Picture Parameter Set
>      *
>      * These fields are copies of the data from the Picture Parameter Set
>      * bitstream fields.
>      */
>     /** @{ */
>     ...
> 
> If that sounds reasonable, I'll put together a patch to change the other
> PictureInfo structures to use that cleanup.

Sure. This de-duplicates the comments. I just know that we need to be careful to
explicitly call attention to fields not specified in the Specification, such as
NumShortTermPictureSliceHeaderBits and NumLongTermPictureSliceHeaderBits.
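
To illustrate what I mean, here is a rough sketch of the member group style
with a separate group for the values the Specification does not define. The
struct name ExamplePictureInfoHEVC and the third group's title are made up for
this example, and the field list is heavily abbreviated; this is not the
proposed VdpPictureInfoHEVC layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative, abbreviated struct; NOT the proposed VdpPictureInfoHEVC. */
typedef struct {
    /** \name Sequence Parameter Set
     *
     * These fields are copies of the data from the Sequence Parameter Set
     * bitstream fields.
     */
    /** @{ */
    uint32_t pic_width_in_luma_samples;
    uint32_t pic_height_in_luma_samples;
    uint8_t pcm_enabled_flag;
    /** Only needs to be set if pcm_enabled_flag is set. */
    uint8_t log2_min_pcm_luma_coding_block_size_minus3;
    /** @} */

    /** \name Values not defined by the Specification
     *
     * The client application must compute these while parsing the
     * slice_segment_header.
     */
    /** @{ */
    /** Number of bits used by the short_term_ref_pic_set array. */
    uint32_t NumShortTermPictureSliceHeaderBits;
    /** Number of bits used by the block beginning with
        "if(long_term_ref_pics_present_flag)". */
    uint32_t NumLongTermPictureSliceHeaderBits;
    /** @} */
} ExamplePictureInfoHEVC;

/* Doxygen grouping is comment-only, so it cannot change the struct layout. */
static_assert(offsetof(ExamplePictureInfoHEVC, pic_height_in_luma_samples)
                  == sizeof(uint32_t),
              "member group comments must not affect field offsets");
```

The per-field comments that remain are only the ones that say something beyond
"copy of the bitstream field", which is where the two non-Specification fields
stand out.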

If you do undertake such a change - perhaps wait until I have sent the
follow-on to this patch?

It would be good to check that all calculated values use the
UpperCamelCaseNamingConvention that the Specification uses. This is in contrast
to the lower_case_underscored_naming_convention that the Specification uses for
explicitly signalled values.
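
As a concrete case of the distinction: nal_unit_type is signalled in the
bitstream, while IDRPicFlag and RAPPicFlag, as this patch documents them, are
calculated from it. A minimal sketch, using nal_unit_type values from Table 7-1
of the Specification; the enum and function names are hypothetical and not part
of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of nal_unit_type values from Table 7-1 of the Specification. */
enum {
    EXAMPLE_NAL_BLA_W_LP       = 16,
    EXAMPLE_NAL_IDR_W_RADL     = 19,
    EXAMPLE_NAL_IDR_N_LP       = 20,
    EXAMPLE_NAL_RSV_IRAP_VCL23 = 23
};

/* 1 if nal_unit_type is IDR_W_RADL or IDR_N_LP, 0 otherwise. */
static uint32_t example_derive_IDRPicFlag(uint8_t nal_unit_type)
{
    assert(nal_unit_type <= 63); /* nal_unit_type is a 6-bit field */
    return nal_unit_type == EXAMPLE_NAL_IDR_W_RADL ||
           nal_unit_type == EXAMPLE_NAL_IDR_N_LP;
}

/* 1 if nal_unit_type is in BLA_W_LP..RSV_IRAP_VCL23 inclusive, 0 otherwise. */
static uint32_t example_derive_RAPPicFlag(uint8_t nal_unit_type)
{
    assert(nal_unit_type <= 63);
    return nal_unit_type >= EXAMPLE_NAL_BLA_W_LP &&
           nal_unit_type <= EXAMPLE_NAL_RSV_IRAP_VCL23;
}
```

The input keeps its lower_case name because it is read straight from the NAL
unit header; the outputs are UpperCamelCase because the client computes them.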

Thanks,
--José

>> +    // Reference Picture Sets
>> +    /** Copy of specification field, CurrRpsIdx, see 7.4.7.1. */
>> +    uint32_t CurrRpsIdx;
>> +    /** Set to 1 if nal_unit_type is equal to IDR_W_RADL or IDR_N_LP.
>> +        Set to zero otherwise. */
>> +    uint32_t IDRPicFlag;
>> +    /** Set to 1 if nal_unit_type is in the range of BLA_W_LP to
>> +        RSV_IRAP_VCL23, inclusive. Set to zero otherwise. */
>> +    uint32_t RAPPicFlag;
>> +    /** Corresponds to specification field, NumDeltaPocs[RefRpsIdx].
>> +        Only applicable when short_term_ref_pic_set_sps_flag == 0.
>> +        Implementations will ignore this value in other cases. */
>> +    uint32_t NumDeltaPocsOfRefRpsIdx;
>> +    /** Copy of specification field, see Section 7.4.7.2 of the
>> +        H.265/HEVC Specification. */
>> +    uint32_t NumPocTotalCurr;
>> +    /** Copy of specification field, see Section 8.3.2 of the
>> +        H.265/HEVC Specification. */
>> +    uint32_t NumPocStCurrBefore;
>> +    /** Copy of specification field, see Section 8.3.2 of the
>> +        H.265/HEVC Specification. */
>> +    uint32_t NumPocStCurrAfter;
>> +    /** Copy of specification field, see Section 8.3.2 of the
>> +        H.265/HEVC Specification. */
>> +    uint32_t NumPocLtCurr;
>> +    /** Section 7.3.6.1 of the H.265/HEVC Specification defines the syntax of
>> +        the slice_segment_header. This header contains information that
>> +        some VDPAU implementations may choose to skip. The VDPAU API
>> +        requires client applications to track the number of bits used in the
>> +        slice header for structures associated with short term and long term
>> +        reference pictures. First, VDPAU requires the number of bits used by
>> +        the short_term_ref_pic_set array in the slice_segment_header. */
>> +    uint32_t NumShortTermPictureSliceHeaderBits;
>> +    /** Second, VDPAU requires the number of bits used for long term reference
>> +        pictures in the slice_segment_header. This is equal to the number
>> +        of bits used for the contents of the block beginning with
>> +        "if(long_term_ref_pics_present_flag)". */
>> +    uint32_t NumLongTermPictureSliceHeaderBits;
>> +    /** The value of PicOrderCntVal of the picture in the access unit
>> +        containing the SEI message. The picture being decoded. */
>> +    int32_t CurrPicOrderCntVal;
>> +    /** Array of video reference surfaces.
>> +        Set any unused positions to VDP_INVALID_HANDLE. */
>> +    VdpVideoSurface RefPics[16];
>> +    /** Array of picture order counts. These correspond to positions
>> +        in the RefPics array. */
>> +    int32_t PicOrderCntVal[16];
>> +    /** Array used to specify whether a particular RefPic is
>> +        a long term reference. A value of "1" indicates a long-term
>> +        reference. */
>> +    uint8_t IsLongTerm[16];
>> +    /** Reference Picture Set list, one of the short-term RPS. */
>> +    int8_t RefPicSetStCurrBefore[8];
>> +    /** Reference Picture Set list, one of the short-term RPS. */
>> +    int8_t RefPicSetStCurrAfter[8];
>> +    /** Reference Picture Set list, one of the long-term RPS. */
>> +    int8_t RefPicSetLtCurr[8];
>> +
>> +    // scaling lists (diag order)
>> +    /** Scaling List for 4x4 quantization matrix,
>> +       indexed as ScalingList4x4[matrixId][i]. */
>> +    uint8_t ScalingList4x4[6][16];
>> +    /** Scaling List for 8x8 quantization matrix,
>> +       indexed as ScalingList8x8[matrixId][i]. */
>> +    uint8_t ScalingList8x8[6][64];
>> +    /** Scaling List for 16x16 quantization matrix,
>> +       indexed as ScalingList16x16[matrixId][i]. */
>> +    uint8_t ScalingList16x16[6][64];
>> +    /** Scaling List for 32x32 quantization matrix,
>> +       indexed as ScalingList32x32[matrixId][i]. */
>> +    uint8_t ScalingList32x32[2][64];
>> +    /** Scaling List DC Coefficients for 16x16,
>> +       indexed as ScalingListDCCoeff16x16[matrixId]. */
>> +    uint8_t ScalingListDCCoeff16x16[6];
>> +    /** Scaling List DC Coefficients for 32x32,
>> +       indexed as ScalingListDCCoeff32x32[matrixId]. */
>> +    uint8_t ScalingListDCCoeff32x32[2];
>> +} VdpPictureInfoHEVC;
>> +
>> +/**
>>    * \brief Decode a compressed field/frame and render the result
>>    *        into a \ref VdpVideoSurface "VdpVideoSurface".
>>    * \param[in] decoder The decoder object that will perform the
>>
> 
> 
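
One more note, going back to the bitstream section of the patch: the text
allows the start code prefix to live in its own bitstream buffer entry, and
allows all such entries to point at the same storage. A rough sketch of how a
client might build that array when the stream does not carry the prefix.
ExampleBitstreamBuffer is a simplified stand-in (pointer plus size) defined
here for illustration, and example_build_buffers is a hypothetical helper; the
real VdpBitstreamBuffer has additional fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a bitstream buffer entry (illustration only). */
typedef struct {
    const void *bitstream;
    uint32_t    bitstream_bytes;
} ExampleBitstreamBuffer;

/* One shared copy of the 0x000001 prefix for all slices. */
static const uint8_t example_start_code_prefix[3] = { 0x00, 0x00, 0x01 };

/* Writes two entries per slice (shared prefix, then slice data).
 * Returns the number of entries written, or 0 if 'out' is too small. */
static size_t example_build_buffers(const uint8_t *const *slices,
                                    const uint32_t *slice_bytes,
                                    size_t num_slices,
                                    ExampleBitstreamBuffer *out,
                                    size_t out_capacity)
{
    assert(slices != NULL && slice_bytes != NULL && out != NULL);

    if (out_capacity / 2 < num_slices)
        return 0;

    for (size_t i = 0; i < num_slices; i++) {
        /* Every prefix entry points at the same physical storage. */
        out[2 * i].bitstream           = example_start_code_prefix;
        out[2 * i].bitstream_bytes     = sizeof example_start_code_prefix;
        /* The slice data itself is referenced in place, not copied. */
        out[2 * i + 1].bitstream       = slices[i];
        out[2 * i + 1].bitstream_bytes = slice_bytes[i];
    }
    return 2 * num_slices;
}
```

This avoids copying each slice into a new allocation just to prepend three
bytes.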

