[Libva] [PATCH v3 intel-driver 3/9] decoder: h264: simplify and optimize reference frame store updates.

Gwenole Beauchesne gb.devel at gmail.com
Tue Jun 3 21:59:12 PDT 2014


2014-06-04 4:11 GMT+02:00 Zhao, Yakui <yakui.zhao at intel.com>:
> On Tue, 2014-06-03 at 10:43 -0600, Gwenole Beauchesne wrote:
>> Simplify and optimize the update process of the reference frame store.
>> Use less iterations to look up existing objects. Use a cache to store
>> the free'd slots.
>>
>> Prerequisite: the reference_objects[] array was previously arranged in
>> a way that the element at index i is exactly the object_surface that
>> corresponds to the VA surface identified by the VAPictureH264.picture_id
>> located at index i in the ReferenceFrames[] array.
>
> As a whole, this patch looks good to me. Based on the hardware
> requirement, it will be better that one reference picture later used is
> not assigned for the other purpose even when it is not used in current
> reference frame list. In such case this patch does some optimization
> about DPB, which can defer the usage of one picture later used that is
> not in current reference frame.especially when: one picture is not used
> during decoding frame X while it is used during decoding frame X + 1.

The case you mention was supposed to be fixed with the other patch
series where the re-factorisation is a pre-requisite. :)

A very strict GenFrameStoreContext needs to be maintained with the set
of used_frames[] as it is today, and an additional set of
free_frames[] *candidates*. Though, this case is not covered by the
conformance suite, and I haven't found any real world case exhibiting
that, neither did I try to generate one yet. So we can defer this one.

> Although it still has no improvment for the following scenario, it still
> is an improvement.
>     >One picture is not used during decoding frame X and X + 1 while it
> is used during decoding frame X + 2.
>
> Of course the perfect solution is that the upper layer can assure that
> one picture later used is still added into the DPB of current frame
> although it is not used in current reference list.

No, this is not going to happen. This would be an implementation error
of the standard.

>> Theory of operations:
>>
>> 1. Obsolete entries are removed first, i.e. entries in the internal DPB
>>    that no longer have a match in the supplied ReferenceFrames[] array.
>>    That obsolete entry index is stored in a local cache: free_slots[].
>>
>> 2. This cache is completed with entries considered as "invalid" or "not
>>    present", sequentially while traversing the frame store for obsolete
>>    entries. At the end of this removal process, the free_slots[] array
>>    represents all possible indices in there that could be re-used for
>>    new reference frames to track.
>>
>> 3. The list of ReferenceFrames[] objects is traversed for new entries
>>    that are not already in the frame store. If an entry needs to be
>>    added, it is placed at the index obtained from the next free_slots[]
>>    element. There is no need to traverse the frame store array again,
>>    the next available slot can be known from that free_slots[] cache.
>>
>> Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne at intel.com>
>> ---
>>  src/i965_decoder_utils.c |  107 ++++++++++++++++++++--------------------------
>>  1 file changed, 46 insertions(+), 61 deletions(-)
>>
>> diff --git a/src/i965_decoder_utils.c b/src/i965_decoder_utils.c
>> index 2570be9..c686235 100644
>> --- a/src/i965_decoder_utils.c
>> +++ b/src/i965_decoder_utils.c
>> @@ -418,88 +418,73 @@ gen6_send_avc_ref_idx_state(
>>  }
>>
>>  void
>> -intel_update_avc_frame_store_index(VADriverContextP ctx,
>> -                                   struct decode_state *decode_state,
>> -                                   VAPictureParameterBufferH264 *pic_param,
>> -                                   GenFrameStore frame_store[MAX_GEN_REFERENCE_FRAMES])
>> +intel_update_avc_frame_store_index(
>> +    VADriverContextP              ctx,
>> +    struct decode_state          *decode_state,
>> +    VAPictureParameterBufferH264 *pic_param,
>> +    GenFrameStore                 frame_store[MAX_GEN_REFERENCE_FRAMES]
>> +)
>>  {
>> -    int i, j;
>> -
>> -    assert(MAX_GEN_REFERENCE_FRAMES == ARRAY_ELEMS(pic_param->ReferenceFrames));
>> -
>> -    for (i = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
>> -        int found = 0;
>> -
>> -        if (frame_store[i].surface_id == VA_INVALID_ID ||
>> -            frame_store[i].obj_surface == NULL)
>> +    int free_slots[MAX_GEN_REFERENCE_FRAMES];
>> +    int i, j, n, num_free_slots;
>> +    bool found;
>> +
>
> It will be better to fill the free_slots through the following two loops
> instead of one loop, which is to try to defer the usage of the reference
> picture later used that is not used in current reference frame list.
> a. check the invalid surface_id or NULL surface and then add it into
> free_slots.
> b. check whether one picture is used by the current reference frame
> list. If not, add it into the free_slots.

The previous algorithm was filling in the holes. This one does the
same. I don't see where your proposal could bring an improvement?

>> +    /* Remove obsolete entries from the internal DPB */
>> +    for (i = 0, n = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
>> +        GenFrameStore * const fs = &frame_store[i];
>> +        if (fs->surface_id == VA_INVALID_ID || !fs->obj_surface) {
>> +            free_slots[n++] = i;
>>              continue;
>> +        }
>>
>> -        assert(frame_store[i].frame_store_id != -1);
>> -
>> -        for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
>> -            VAPictureH264 *ref_pic = &pic_param->ReferenceFrames[j];
>> -            if (ref_pic->flags & VA_PICTURE_H264_INVALID)
>> +        // Find whether the current entry is still a valid reference frame
>> +        found = false;
>> +        for (j = 0; j < ARRAY_ELEMS(decode_state->reference_objects); j++) {
>> +            struct object_surface * const obj_surface =
>> +                decode_state->reference_objects[j];
>> +            if (!obj_surface)
>>                  continue;
>> -
>> -            if (frame_store[i].surface_id == ref_pic->picture_id) {
>> -                found = 1;
>> +            if ((found = obj_surface == fs->obj_surface))
>>                  break;
>> -            }
>>          }
>>
>> -        /* remove it from the internal DPB */
>> +        // ... or remove it
>>          if (!found) {
>> -            frame_store[i].surface_id = VA_INVALID_ID;
>> -            frame_store[i].frame_store_id = -1;
>> -            frame_store[i].obj_surface = NULL;
>> +            fs->surface_id = VA_INVALID_ID;
>> +            fs->obj_surface = NULL;
>> +            fs->frame_store_id = -1;
>> +            free_slots[n++] = i;
>>          }
>>      }
>> +    num_free_slots = n;
>>
>> -    for (i = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
>> -        VAPictureH264 *ref_pic = &pic_param->ReferenceFrames[i];
>> -        int found = 0;
>> -
>> -        if (ref_pic->flags & VA_PICTURE_H264_INVALID ||
>> -            ref_pic->picture_id == VA_INVALID_SURFACE ||
>> -            decode_state->reference_objects[i] == NULL)
>> +    /* Append the new reference frames */
>> +    for (i = 0, n = 0; i < ARRAY_ELEMS(decode_state->reference_objects); i++) {
>> +        struct object_surface * const obj_surface =
>> +            decode_state->reference_objects[i];
>> +        if (!obj_surface)
>>              continue;
>>
>> +        // Find whether the current frame is not already in our frame store
>> +        found = false;
>>          for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
>> -            if (frame_store[j].surface_id == ref_pic->picture_id) {
>> -                found = 1;
>> +            if ((found = frame_store[j].obj_surface == obj_surface))
>>                  break;
>> -            }
>>          }
>>
>> -        /* add the new reference frame into the internal DPB */
>> +        // ... or add it
>>          if (!found) {
>> -            int frame_idx;
>> -            int slot_found;
>> -            struct object_surface *obj_surface = decode_state->reference_objects[i];
>> -
>> -            slot_found = 0;
>> -            frame_idx = -1;
>> -            /* Find a free frame store index */
>> -            for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
>> -                if (frame_store[j].surface_id == VA_INVALID_ID ||
>> -                    frame_store[j].obj_surface == NULL) {
>> -                    frame_idx = j;
>> -                    slot_found = 1;
>> -                    break;
>> -                }
>> +            if (n < num_free_slots) {
>> +                GenFrameStore * const fs = &frame_store[free_slots[n++]];
>> +                fs->surface_id = obj_surface->base.id;
>> +                fs->obj_surface = obj_surface;
>> +                fs->frame_store_id = fs - frame_store;
>> +            }
>> +            else {
>> +                WARN_ONCE("No free slot found for DPB reference list!!!\n");
>>              }
>> -
>> -
>> -         if (slot_found) {
>> -                frame_store[j].surface_id = ref_pic->picture_id;
>> -                frame_store[j].frame_store_id = frame_idx;
>> -                frame_store[j].obj_surface = obj_surface;
>> -            } else {
>> -                WARN_ONCE("Not free slot for DPB reference list!!!\n");
>> -         }
>>          }
>>      }
>> -
>>  }
>>
>>  void
>
>


More information about the Libva mailing list