[Libva] [PATCH v3 intel-driver 3/9] decoder: h264: simplify and optimize reference frame store updates.
Zhao, Yakui
yakui.zhao at intel.com
Tue Jun 3 22:30:51 PDT 2014
On Tue, 2014-06-03 at 22:59 -0600, Gwenole Beauchesne wrote:
> 2014-06-04 4:11 GMT+02:00 Zhao, Yakui <yakui.zhao at intel.com>:
> > On Tue, 2014-06-03 at 10:43 -0600, Gwenole Beauchesne wrote:
> >> Simplify and optimize the update process of the reference frame store.
> >> Use less iterations to look up existing objects. Use a cache to store
> >> the free'd slots.
> >>
> >> Prerequisite: the reference_objects[] array was previously arranged in
> >> a way that the element at index i is exactly the object_surface that
> >> corresponds to the VA surface identified by the VAPictureH264.picture_id
> >> located at index i in the ReferenceFrames[] array.
> >
> > As a whole, this patch looks good to me. Based on the hardware
> > requirement, it will be better that one reference picture later used is
> > not assigned for the other purpose even when it is not used in current
> > reference frame list. In such case this patch does some optimization
> > about DPB, which can defer the usage of one picture later used that is
> > not in current reference frame.especially when: one picture is not used
> > during decoding frame X while it is used during decoding frame X + 1.
>
> The case you mention was supposed to be fixed with the other patch
> series where the re-factorisation is a pre-requisite. :)
> A very strict GenFrameStoreContext needs to be maintained with the set
> of used_frames[] as it is today, and an additional set of
> free_frames[] *candidates*. Though, this case is not covered by the
> conformance suite, and I haven't found any real world case exhibiting
> that, neither did I try to generate one yet. So we can defer this one.
>
> > Although it still has no improvment for the following scenario, it still
> > is an improvement.
> > >One picture is not used during decoding frame X and X + 1 while it
> > is used during decoding frame X + 2.
> >
> > Of course the perfect solution is that the upper layer can assure that
> > one picture later used is still added into the DPB of current frame
> > although it is not used in current reference list.
>
> No, this is not going to happen. This would be an implementation error
> of the standard.
In theory it is no problem for normal H264. But it is true for MVC case
as some reference frames for view 1 is not used when decoding view 0.
>
> >> Theory of operations:
> >>
> >> 1. Obsolete entries are removed first, i.e. entries in the internal DPB
> >> that no longer have a match in the supplied ReferenceFrames[] array.
> >> That obsolete entry index is stored in a local cache: free_slots[].
> >>
> >> 2. This cache is completed with entries considered as "invalid" or "not
> >> present", sequentially while traversing the frame store for obsolete
> >> entries. At the end of this removal process, the free_slots[] array
> >> represents all possible indices in there that could be re-used for
> >> new reference frames to track.
> >>
> >> 3. The list of ReferenceFrames[] objects is traversed for new entries
> >> that are not already in the frame store. If an entry needs to be
> >> added, it is placed at the index obtained from the next free_slots[]
> >> element. There is no need to traverse the frame store array again,
> >> the next available slot can be known from that free_slots[] cache.
> >>
> >> Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne at intel.com>
> >> ---
> >> src/i965_decoder_utils.c | 107 ++++++++++++++++++++--------------------------
> >> 1 file changed, 46 insertions(+), 61 deletions(-)
> >>
> >> diff --git a/src/i965_decoder_utils.c b/src/i965_decoder_utils.c
> >> index 2570be9..c686235 100644
> >> --- a/src/i965_decoder_utils.c
> >> +++ b/src/i965_decoder_utils.c
> >> @@ -418,88 +418,73 @@ gen6_send_avc_ref_idx_state(
> >> }
> >>
> >> void
> >> -intel_update_avc_frame_store_index(VADriverContextP ctx,
> >> - struct decode_state *decode_state,
> >> - VAPictureParameterBufferH264 *pic_param,
> >> - GenFrameStore frame_store[MAX_GEN_REFERENCE_FRAMES])
> >> +intel_update_avc_frame_store_index(
> >> + VADriverContextP ctx,
> >> + struct decode_state *decode_state,
> >> + VAPictureParameterBufferH264 *pic_param,
> >> + GenFrameStore frame_store[MAX_GEN_REFERENCE_FRAMES]
> >> +)
> >> {
> >> - int i, j;
> >> -
> >> - assert(MAX_GEN_REFERENCE_FRAMES == ARRAY_ELEMS(pic_param->ReferenceFrames));
> >> -
> >> - for (i = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
> >> - int found = 0;
> >> -
> >> - if (frame_store[i].surface_id == VA_INVALID_ID ||
> >> - frame_store[i].obj_surface == NULL)
> >> + int free_slots[MAX_GEN_REFERENCE_FRAMES];
> >> + int i, j, n, num_free_slots;
> >> + bool found;
> >> +
> >
> > It will be better to fill the free_slots through the following two loops
> > instead of one loop, which is to try to defer the usage of the reference
> > picture later used that is not used in current reference frame list.
> > a. check the invalid surface_id or NULL surface and then add it into
> > free_slots.
> > b. check whether one picture is used by the current reference frame
> > list. If not, add it into the free_slots.
>
> The previous algorithm was filling in the holes. This one does the
> same. I don't see where your proposal could bring an improvement?
In your patch you use the free_slots to find one free slot to hold a new
reference picture in DPB. Right? It will firstly try to find one
free_slot with invalid picture_id or surface, which is used to hold a
new reference picture in DPB. It can remove the possibility that one
picture later used is immediately used to hold a new reference picture
in DPB. So based on the above point IMO it is better to find the
free_slots through two steps to hold the new reference picture in DPB.
If this is not the keypoint of your patch, IMO this patch can't meet
with your requirement for MVC.
>
> >> + /* Remove obsolete entries from the internal DPB */
> >> + for (i = 0, n = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
> >> + GenFrameStore * const fs = &frame_store[i];
> >> + if (fs->surface_id == VA_INVALID_ID || !fs->obj_surface) {
> >> + free_slots[n++] = i;
> >> continue;
> >> + }
> >>
> >> - assert(frame_store[i].frame_store_id != -1);
> >> -
> >> - for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
> >> - VAPictureH264 *ref_pic = &pic_param->ReferenceFrames[j];
> >> - if (ref_pic->flags & VA_PICTURE_H264_INVALID)
> >> + // Find whether the current entry is still a valid reference frame
> >> + found = false;
> >> + for (j = 0; j < ARRAY_ELEMS(decode_state->reference_objects); j++) {
> >> + struct object_surface * const obj_surface =
> >> + decode_state->reference_objects[j];
> >> + if (!obj_surface)
> >> continue;
> >> -
> >> - if (frame_store[i].surface_id == ref_pic->picture_id) {
> >> - found = 1;
> >> + if ((found = obj_surface == fs->obj_surface))
> >> break;
> >> - }
> >> }
> >>
> >> - /* remove it from the internal DPB */
> >> + // ... or remove it
> >> if (!found) {
> >> - frame_store[i].surface_id = VA_INVALID_ID;
> >> - frame_store[i].frame_store_id = -1;
> >> - frame_store[i].obj_surface = NULL;
> >> + fs->surface_id = VA_INVALID_ID;
> >> + fs->obj_surface = NULL;
> >> + fs->frame_store_id = -1;
> >> + free_slots[n++] = i;
> >> }
> >> }
> >> + num_free_slots = n;
> >>
> >> - for (i = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
> >> - VAPictureH264 *ref_pic = &pic_param->ReferenceFrames[i];
> >> - int found = 0;
> >> -
> >> - if (ref_pic->flags & VA_PICTURE_H264_INVALID ||
> >> - ref_pic->picture_id == VA_INVALID_SURFACE ||
> >> - decode_state->reference_objects[i] == NULL)
> >> + /* Append the new reference frames */
> >> + for (i = 0, n = 0; i < ARRAY_ELEMS(decode_state->reference_objects); i++) {
> >> + struct object_surface * const obj_surface =
> >> + decode_state->reference_objects[i];
> >> + if (!obj_surface)
> >> continue;
> >>
> >> + // Find whether the current frame is not already in our frame store
> >> + found = false;
> >> for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
> >> - if (frame_store[j].surface_id == ref_pic->picture_id) {
> >> - found = 1;
> >> + if ((found = frame_store[j].obj_surface == obj_surface))
> >> break;
> >> - }
> >> }
> >>
> >> - /* add the new reference frame into the internal DPB */
> >> + // ... or add it
> >> if (!found) {
> >> - int frame_idx;
> >> - int slot_found;
> >> - struct object_surface *obj_surface = decode_state->reference_objects[i];
> >> -
> >> - slot_found = 0;
> >> - frame_idx = -1;
> >> - /* Find a free frame store index */
> >> - for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
> >> - if (frame_store[j].surface_id == VA_INVALID_ID ||
> >> - frame_store[j].obj_surface == NULL) {
> >> - frame_idx = j;
> >> - slot_found = 1;
> >> - break;
> >> - }
> >> + if (n < num_free_slots) {
> >> + GenFrameStore * const fs = &frame_store[free_slots[n++]];
> >> + fs->surface_id = obj_surface->base.id;
> >> + fs->obj_surface = obj_surface;
> >> + fs->frame_store_id = fs - frame_store;
> >> + }
> >> + else {
> >> + WARN_ONCE("No free slot found for DPB reference list!!!\n");
> >> }
> >> -
> >> -
> >> - if (slot_found) {
> >> - frame_store[j].surface_id = ref_pic->picture_id;
> >> - frame_store[j].frame_store_id = frame_idx;
> >> - frame_store[j].obj_surface = obj_surface;
> >> - } else {
> >> - WARN_ONCE("Not free slot for DPB reference list!!!\n");
> >> - }
> >> }
> >> }
> >> -
> >> }
> >>
> >> void
> >
> >
More information about the Libva
mailing list