[Libva] [PATCH v3 intel-driver 3/9] decoder: h264: simplify and optimize reference frame store updates.
Gwenole Beauchesne
gb.devel at gmail.com
Thu Jun 5 02:20:28 PDT 2014
2014-06-04 7:30 GMT+02:00 Zhao, Yakui <yakui.zhao at intel.com>:
> On Tue, 2014-06-03 at 22:59 -0600, Gwenole Beauchesne wrote:
>> 2014-06-04 4:11 GMT+02:00 Zhao, Yakui <yakui.zhao at intel.com>:
>> > On Tue, 2014-06-03 at 10:43 -0600, Gwenole Beauchesne wrote:
>> >> Simplify and optimize the update process of the reference frame store.
>> >> Use less iterations to look up existing objects. Use a cache to store
>> >> the free'd slots.
>> >>
>> >> Prerequisite: the reference_objects[] array was previously arranged in
>> >> a way that the element at index i is exactly the object_surface that
>> >> corresponds to the VA surface identified by the VAPictureH264.picture_id
>> >> located at index i in the ReferenceFrames[] array.
>> >
>> > As a whole, this patch looks good to me. Based on the hardware
>> > requirement, it will be better that one reference picture later used is
>> > not assigned for the other purpose even when it is not used in current
>> > reference frame list. In such case this patch does some optimization
>> > about DPB, which can defer the usage of one picture later used that is
>> > not in current reference frame.especially when: one picture is not used
>> > during decoding frame X while it is used during decoding frame X + 1.
>>
>> The case you mention was supposed to be fixed with the other patch
>> series where the re-factorisation is a pre-requisite. :)
>> A very strict GenFrameStoreContext needs to be maintained with the set
>> of used_frames[] as it is today, and an additional set of
>> free_frames[] *candidates*. Though, this case is not covered by the
>> conformance suite, and I haven't found any real world case exhibiting
>> that, neither did I try to generate one yet. So we can defer this one.
>>
>> > Although it still has no improvment for the following scenario, it still
>> > is an improvement.
>> > >One picture is not used during decoding frame X and X + 1 while it
>> > is used during decoding frame X + 2.
>> >
>> > Of course the perfect solution is that the upper layer can assure that
>> > one picture later used is still added into the DPB of current frame
>> > although it is not used in current reference list.
>>
>> No, this is not going to happen. This would be an implementation error
>> of the standard.
>
> In theory it is no problem for normal H264. But it is true for MVC case
> as some reference frames for view 1 is not used when decoding view 0.
When I said this was not going to happen, this concerned the "perfect
solution" you proposed, which is not one. That would be a workaround
to driver bugs, and an error of implementation of the standard. Of
course, the DPB holds all the necessary view components, but the
VAPictureH264.ReferenceFrames[] is not exactly the DPB, and it cannot
be.
The case you mention is what I said a paragraph above, and the only
"perfect" way to solve it is to correctly track the references in the
driver side. This was what the GenFrameStoreContext was for. However,
as I mentioned, there is no supporting case in the conformance test
suite, and I haven't come by a real-world sample exhibiting this.
>> >> Theory of operations:
>> >>
>> >> 1. Obsolete entries are removed first, i.e. entries in the internal DPB
>> >> that no longer have a match in the supplied ReferenceFrames[] array.
>> >> That obsolete entry index is stored in a local cache: free_slots[].
>> >>
>> >> 2. This cache is completed with entries considered as "invalid" or "not
>> >> present", sequentially while traversing the frame store for obsolete
>> >> entries. At the end of this removal process, the free_slots[] array
>> >> represents all possible indices in there that could be re-used for
>> >> new reference frames to track.
>> >>
>> >> 3. The list of ReferenceFrames[] objects is traversed for new entries
>> >> that are not already in the frame store. If an entry needs to be
>> >> added, it is placed at the index obtained from the next free_slots[]
>> >> element. There is no need to traverse the frame store array again,
>> >> the next available slot can be known from that free_slots[] cache.
>> >>
>> >> Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne at intel.com>
>> >> ---
>> >> src/i965_decoder_utils.c | 107 ++++++++++++++++++++--------------------------
>> >> 1 file changed, 46 insertions(+), 61 deletions(-)
>> >>
>> >> diff --git a/src/i965_decoder_utils.c b/src/i965_decoder_utils.c
>> >> index 2570be9..c686235 100644
>> >> --- a/src/i965_decoder_utils.c
>> >> +++ b/src/i965_decoder_utils.c
>> >> @@ -418,88 +418,73 @@ gen6_send_avc_ref_idx_state(
>> >> }
>> >>
>> >> void
>> >> -intel_update_avc_frame_store_index(VADriverContextP ctx,
>> >> - struct decode_state *decode_state,
>> >> - VAPictureParameterBufferH264 *pic_param,
>> >> - GenFrameStore frame_store[MAX_GEN_REFERENCE_FRAMES])
>> >> +intel_update_avc_frame_store_index(
>> >> + VADriverContextP ctx,
>> >> + struct decode_state *decode_state,
>> >> + VAPictureParameterBufferH264 *pic_param,
>> >> + GenFrameStore frame_store[MAX_GEN_REFERENCE_FRAMES]
>> >> +)
>> >> {
>> >> - int i, j;
>> >> -
>> >> - assert(MAX_GEN_REFERENCE_FRAMES == ARRAY_ELEMS(pic_param->ReferenceFrames));
>> >> -
>> >> - for (i = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
>> >> - int found = 0;
>> >> -
>> >> - if (frame_store[i].surface_id == VA_INVALID_ID ||
>> >> - frame_store[i].obj_surface == NULL)
>> >> + int free_slots[MAX_GEN_REFERENCE_FRAMES];
>> >> + int i, j, n, num_free_slots;
>> >> + bool found;
>> >> +
>> >
>> > It will be better to fill the free_slots through the following two loops
>> > instead of one loop, which is to try to defer the usage of the reference
>> > picture later used that is not used in current reference frame list.
>> > a. check the invalid surface_id or NULL surface and then add it into
>> > free_slots.
>> > b. check whether one picture is used by the current reference frame
>> > list. If not, add it into the free_slots.
>>
>> The previous algorithm was filling in the holes. This one does the
>> same. I don't see where your proposal could bring an improvement?
>
> In your patch you use the free_slots to find one free slot to hold a new
> reference picture in DPB. Right? It will firstly try to find one
> free_slot with invalid picture_id or surface, which is used to hold a
> new reference picture in DPB.
No, it will take whatever comes first between:
- a slot with an invalid picture_id or surface ;
- or a slot with a retired reference picture candidate, i.e. something
that cannot be found in the ReferenceFrames[] array.
And this works exactly the same way as before. No functional change yet.
> It can remove the possibility that one
> picture later used is immediately used to hold a new reference picture
> in DPB. So based on the above point IMO it is better to find the
> free_slots through two steps to hold the new reference picture in DPB.
This would only hide further this situation, but OK.
Regards,
Gwenole.
>> >> + /* Remove obsolete entries from the internal DPB */
>> >> + for (i = 0, n = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
>> >> + GenFrameStore * const fs = &frame_store[i];
>> >> + if (fs->surface_id == VA_INVALID_ID || !fs->obj_surface) {
>> >> + free_slots[n++] = i;
>> >> continue;
>> >> + }
>> >>
>> >> - assert(frame_store[i].frame_store_id != -1);
>> >> -
>> >> - for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
>> >> - VAPictureH264 *ref_pic = &pic_param->ReferenceFrames[j];
>> >> - if (ref_pic->flags & VA_PICTURE_H264_INVALID)
>> >> + // Find whether the current entry is still a valid reference frame
>> >> + found = false;
>> >> + for (j = 0; j < ARRAY_ELEMS(decode_state->reference_objects); j++) {
>> >> + struct object_surface * const obj_surface =
>> >> + decode_state->reference_objects[j];
>> >> + if (!obj_surface)
>> >> continue;
>> >> -
>> >> - if (frame_store[i].surface_id == ref_pic->picture_id) {
>> >> - found = 1;
>> >> + if ((found = obj_surface == fs->obj_surface))
>> >> break;
>> >> - }
>> >> }
>> >>
>> >> - /* remove it from the internal DPB */
>> >> + // ... or remove it
>> >> if (!found) {
>> >> - frame_store[i].surface_id = VA_INVALID_ID;
>> >> - frame_store[i].frame_store_id = -1;
>> >> - frame_store[i].obj_surface = NULL;
>> >> + fs->surface_id = VA_INVALID_ID;
>> >> + fs->obj_surface = NULL;
>> >> + fs->frame_store_id = -1;
>> >> + free_slots[n++] = i;
>> >> }
>> >> }
>> >> + num_free_slots = n;
>> >>
>> >> - for (i = 0; i < MAX_GEN_REFERENCE_FRAMES; i++) {
>> >> - VAPictureH264 *ref_pic = &pic_param->ReferenceFrames[i];
>> >> - int found = 0;
>> >> -
>> >> - if (ref_pic->flags & VA_PICTURE_H264_INVALID ||
>> >> - ref_pic->picture_id == VA_INVALID_SURFACE ||
>> >> - decode_state->reference_objects[i] == NULL)
>> >> + /* Append the new reference frames */
>> >> + for (i = 0, n = 0; i < ARRAY_ELEMS(decode_state->reference_objects); i++) {
>> >> + struct object_surface * const obj_surface =
>> >> + decode_state->reference_objects[i];
>> >> + if (!obj_surface)
>> >> continue;
>> >>
>> >> + // Find whether the current frame is not already in our frame store
>> >> + found = false;
>> >> for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
>> >> - if (frame_store[j].surface_id == ref_pic->picture_id) {
>> >> - found = 1;
>> >> + if ((found = frame_store[j].obj_surface == obj_surface))
>> >> break;
>> >> - }
>> >> }
>> >>
>> >> - /* add the new reference frame into the internal DPB */
>> >> + // ... or add it
>> >> if (!found) {
>> >> - int frame_idx;
>> >> - int slot_found;
>> >> - struct object_surface *obj_surface = decode_state->reference_objects[i];
>> >> -
>> >> - slot_found = 0;
>> >> - frame_idx = -1;
>> >> - /* Find a free frame store index */
>> >> - for (j = 0; j < MAX_GEN_REFERENCE_FRAMES; j++) {
>> >> - if (frame_store[j].surface_id == VA_INVALID_ID ||
>> >> - frame_store[j].obj_surface == NULL) {
>> >> - frame_idx = j;
>> >> - slot_found = 1;
>> >> - break;
>> >> - }
>> >> + if (n < num_free_slots) {
>> >> + GenFrameStore * const fs = &frame_store[free_slots[n++]];
>> >> + fs->surface_id = obj_surface->base.id;
>> >> + fs->obj_surface = obj_surface;
>> >> + fs->frame_store_id = fs - frame_store;
>> >> + }
>> >> + else {
>> >> + WARN_ONCE("No free slot found for DPB reference list!!!\n");
>> >> }
>> >> -
>> >> -
>> >> - if (slot_found) {
>> >> - frame_store[j].surface_id = ref_pic->picture_id;
>> >> - frame_store[j].frame_store_id = frame_idx;
>> >> - frame_store[j].obj_surface = obj_surface;
>> >> - } else {
>> >> - WARN_ONCE("Not free slot for DPB reference list!!!\n");
>> >> - }
>> >> }
>> >> }
>> >> -
>> >> }
>> >>
>> >> void
>> >
>> >
>
>
More information about the Libva
mailing list