[Libva] [Libva-intel-driver][PATCH 02/17] Move all curbe related settings to the inner structure in i965_gpe_context

Xiang, Haihao haihao.xiang at intel.com
Fri Nov 18 01:42:46 UTC 2016


> On 11/18/2016 12:58 AM, Xiang, Haihao wrote:
> > 
> > 
> > > -----Original Message-----
> > > From: Zhao, Yakui
> > > Sent: Thursday, November 17, 2016 8:42 PM
> > > To: Xiang, Haihao<haihao.xiang at intel.com>
> > > Cc: libva at lists.freedesktop.org
> > > Subject: Re: [Libva] [Libva-intel-driver][PATCH 02/17] Move all
> > > curbe related
> > > settings to the inner structure in i965_gpe_context
> > > 
> > > On 11/17/2016 04:35 PM, Xiang, Haihao wrote:
> > > > To avoid confusion between curbe.length and curbe_size, this
> > > > patch
> > > > uses curbe.length only. curbe.bo is always set even if curbe is
> > > > a part
> > > > of the dynamic state buffer, hence we can use curbe related
> > > > settings
> > > > no matter it is a part of the dynamic state buffer or not.
> > > 
> > > The curbe.bo in *_gpe_context is defined/used for the old
> > > platform.
> > > In fact for the platform from Sandybridge, it can reside in the
> > > dynamic_state.bo.
> > 
> > > 
> > > If the curbe.bo/curbe.offset is used directly,
> > 
> > previously curbe_offset is also used directly.
> > 
> > > maybe it brings the confusion
> > > that one dedicated bo is declared/defined for curbe.
> > 
> > curbe.bo/curbe.offset is set in gpe functions, gpe user just uses
> > it and needn't care about
> > it is a dedicated bo or not.
> > 
> > > 
> > > So I think that the curbe_offset can follow the HW spec.
> > 
> > curbe.offset still follows the HW spec.  The problem for the old
> > structure is that we have an inner curbe structure
> > and curbe_offset/ curbe_size in the same structure.  It is easy to
> > confuse a new developer.
> 
> The inner curbe structure is mainly defined for the old platform.

Actually it can be used on the new platform. 

> For the HW that should allocate the curbe_buffer from dynamic_buffer,

curbo.bo points to the dynamic buffer on the new platform. 

>  
> IMO it only needs to care the dynamic_buffer and curbe_offset when it
> needs to access the curbe_buffer.

My point here is gpe user only cares curbe when using curbe, no matter
it is part of dynamic buffer or a dedicated buffer, so we can use the
same code to use curbe buffer, no matter the HW is old or new, 

e.g.
in i965_gpe_context_map_curbe()

dri_bo_map(gpe_context->curbe.bo, 1) works for all platform. 

otherwise we have to use if ... else.

> 
> Maybe we can add some explanations about the i965_gpe_context.
>     >The inner structure(idrt, sampler, curbe) is mainly defined for
> the 
> old platform. And the dedicated bo is allocated for them.
>     >dynamic_buffer/curbe_offset is for the later platform.

We should avoid writing too many comments if the code is clean enough.

> 
> 
> > 
> > > Maybe we can add the wrapper function that can map/unmap the
> > > virtual
> > > address of curbe_buffer. In such case it can also simplify the
> > > mapping related
> > > with curbe_buffer.
> > 
> > We have already such functions, but we don't use them for some old
> > platforms.  We can add similar functions for idrt and sampler.
> 
> For the wrapper, IMO we can only consider them for the platform since
> gen8+. We can leave the old platform alone.
> 
> > 
> > > 
> > > Similar considerations for Interface_descriptor_data,
> > > sampler_buffer.
> > > 
> > > 
> > > > 
> > > > Signed-off-by: Xiang, Haihao<haihao.xiang at intel.com>
> > > > ---
> > > >    src/gen75_vpp_gpe.c        |  2 +-
> > > >    src/gen8_mfc.c             |  2 +-
> > > >    src/gen8_vme.c             | 12 ++++++------
> > > >    src/gen9_post_processing.c |  3 +--
> > > >    src/gen9_vme.c             | 12 ++++++------
> > > >    src/gen9_vp9_encoder.c     | 22 ++++++++++------------
> > > >    src/i965_gpe_utils.c       | 28 +++++++++++++++++++---------
> > > >    src/i965_gpe_utils.h       |  3 +--
> > > >    8 files changed, 45 insertions(+), 39 deletions(-)
> > > > 
> > > > diff --git a/src/gen75_vpp_gpe.c b/src/gen75_vpp_gpe.c index
> > > > 9850c1c..2cddb5a 100644
> > > > --- a/src/gen75_vpp_gpe.c
> > > > +++ b/src/gen75_vpp_gpe.c
> > > > @@ -890,7 +890,7 @@ vpp_gpe_context_init(VADriverContextP ctx)
> > > >            gpe_ctx->surface_state_binding_table.length =
> > > >                   (SURFACE_STATE_PADDED_SIZE_GEN8 +
> > > > sizeof(unsigned
> > > > int)) * MAX_MEDIA_SURFACES_GEN6;
> > > > 
> > > > -        gpe_ctx->curbe_size = CURBE_TOTAL_DATA_LENGTH;
> > > > +        gpe_ctx->curbe.length = CURBE_TOTAL_DATA_LENGTH;
> > > >            gpe_ctx->idrt_size  = sizeof(struct
> > > > gen8_interface_descriptor_data) * MAX_INTERFACE_DESC_GEN6;
> > > > 
> > > >        }
> > > > diff --git a/src/gen8_mfc.c b/src/gen8_mfc.c index
> > > > 63ffea5..3ed9e84
> > > > 100644
> > > > --- a/src/gen8_mfc.c
> > > > +++ b/src/gen8_mfc.c
> > > > @@ -4609,7 +4609,7 @@ Bool
> > > > gen8_mfc_context_init(VADriverContextP
> > > ctx, struct intel_encoder_context *e
> > > >        mfc_context-
> > > > >gpe_context.surface_state_binding_table.length =
> > > > (SURFACE_STATE_PADDED_SIZE + sizeof(unsigned int)) *
> > > > MAX_MEDIA_SURFACES_GEN6;
> > > > 
> > > >        mfc_context->gpe_context.idrt_size = sizeof(struct
> > > gen8_interface_descriptor_data) * MAX_INTERFACE_DESC_GEN6;
> > > > -    mfc_context->gpe_context.curbe_size = 32 * 4;
> > > > +    mfc_context->gpe_context.curbe.length = 32 * 4;
> > > >        mfc_context->gpe_context.sampler_size = 0;
> > > > 
> > > >        mfc_context->gpe_context.vfe_state.max_num_threads = 60
> > > > - 1;
> > > > diff --git a/src/gen8_vme.c b/src/gen8_vme.c index
> > > > c79c62b..96835bf
> > > > 100644
> > > > --- a/src/gen8_vme.c
> > > > +++ b/src/gen8_vme.c
> > > > @@ -389,10 +389,10 @@ static VAStatus
> > > > gen8_vme_constant_setup(VADriverContextP ctx,
> > > > 
> > > >        vme_state_message[31] = mv_num;
> > > > 
> > > > -    dri_bo_map(vme_context->gpe_context.dynamic_state.bo, 1);
> > > > -    assert(vme_context->gpe_context.dynamic_state.bo-
> > > > >virtual);
> > > > -    constant_buffer = (unsigned char *)vme_context-
> > > > gpe_context.dynamic_state.bo->virtual +
> > > > -                                         vme_context-
> > > > >gpe_context.curbe_offset;
> > > > +    dri_bo_map(vme_context->gpe_context.curbe.bo, 1);
> > > > +    assert(vme_context->gpe_context.curbe.bo->virtual);
> > > > +    constant_buffer = (unsigned char *)vme_context-
> > > > gpe_context.curbe.bo->virtual +
> > > > +
> > > > + vme_context->gpe_context.curbe.offset;
> > > > 
> > > >        /* VME MV/Mb cost table is passed by using const buffer
> > > > */
> > > >        /* Now it uses the fixed search path. So it is
> > > > constructed
> > > > directly @@ -400,7 +400,7 @@ static VAStatus
> > > gen8_vme_constant_setup(VADriverContextP ctx,
> > > >         */
> > > >        memcpy(constant_buffer, (char *)vme_context-
> > > > >vme_state_message,
> > > > 128);
> > > > 
> > > > -    dri_bo_unmap(vme_context->gpe_context.dynamic_state.bo);
> > > > +    dri_bo_unmap(vme_context->gpe_context.curbe.bo);
> > > > 
> > > >        return VA_STATUS_SUCCESS;
> > > >    }
> > > > @@ -1379,7 +1379,7 @@ Bool
> > > > gen8_vme_context_init(VADriverContextP
> > > ctx, struct intel_encoder_context *e
> > > >            vme_context-
> > > > >gpe_context.surface_state_binding_table.length
> > > > = (SURFACE_STATE_PADDED_SIZE + sizeof(unsigned int)) *
> > > > MAX_MEDIA_SURFACES_GEN6;
> > > > 
> > > >            vme_context->gpe_context.idrt_size = sizeof(struct
> > > gen8_interface_descriptor_data) * MAX_INTERFACE_DESC_GEN6;
> > > > -        vme_context->gpe_context.curbe_size =
> > > CURBE_TOTAL_DATA_LENGTH;
> > > > +        vme_context->gpe_context.curbe.length =
> > > > + CURBE_TOTAL_DATA_LENGTH;
> > > >            vme_context->gpe_context.sampler_size = 0;
> > > > 
> > > > 
> > > > diff --git a/src/gen9_post_processing.c
> > > > b/src/gen9_post_processing.c
> > > > index a5d345c..71da501 100644
> > > > --- a/src/gen9_post_processing.c
> > > > +++ b/src/gen9_post_processing.c
> > > > @@ -538,8 +538,7 @@
> > > gen9_post_processing_context_init(VADriverContextP ctx,
> > > >        gen8_gpe_load_kernels(ctx, gpe_context,&scaling_kernel,
> > > > 1);
> > > >        gpe_context->idrt_size = ALIGN(sizeof(struct
> > > gen8_interface_descriptor_data), 64);
> > > >        gpe_context->sampler_size = ALIGN(sizeof(struct
> > > > gen8_sampler_state),
> > > 64);
> > > > -    gpe_context->curbe_size = ALIGN(sizeof(struct
> > > scaling_input_parameter), 64);
> > > > -    gpe_context->curbe.length = gpe_context->curbe_size;
> > > > +    gpe_context->curbe.length = ALIGN(sizeof(struct
> > > > + scaling_input_parameter), 64);
> > > > 
> > > >        gpe_context->surface_state_binding_table.max_entries =
> > > MAX_SCALING_SURFACES;
> > > >        gpe_context-
> > > > >surface_state_binding_table.binding_table_offset =
> > > > 0; diff --git a/src/gen9_vme.c b/src/gen9_vme.c index
> > > > 6ad8fff..a59fe2a
> > > > 100644
> > > > --- a/src/gen9_vme.c
> > > > +++ b/src/gen9_vme.c
> > > > @@ -438,10 +438,10 @@ static VAStatus
> > > > gen9_vme_constant_setup(VADriverContextP ctx,
> > > > 
> > > >        vme_state_message[31] = mv_num;
> > > > 
> > > > -    dri_bo_map(vme_context->gpe_context.dynamic_state.bo, 1);
> > > > -    assert(vme_context->gpe_context.dynamic_state.bo-
> > > > >virtual);
> > > > -    constant_buffer = (unsigned char *)vme_context-
> > > > gpe_context.dynamic_state.bo->virtual +
> > > > -                                         vme_context-
> > > > >gpe_context.curbe_offset;
> > > > +    dri_bo_map(vme_context->gpe_context.curbe.bo, 1);
> > > > +    assert(vme_context->gpe_context.curbe.bo->virtual);
> > > > +    constant_buffer = (unsigned char *)vme_context-
> > > > gpe_context.curbe.bo->virtual +
> > > > +
> > > > + vme_context->gpe_context.curbe.offset;
> > > > 
> > > >        /* VME MV/Mb cost table is passed by using const buffer
> > > > */
> > > >        /* Now it uses the fixed search path. So it is
> > > > constructed
> > > > directly @@ -449,7 +449,7 @@ static VAStatus
> > > gen9_vme_constant_setup(VADriverContextP ctx,
> > > >         */
> > > >        memcpy(constant_buffer, (char *)vme_context-
> > > > >vme_state_message,
> > > > 128);
> > > > 
> > > > -    dri_bo_unmap(vme_context->gpe_context.dynamic_state.bo);
> > > > +    dri_bo_unmap(vme_context->gpe_context.curbe.bo);
> > > > 
> > > >        return VA_STATUS_SUCCESS;
> > > >    }
> > > > @@ -2032,7 +2032,7 @@ Bool
> > > > gen9_vme_context_init(VADriverContextP
> > > ctx, struct intel_encoder_context *e
> > > >        vme_context-
> > > > >gpe_context.surface_state_binding_table.length =
> > > > (SURFACE_STATE_PADDED_SIZE + sizeof(unsigned int)) *
> > > > MAX_MEDIA_SURFACES_GEN6;
> > > > 
> > > >        vme_context->gpe_context.idrt_size = sizeof(struct
> > > gen8_interface_descriptor_data) * MAX_INTERFACE_DESC_GEN6;
> > > > -    vme_context->gpe_context.curbe_size =
> > > > CURBE_TOTAL_DATA_LENGTH;
> > > > +    vme_context->gpe_context.curbe.length =
> > > CURBE_TOTAL_DATA_LENGTH;
> > > >        vme_context->gpe_context.sampler_size = 0;
> > > > 
> > > > 
> > > > diff --git a/src/gen9_vp9_encoder.c b/src/gen9_vp9_encoder.c
> > > > index
> > > > f39d6d0..5ad7b26 100644
> > > > --- a/src/gen9_vp9_encoder.c
> > > > +++ b/src/gen9_vp9_encoder.c
> > > > @@ -1820,18 +1820,18 @@
> > > gen9_brc_update_add_surfaces_vp9(VADriverContextP ctx,
> > > >        /* 4. Mbenc curbe input buffer */
> > > >        gen9_add_dri_buffer_gpe_surface(ctx,
> > > >                                        brc_gpe_context,
> > > > -                                    mbenc_gpe_context-
> > > > >dynamic_state.bo,
> > > > +                                    mbenc_gpe_context-
> > > > >curbe.bo,
> > > >                                        0,
> > > > -                                    ALIGN(mbenc_gpe_context-
> > > > >curbe_size, 64),
> > > > -                                    mbenc_gpe_context-
> > > > >curbe_offset,
> > > > +                                    ALIGN(mbenc_gpe_context-
> > > > >curbe.length, 64),
> > > > +                                    mbenc_gpe_context-
> > > > >curbe.offset,
> > > >                                        VP9_BTI_BRC_MBENC_CURBE_
> > > > INPUT_G9);
> > > >        /* 5. Mbenc curbe output buffer */
> > > >        gen9_add_dri_buffer_gpe_surface(ctx,
> > > >                                        brc_gpe_context,
> > > > -                                    mbenc_gpe_context-
> > > > >dynamic_state.bo,
> > > > +                                    mbenc_gpe_context-
> > > > >curbe.bo,
> > > >                                        0,
> > > > -                                    ALIGN(mbenc_gpe_context-
> > > > >curbe_size, 64),
> > > > -                                    mbenc_gpe_context-
> > > > >curbe_offset,
> > > > +                                    ALIGN(mbenc_gpe_context-
> > > > >curbe.length, 64),
> > > > +                                    mbenc_gpe_context-
> > > > >curbe.offset,
> > > > 
> > > > VP9_BTI_BRC_MBENC_CURBE_OUTPUT_G9);
> > > > 
> > > >        /* 6. BRC_PIC_STATE read buffer */ @@ -3289,10 +3289,10
> > > > @@
> > > > gen9_vp9_send_mbenc_surface(VADriverContextP ctx,
> > > > 
> > > >            gen9_add_dri_buffer_gpe_surface(ctx,
> > > >                                            gpe_context,
> > > > -                                        mbenc_param-
> > > > >gpe_context_tx->dynamic_state.bo,
> > > > +
> > > > + mbenc_param->gpe_context_tx->curbe.bo,
> > > >                                            0,
> > > >                                            ALIGN(res_size, 64),
> > > > -                                        mbenc_param-
> > > > >gpe_context_tx->curbe_offset,
> > > > +
> > > > + mbenc_param->gpe_context_tx->curbe.offset,
> > > >                                            VP9_BTI_MBENC_TX_CUR
> > > > BE_G9);
> > > > 
> > > >            break;
> > > > @@ -3441,10 +3441,10 @@
> > > gen9_vp9_send_mbenc_surface(VADriverContextP
> > > > ctx,
> > > > 
> > > >            gen9_add_dri_buffer_gpe_surface(ctx,
> > > >                                            gpe_context,
> > > > -                                        mbenc_param-
> > > > >gpe_context_tx->dynamic_state.bo,
> > > > +
> > > > + mbenc_param->gpe_context_tx->curbe.bo,
> > > >                                            0,
> > > >                                            ALIGN(res_size, 64),
> > > > -                                        mbenc_param-
> > > > >gpe_context_tx->curbe_offset,
> > > > +
> > > > + mbenc_param->gpe_context_tx->curbe.offset,
> > > >                                            VP9_BTI_MBENC_TX_CUR
> > > > BE_G9);
> > > > 
> > > > 
> > > > @@ -3684,8 +3684,6 @@ gen9_init_gpe_context_vp9(struct
> > > i965_gpe_context *gpe_context,
> > > >    {
> > > >        gpe_context->curbe.length = kernel_param->curbe_size; //
> > > > in
> > > > bytes
> > > > 
> > > > -    gpe_context->curbe_size = ALIGN(kernel_param->curbe_size,
> > > > 64);
> > > > -
> > > >        gpe_context->sampler_size = 0;
> > > >        if (kernel_param->sampler_size) {
> > > >            gpe_context->sampler_size =
> > > > ALIGN(kernel_param->sampler_size, 64); diff --git
> > > > a/src/i965_gpe_utils.c b/src/i965_gpe_utils.c index
> > > > c5a8935..3739a88
> > > > 100644
> > > > --- a/src/i965_gpe_utils.c
> > > > +++ b/src/i965_gpe_utils.c
> > > > @@ -1066,8 +1066,8 @@ gen8_gpe_curbe_load(VADriverContextP ctx,
> > > > 
> > > >        OUT_BATCH(batch, CMD_MEDIA_CURBE_LOAD | (4 - 2));
> > > >        OUT_BATCH(batch, 0);
> > > > -    OUT_BATCH(batch, gpe_context->curbe_size);
> > > > -    OUT_BATCH(batch, gpe_context->curbe_offset);
> > > > +    OUT_BATCH(batch, gpe_context->curbe.length);
> > > > +    OUT_BATCH(batch, gpe_context->curbe.offset);
> > > > 
> > > >        ADVANCE_BATCH(batch);
> > > >    }
> > > > @@ -1122,7 +1122,7 @@ gen8_gpe_context_init(VADriverContextP
> > > > ctx,
> > > >        assert(bo);
> > > >        gpe_context->surface_state_binding_table.bo = bo;
> > > > 
> > > > -    bo_size = gpe_context->idrt_size + gpe_context->curbe_size 
> > > > +
> > > gpe_context->sampler_size + 192;
> > > > +    bo_size = gpe_context->idrt_size + gpe_context-
> > > > >curbe.length +
> > > > + gpe_context->sampler_size + 192;
> > > >        dri_bo_unreference(gpe_context->dynamic_state.bo);
> > > >        bo = dri_bo_alloc(i965->intel.bufmgr,
> > > >                          "surface state&   binding table", @@
> > > > -1137,8
> > > > +1137,11 @@ gen8_gpe_context_init(VADriverContextP ctx,
> > > > 
> > > >        /* Constant buffer offset */
> > > >        start_offset = ALIGN(end_offset, 64);
> > > > -    gpe_context->curbe_offset = start_offset;
> > > > -    end_offset = start_offset + gpe_context->curbe_size;
> > > > +    dri_bo_unreference(gpe_context->curbe.bo);
> > > > +    gpe_context->curbe.bo = bo;
> > > > +    dri_bo_reference(gpe_context->curbe.bo);
> > > > +    gpe_context->curbe.offset = start_offset;
> > > > +    end_offset = start_offset + gpe_context->curbe.length;
> > > > 
> > > >        /* Interface descriptor offset */
> > > >        start_offset = ALIGN(end_offset, 64); @@ -1170,6 +1173,8
> > > > @@
> > > > gen8_gpe_context_destroy(struct i965_gpe_context *gpe_context)
> > > >        dri_bo_unreference(gpe_context->indirect_state.bo);
> > > >        gpe_context->indirect_state.bo = NULL;
> > > > 
> > > > +    dri_bo_unreference(gpe_context->curbe.bo);
> > > > +    gpe_context->curbe.bo = NULL;
> > > >    }
> > > > 
> > > > 
> > > > @@ -1619,7 +1624,12 @@
> > > gen8_gpe_context_set_dynamic_buffer(VADriverContextP ctx,
> > > >        dri_bo_reference(gpe_context->dynamic_state.bo);
> > > >        gpe_context->dynamic_state.bo_size = ds->bo_size;
> > > > 
> > > > -    gpe_context->curbe_offset = ds->curbe_offset;
> > > > +    /* curbe buffer is a part of the dynamic buffer */
> > > > +    dri_bo_unreference(gpe_context->curbe.bo);
> > > > +    gpe_context->curbe.bo = ds->bo;
> > > > +    dri_bo_reference(gpe_context->curbe.bo);
> > > > +    gpe_context->curbe.offset = ds->curbe_offset;
> > > > +
> > > >        gpe_context->idrt_offset = ds->idrt_offset;
> > > >        gpe_context->sampler_offset = ds->sampler_offset;
> > > > 
> > > > @@ -1629,15 +1639,15 @@
> > > gen8_gpe_context_set_dynamic_buffer(VADriverContextP ctx,
> > > >    void *
> > > >    gen8p_gpe_context_map_curbe(struct i965_gpe_context
> > > > *gpe_context)
> > > >    {
> > > > -    dri_bo_map(gpe_context->dynamic_state.bo, 1);
> > > > +    dri_bo_map(gpe_context->curbe.bo, 1);
> > > > 
> > > > -    return (char *)gpe_context->dynamic_state.bo->virtual +
> > > > gpe_context-
> > > > curbe_offset;
> > > > +    return (char *)gpe_context->curbe.bo->virtual +
> > > > + gpe_context->curbe.offset;
> > > >    }
> > > > 
> > > >    void
> > > >    gen8p_gpe_context_unmap_curbe(struct i965_gpe_context
> > > *gpe_context)
> > > >    {
> > > > -    dri_bo_unmap(gpe_context->dynamic_state.bo);
> > > > +    dri_bo_unmap(gpe_context->curbe.bo);
> > > >    }
> > > > 
> > > >    void
> > > > diff --git a/src/i965_gpe_utils.h b/src/i965_gpe_utils.h index
> > > > 0cbef43..92123fe 100644
> > > > --- a/src/i965_gpe_utils.h
> > > > +++ b/src/i965_gpe_utils.h
> > > > @@ -92,6 +92,7 @@ struct i965_gpe_context
> > > >        struct {
> > > >            dri_bo *bo;
> > > >            unsigned int length;            /* in bytes */
> > > > +        unsigned int offset;
> > > >        } curbe;
> > > > 
> > > >        struct {
> > > > @@ -168,8 +169,6 @@ struct i965_gpe_context
> > > >        int sampler_size;
> > > >        unsigned int idrt_offset;
> > > >        int idrt_size;
> > > > -    unsigned int curbe_offset;
> > > > -    int curbe_size;
> > > >    };
> > > > 
> > > >    struct gpe_mi_flush_dw_parameter
> > 
> 


More information about the Libva mailing list