[Intel-gfx] [PATCH i-g-t] lib/gpgpu_fill: Adding missing configuration parameters for gpgpu_fill function

Tue Mar 27 07:43:30 UTC 2018

On Mon, Mar 26, 2018 at 02:28:48PM -0700, Daniele Ceraolo Spurio wrote:
> + igt-dev
> 
> On 21/03/18 07:23, Katarzyna Dec wrote:
> > During debugging gpgpu_fill test on various platforms, I found out few
> > things that can affect newer gens:
> 
> this is slightly confusing, as you start with saying that you've found
> something but then below you start with what you've changed. I'd reword to
> make it clearer what were the issues and how they've been fixed.
>
Actually, this changes will be included in refactoring series and I will
explain everything (I hope) in there.

> >    Seting number of threads in TS in gen8_fill_interface_descriptor to 1.
> > This field was omitted in earlier platforms (apparently without any side
> > effects). Not setting it for newer platforms results in GPU hang.
> >    Adding pipeline selection mask to gen9_gpgpu_fillfunc, which is
> > necessary from SKL.
> >    Changes gen7_emit_interface_descriptor_load to gen8 version.
> 
> This last point applies to the gen9 fillfunc only. I'd also mention that the
> gen8 interface descriptor applies to gen9+ as well for extra clarity.
>
Gen9+ functions will be based on Gen9, so this change will be included there
there as well.
> > 
> > Signed-off-by: Katarzyna Dec <katarzyna.dec at intel.com>
> > Cc: Lukasz Kalamarz <lukasz.kalamarz at intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> > ---
> >   lib/gpgpu_fill.c | 7 +++++--
> >   1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
> > index 4d98643d..e7a1edaa 100644
> > --- a/lib/gpgpu_fill.c
> > +++ b/lib/gpgpu_fill.c
> > @@ -336,6 +336,8 @@ gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *
> >   	idd->desc5.constant_urb_entry_read_offset = 0;
> >   	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
> > +	idd->desc6.num_threads_in_tg = 1;
> > +
> 
> 1 seems like a safe value which applies to all platforms and it doesn't
> change execution time (gem_gpgpu_fill still completes in ~0.2s on SKL with
> this change applied) so it seems reasonable to me to use it, but it'd be
> nice to get an ack by someone more experienced with the gpgpu pipeline.
> 
Missing seting number of threads in thread group seems like a 'bug' here.
According to documentation from BDW this field cannot be set to 0. We cannot
be sure that it is not zero unless we set it explicitly.

> >   	return offset;
> >   }
> > @@ -790,12 +792,13 @@ gen9_gpgpu_fillfunc(struct intel_batchbuffer *batch,
> >   	batch->ptr = batch->buffer;
> >   	/* GPGPU pipeline */
> > -	OUT_BATCH(GEN7_PIPELINE_SELECT | PIPELINE_SELECT_GPGPU);
> > +	OUT_BATCH(GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
> > +	PIPELINE_SELECT_GPGPU);
> 
> need more indent here.
> 
> Apart from the various nitpicks the change looks good to me.
> 
> Daniele
> 
Refactoring of these libraries will be a good oportunity to adjust code style :)

Thanks for important feedback!
Kasia
> >   	gen9_emit_state_base_address(batch);
> >   	gen8_emit_vfe_state_gpgpu(batch);
> >   	gen7_emit_curbe_load(batch, curbe_buffer);
> > -	gen7_emit_interface_descriptor_load(batch, interface_descriptor);
> > +	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> >   	gen8_emit_gpgpu_walk(batch, x, y, width, height);
> >   	OUT_BATCH(MI_BATCH_BUFFER_END);
> >