[Intel-gfx] [PATCH i-g-t] lib/gpgpu_fill: Adding missing configuration parameters for gpgpu_fill function
katarzyna.dec at intel.com
Tue Mar 27 07:43:30 UTC 2018
On Mon, Mar 26, 2018 at 02:28:48PM -0700, Daniele Ceraolo Spurio wrote:
> + igt-dev
> On 21/03/18 07:23, Katarzyna Dec wrote:
> > During debugging gpgpu_fill test on various platforms, I found out few
> > things that can affect newer gens:
> this is slightly confusing, as you start with saying that you've found
> something but then below you start with what you've changed. I'd reword to
> make it clearer what were the issues and how they've been fixed.
Actually, this changes will be included in refactoring series and I will
explain everything (I hope) in there.
> > Seting number of threads in TS in gen8_fill_interface_descriptor to 1.
> > This field was omitted in earlier platforms (apparently without any side
> > effects). Not setting it for newer platforms results in GPU hang.
> > Adding pipeline selection mask to gen9_gpgpu_fillfunc, which is
> > necessary from SKL.
> > Changes gen7_emit_interface_descriptor_load to gen8 version.
> This last point applies to the gen9 fillfunc only. I'd also mention that the
> gen8 interface descriptor applies to gen9+ as well for extra clarity.
Gen9+ functions will be based on Gen9, so this change will be included there
there as well.
> > Signed-off-by: Katarzyna Dec <katarzyna.dec at intel.com>
> > Cc: Lukasz Kalamarz <lukasz.kalamarz at intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> > ---
> > lib/gpgpu_fill.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> > diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
> > index 4d98643d..e7a1edaa 100644
> > --- a/lib/gpgpu_fill.c
> > +++ b/lib/gpgpu_fill.c
> > @@ -336,6 +336,8 @@ gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *
> > idd->desc5.constant_urb_entry_read_offset = 0;
> > idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
> > + idd->desc6.num_threads_in_tg = 1;
> > +
> 1 seems like a safe value which applies to all platforms and it doesn't
> change execution time (gem_gpgpu_fill still completes in ~0.2s on SKL with
> this change applied) so it seems reasonable to me to use it, but it'd be
> nice to get an ack by someone more experienced with the gpgpu pipeline.
Missing seting number of threads in thread group seems like a 'bug' here.
According to documentation from BDW this field cannot be set to 0. We cannot
be sure that it is not zero unless we set it explicitly.
> > return offset;
> > }
> > @@ -790,12 +792,13 @@ gen9_gpgpu_fillfunc(struct intel_batchbuffer *batch,
> > batch->ptr = batch->buffer;
> > /* GPGPU pipeline */
> > - OUT_BATCH(GEN7_PIPELINE_SELECT | PIPELINE_SELECT_GPGPU);
> > + OUT_BATCH(GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
> > + PIPELINE_SELECT_GPGPU);
> need more indent here.
> Apart from the various nitpicks the change looks good to me.
Refactoring of these libraries will be a good oportunity to adjust code style :)
Thanks for important feedback!
> > gen9_emit_state_base_address(batch);
> > gen8_emit_vfe_state_gpgpu(batch);
> > gen7_emit_curbe_load(batch, curbe_buffer);
> > - gen7_emit_interface_descriptor_load(batch, interface_descriptor);
> > + gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> > gen8_emit_gpgpu_walk(batch, x, y, width, height);
> > OUT_BATCH(MI_BATCH_BUFFER_END);
More information about the Intel-gfx