[Beignet] [PATCH] runtime: fix potential curbe allocation issue.

Zhigang Gong zhigang.gong at linux.intel.com
Mon Jun 30 22:59:48 PDT 2014


On Tue, Jul 01, 2014 at 06:44:18AM +0000, Gong, Zhigang wrote:
> 
> 
> > -----Original Message-----
> > From: Song, Ruiling
> > Sent: Tuesday, July 1, 2014 2:20 PM
> > To: Gong, Zhigang; beignet at lists.freedesktop.org
> > Cc: Gong, Zhigang
> > Subject: RE: [Beignet] [PATCH] runtime: fix potential curbe allocation issue.
> > 
> > 
> > Two comments inline.
> > -----Original Message-----
> > From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> > Zhigang Gong
> > Sent: Tuesday, July 01, 2014 12:55 PM
> > To: beignet at lists.freedesktop.org
> > Cc: Gong, Zhigang
> > Subject: [Beignet] [PATCH] runtime: fix potential curbe allocation issue.
> > 
> > According to the spec, different platforms have different curbe allocation
> > restrictions. The previous code statically set the allocated curbe size to
> > 480, which is not correct.
> > 
> > This patch changes the code to always set the curbe entry number to 64,
> > which is the maximum work group size, and to set a proper curbe allocation
> > size according to the platform's hardware limitation and a relatively
> > reasonable limit on kernel argument usage.
> > 
> > Signed-off-by: Zhigang Gong <zhigang.gong at intel.com>
> > 
> > +// 32 integer arguments, 24 uniform special registers and 5 vector special registers.
> > +#define MAX_KERNEL_ARG_SIZE ((32 * 4 + 24 * 4 + 5 * 64) * 64)
> > +
> > 
> > 
> > >>> It seems that MAX_KERNEL_ARG_SIZE is calculated from the backend? If we
> > add more payload registers, do we also need to change it here?
> This is not elegant currently, but it's not a big deal since the backend is
> relatively stable now, at least for the special register part. I will put a
> comment there.
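
For reference, the macro works out as follows (my arithmetic, assuming the
32-byte CURBE entry size implied by the "* 32" and "/ 32" in the patch):

    32 integer arguments          *  4 bytes = 128
    24 uniform special registers  *  4 bytes =  96
     5 vector special registers   * 64 bytes = 320
    per-thread total                         = 544 bytes

    544 bytes * 64 curbe entries = 34816 bytes, i.e. 1088 32-byte registers.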
Hold on, I just found that when we call intel_gpgpu_load_vfe_state, we already
know the kernel, so we know the specific constant URB size. I will send a new
version that fixes the curbe size computation.
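
A minimal sketch of what that could look like, assuming a hypothetical
per-kernel accessor kernel_get_curbe_size() that returns the per-thread
payload in bytes (this is not the actual v2 patch):

    /* Hypothetical sketch: size the CURBE from the kernel that is about
     * to run instead of from a platform-wide worst case.  Both the
     * accessor name and the 32-byte rounding are assumptions here. */
    static uint32_t
    intel_gpgpu_get_curbe_size(intel_gpgpu_t *gpgpu, cl_kernel kernel)
    {
      uint32_t per_thread = kernel_get_curbe_size(kernel);    /* bytes */
      uint32_t entries = (per_thread + 31) / 32;              /* 32-byte units */
      uint32_t total = entries * gpgpu->curb.num_cs_entries;  /* whole CURBE */
      uint32_t hw_max = cl_get_max_curbe_size(gpgpu->drv->device_id);
      return total > hw_max ? hw_max : total;                 /* clamp to HW */
    }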

> 
> > >>> I think we don't need this; it is OK to simply return the maximum
> > curbe_size provided by the hardware here.
> The hardware limitation is too large for most kernels; the typical requirement
> is about 512 entries, but IVB and Haswell have a maximum of 2016 entries, which
> is four times what we need.
> > >>> The recommended function name is intel_gpgpu_get_max_curbe_size.
> > 
> Right, intel_gpgpu_xxx is better here. I first implemented this function in
> cl_device_id.c, but later I found it's better placed in the driver domain. I
> will change it to use the intel_gpgpu_xxx name.
> 
> > +LOCAL cl_int
> > +cl_get_max_curbe_size(uint32_t device_id) {
> > +  int max_curbe_size;
> > +  /* Baytrail-T and IVB GT1 expose fewer CURBE entries than the other
> > +     Gen7 platforms. */
> > +  if (IS_BAYTRAIL_T(device_id) ||
> > +      IS_IVB_GT1(device_id))
> > +    max_curbe_size = 992;
> > +  else
> > +    max_curbe_size = 2016;
> > +
> > +  /* Never allocate more than the worst-case kernel argument usage. */
> > +  return (max_curbe_size * 32) > MAX_KERNEL_ARG_SIZE ?
> > +         (MAX_KERNEL_ARG_SIZE / 32) : max_curbe_size;
> > +}
> > +
> > 
> > 
> >    /* curbe_size */
> > -  OUT_BATCH(gpgpu->batch, 480);
> > +  OUT_BATCH(gpgpu->batch, cl_get_max_curbe_size(gpgpu->drv->device_id));
> > 
> > >>> I think you can set the curbe size here as "gpgpu->curb.size_cs_entry *
> > gpgpu->curb.num_cs_entries"; what do you think?
> size_cs_entry * num_cs_entries may exceed the limitation; get_max_curbe_size
> ensures we set the smallest valid size here.
> 
> Any further comment?
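
To make the clamp above concrete (my numbers, derived from the constants in
the patch):

    MAX_KERNEL_ARG_SIZE = 544 * 64 = 34816 bytes = 1088 32-byte entries
    IVB GT1 / Baytrail-T: 992 * 32  = 31744 <= 34816, so 992 is programmed
    other platforms:      2016 * 32 = 64512 >  34816, so 1088 is programmed

A kernel with, say, size_cs_entry = 20 would ask for 20 * 64 = 1280 entries,
beyond the 992-entry limit on IVB GT1, which is why the clamped value is used
instead of size_cs_entry * num_cs_entries directly.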

