[Beignet] [PATCH] runtime: fix potential curbe allocation issue.

Gong, Zhigang zhigang.gong at intel.com
Mon Jun 30 23:44:18 PDT 2014



> -----Original Message-----
> From: Song, Ruiling
> Sent: Tuesday, July 1, 2014 2:20 PM
> To: Gong, Zhigang; beignet at lists.freedesktop.org
> Cc: Gong, Zhigang
> Subject: RE: [Beignet] [PATCH] runtime: fix potential curbe allocation issue.
> 
> 
> Two comments inline.
> -----Original Message-----
> From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> Zhigang Gong
> Sent: Tuesday, July 01, 2014 12:55 PM
> To: beignet at lists.freedesktop.org
> Cc: Gong, Zhigang
> Subject: [Beignet] [PATCH] runtime: fix potential curbe allocation issue.
> 
> According to spec, different platforms have different curbe allocation
> restrication. The previous code set the curbe allocated size to 480 statically
> which is not correct.
> 
> This patch change to always set the curbe entry num to 64 which is the
> maximum work group size. And set proper curbe allocation size according to
> the platform's hard limitation and a relatively reasonable kernel argument
> usage limitation.
> 
> Signed-off-by: Zhigang Gong <zhigang.gong at intel.com>
> 
> +#define MAX_KERNEL_ARG_SIZE (32 * 4 + 24 * 4 + 5 * 64) * 64 // 32 integer
> arguments, 24 uniform special register and 5 vector special register.
> +
> 
> 
> >>> Seem that MAX_KERNEL_ARG_SIZE you calculate from backend? If we add
> more payload register, we need also change here?
This is not elegant current, but not a big deal as the backend is relatively stable now, 
at least for the special register part. I will put a comment there.

> >>> I think we don't need this, it is OK we simply return the maximum
> curbe_size provided in hardware here.
The hardware limitation is too large for most of the kernel. The typical requirement should be about 512 entries.
But IVB and haswell have 2016 maximum entries, which is 4 times as we need.
> >> The recommended function name is intel_gpgpu_get_max_curbe_size.
> 
Right, intel_gpgpu_xxx is better here. I firstly implement this function in the cl_device_id.c but latter
I found it's better to be in the driver domain. I will change to use intel_gpgpu_xxx name

> +LOCAL cl_int
> +cl_get_max_curbe_size(uint32_t device_id) {
> +  int max_curbe_size;
> +  if (IS_BAYTRAIL_T(device_id) ||
> +      IS_IVB_GT1(device_id))
> +    max_curbe_size = 992;
> +  else
> +    max_curbe_size = 2016;
> +
> +  return (max_curbe_size*32) > MAX_KERNEL_ARG_SIZE ?
> +         (MAX_KERNEL_ARG_SIZE / 32) : max_curbe_size; }
> +
> 
> 
>    /* curbe_size */
> -  OUT_BATCH(gpgpu->batch, 480);
> +  OUT_BATCH(gpgpu->batch,
> + cl_get_max_curbe_size(gpgpu->drv->device_id));
> 
> >>> I think you can set curbe size here as "gpgpu->curb.size_cs_entry *
> gpgpu->curb.num_cs_entries", what do you think?
size_cs_entry * num_cs_entries may exceed the limitation. get_max_curbe_size will ensure we set
a smallest valid size here.

Any further comment?


More information about the Beignet mailing list