[PATCH i-g-t 3/5] lib/intel_compute: Allow the user to provide a custom compute kernel

Mon Feb 24 07:17:59 UTC 2025

On Fri, Feb 21, 2025 at 02:37:42PM +0100, Francois Dugast wrote:
> Hi Zbigniew,
> 
> On Thu, Feb 13, 2025 at 08:00:06AM +0100, Zbigniew Kempczyński wrote:
> > On Wed, Feb 05, 2025 at 11:17:04AM +0100, Francois Dugast wrote:
> > > Allow the user to provide a custom compute kernel which will be used
> > > instead of the default compute square one. This will be helpful to
> > > try out corner cases which require a specific compute kernel.
> > > 
> > > Signed-off-by: Francois Dugast <francois.dugast at intel.com>
> > > ---
> > >  lib/intel_compute.c | 26 ++++++++++++++++++--------
> > >  lib/intel_compute.h |  2 ++
> > >  2 files changed, 20 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> > > index e0776fb6d..a826d58c0 100644
> > > --- a/lib/intel_compute.c
> > > +++ b/lib/intel_compute.c
> > > @@ -1770,6 +1770,8 @@ static bool __run_intel_compute_kernel(int fd,
> > >  	unsigned int batch;
> > >  	const struct intel_compute_kernels *kernels = intel_compute_square_kernels;
> > >  	enum intel_driver driver = get_intel_driver(fd);
> > > +	const unsigned char *kernel;
> > > +	unsigned int kernel_size;
> > >  
> > >  	for (batch = 0; batch < ARRAY_SIZE(intel_compute_batches); batch++) {
> > >  		if (ip_ver == intel_compute_batches[batch].ip_ver)
> > > @@ -1787,16 +1789,24 @@ static bool __run_intel_compute_kernel(int fd,
> > >  		return false;
> > >  	}
> > >  
> > > -	while (kernels->kernel) {
> > > -		if (ip_ver == kernels->ip_ver)
> > > -			break;
> > > -		kernels++;
> > > +	/* If the user provides a kernel, use it */
> > > +	if (user && user->kernel) {
> > > +		kernel = user->kernel;
> > > +		kernel_size = user->kernel_size;
> > > +	} else {
> > > +		while (kernels->kernel) {
> > > +			if (ip_ver == kernels->ip_ver)
> > > +				break;
> > > +			kernels++;
> > > +		}
> > > +		if (!kernels->kernel)
> > > +			return false;
> > > +		kernel = kernels->kernel;
> > > +		kernel_size = kernels->size;
> > 
> > According to how we build pipeline indirect data we're limited to
> > three arguments - *input, *output, count. Allowing the user to provide
> > a custom kernel won't work unless type constraint will be met.
> > I don't like this change, because someone who doesn't know how
> > this work will start providing its own kernels will be surprised
> > it doesn't work.
> 
> The way we statically build the pipeline indirect data is a good
> balance of simplicity and flexibility, as we can test a lot even
> with the constraint *input, *output, count.
> 
> More complex KMD tests will require simple specific kernels which
> still comply with this constraint. For example the one below can
> trigger a page fault at 0x10000 from the compute kernel context
> so that we run other checks in KMD, all from IGT:
> 
>     __kernel void square(__global float* input,
>                          __global float* output,
>                          const unsigned int count) {
>         int i = get_global_id(0);
>         const __global uint* addr = 0x10000;
>         output[i] = *addr;
>     }
> 

> This is the reason for allowing custom compute kernels. This way
> we can leverage the existing lib/intel_compute infrastructure to
> test corner cases. I believe having this possibility is far more
> important than the risk of a user incorrectly expecting IGT to
> provide a full "compute runtime" able to run any kernel.

Ok, above code is a good argument for allowing custom kernels,
especially when you want to trigger pf from EU.

> 
> Would you be fine if the doc of struct user_execenv::kernel would
> make the constraint of *input, *output, count explicit?

Yes, but please alter count in indirect data fill, currently
create_indirect_data() has it hardcoded immediately after
input and output addresses.

> 
> If not, what about replacing "const unsigned char *kernel" with
> an enum which would be used to obtain the right kernel from a
> library of "approved" kernels in intel_compute_kernels.c?

Strict kernel prototype is fine for me.

Thanks for the explanation.

--
Zbigniew

> 
> Francois
> 
> > 
> > --
> > Zbigniew
> > 
> > >  	}
> > > -	if (!kernels->kernel)
> > > -		return false;
> > >  
> > > -	intel_compute_batches[batch].compute_exec(fd, kernels->kernel,
> > > -						  kernels->size, eci, user);
> > > +	intel_compute_batches[batch].compute_exec(fd, kernel,
> > > +						  kernel_size, eci, user);
> > >  
> > >  	return true;
> > >  }
> > > diff --git a/lib/intel_compute.h b/lib/intel_compute.h
> > > index c4b4ee5e1..6096bb83a 100644
> > > --- a/lib/intel_compute.h
> > > +++ b/lib/intel_compute.h
> > > @@ -35,6 +35,8 @@ struct intel_compute_kernels {
> > >  
> > >  struct user_execenv {
> > >  	uint32_t vm;
> > > +	const unsigned char *kernel;
> > > +	unsigned int kernel_size;
> > >  };
> > >  
> > >  extern const struct intel_compute_kernels intel_compute_square_kernels[];
> > > -- 
> > > 2.43.0
> > >