[Mesa-dev] [PATCH 0/3] cl workdim v2

Jan Vesely jan.vesely at rutgers.edu
Tue Sep 2 12:37:22 PDT 2014


On Tue, 2014-09-02 at 15:36 +0300, Francisco Jerez wrote:
> Jan Vesely <jan.vesely at rutgers.edu> writes:
> 
> > On Sat, 2014-08-16 at 13:13 +0300, Francisco Jerez wrote:
> >> Jan Vesely <jan.vesely at rutgers.edu> writes:
> >> 
> >> > On Thu, 2014-08-07 at 16:02 +0300, Francisco Jerez wrote:
> >> >> Jan Vesely <jan.vesely at rutgers.edu> writes:
> >> >> 
> >> >> > This respin includes Francisco's approach of providing implicit
> >> >> > in the arg vector passed from clover, and Tom's idea of appending
> >> >> > implicit args after the kernel args.
> >> >> >
> >> >> 
> >> >> Hmmm...  Maybe it would make sense to add some sort of versioning
> >> >> (e.g. as part of the target triple) to the binary interface between
> >> >> clover and the kernel instead, so we can handle this sort of
> >> >> non-backwards compatible changes and the compiler back-end and libclc
> >> >> have some way to find out whether some specific feature is available and
> >> >> e.g. some specific extension should be enabled.
> >> >> 
> >> >> > I assumed it's not safe to modify exec.input, so the input vector is copied
> >> >> > before appending work dim.
> >> >> >
> >> >> 
> >> >> Why wouldn't it be safe?  You just need to make sure they're appended
> >> >> before the compute state is created.
> >> >
> >> > I thought there might be a problem when called from multiple threads,
> >> > but it looks like most of the vars are local to the current call anyway.
> >> >
> >> > I looked at the code a bit better, and need a bit of help with what the
> >> > proffered approach would be.
> >> >
> >> > exec_context::bind() appends all kernel args to the input vector. If the
> >> > implicit args are added before bind() it shifts all other args, which is
> >> > not what we want.
> >> > if the implicit args are appended after, they are not accounted for in
> >> > shader->input_size (and not copied by the driver).
> >> >
> >> > my current code modifies exec_context::bind() to preserve the content of
> >> > input before binding kernel args, and append the old content after the
> >> > args are bound.
> >> > I have also considered passing and implicit args vector to
> >> > exec_context::bind to make the trick more visible.
> >> >
> >> > Turning workdim into a proper arg in _args does not work either, because
> >> > it is not present in module args.
> >> >
> >> > any thoughts?
> >> >
> >> 
> >> I finally had a chance to take a closer look at your series.  It looks
> >> like you're right: In order to implement my proposal cleanly, implicit
> >> arguments would have to be part of the _args array so the compiler would
> >> have to include them in the module argument lists with memory layout
> >> parameters (e.g. alignment, size) suitable for the hardware, so there's
> >> probably little benefit compared to your original approach that includes
> >> the number of dimensions as an additional launch_grid() parameter.
> >> 
> >> So we don't have to change it again, can you add another array parameter
> >> for the base grid offset?  That's another thing we don't pass through
> >> the pipe driver API currently and CL requires. The prototype of
> >> launch_grid could look like:
> >> 
> >> | void (*launch_grid)(struct pipe_context *context, uint dims,
> >> |                     const uint *block_layout, const uint *grid_layout,
> >> |                     const uint *grid_offset, uint32_t pc,
> >> |                     const void *input);
> >> 
> >> And don't forget to update the docs. :)
> >
> > Hi,
> >
> > I wanted to explore the original idea of appending implicit args, since
> > launch_grid is driver specific and would need to reimplement the same
> > functionality in every driver.
> >
> > I came up with a solution (see the attached patch). I don't like that
> > the implicit arg needs to be set in api function.
> 
> Right, if we do it this way it would probably be a better fit for the
> clover::kernel code, the CL front-end doesn't really need to be aware of
> implicit args.
> 
> > I also don't like that this way there is no difference between
> > explicit and implicit kernel arguments. On the other hand it's simple,
> > and does not need additional per driver code.
> >
> Yeah...  We definitely want to hide these from the user, as e.g. the
> CL_KERNEL_NUM_ARGS param is required by the spec to return the number of
> arguments provided by the user, and we don't want the user to set
> implicit args, so it gets a bit messy.  I think I like better your
> original idea of passing them as launch_grid() arguments, even though
> the grid offset and dimension parameters are somewhat artificial from a
> the hardware's point of view.

sorry to bug you some more with this. I tried one more thing before
going back to the launch_grid parameters. this time it implements a
parallel infrastructure for implicit arguments by creating artificial
module arguments for uint and size_t (I don't think we need more for
implicit arguments).

I only added the work dimension argument but adding more should be easy.
If you think that the launch_grid way is better, I'll stop experimenting
as I ran out of ideas I wanted to try.

thanks,
jan

> 
> > thanks,
> > jan
> >
> >> 
> >> Thank you.
> >> 
> >> > thanks,
> >> > jan
> >> >
> >> >
> >> >> 
> >> >> > Passes get-work-dim piglit on turks without any regression,
> >> >> > I have not tested SI as I don't have the hw.
> >> >> >
> >> >> > jan
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > Jan Vesely (3):
> >> >> >   gallium: Pass input data size to launch_grid
> >> >> >   clover: Add work dimension implicit param to input
> >> >> >   r600,radeonsi: Copy implicit args provided by clover
> >> >> >
> >> >> >  src/gallium/drivers/ilo/ilo_gpgpu.c               |   2 +-
> >> >> >  src/gallium/drivers/nouveau/nvc0/nvc0_compute.c   |   2 +-
> >> >> >  src/gallium/drivers/nouveau/nvc0/nvc0_context.h   |   4 +-
> >> >> >  src/gallium/drivers/nouveau/nvc0/nve4_compute.c   |   2 +-
> >> >> >  src/gallium/drivers/r600/evergreen_compute.c      |  14 +-
> >> >> >  src/gallium/drivers/r600/evergreen_compute.h      |   1 -
> >> >> >  src/gallium/drivers/radeonsi/si_compute.c         |   6 +-
> >> >> >  src/gallium/include/pipe/p_context.h              |   2 +-
> >> >> >  src/gallium/state_trackers/clover/core/kernel.cpp | 162 ++++++++++++----------
> >> >> >  src/gallium/tests/trivial/compute.c               |  40 +++---
> >> >> >  10 files changed, 122 insertions(+), 113 deletions(-)
> >> >> >
> >> >> > -- 
> >> >> > 1.9.3
> >> >
> >> > -- 
> >> > Jan Vesely <jan.vesely at rutgers.edu>
> >
> > -- 
> > Jan Vesely <jan.vesely at rutgers.edu>
> > From 7ad338ebd3a67b19d4ba492fb5a4cbda418fcdad Mon Sep 17 00:00:00 2001
> > From: Jan Vesely <jan.vesely at rutgers.edu>
> > Date: Mon, 1 Sep 2014 19:18:12 -0400
> > Subject: [PATCH RFC 1/1] clover: Append implicit work dim arg
> >
> > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > ---
> >  src/gallium/state_trackers/clover/api/kernel.cpp     |  6 ++++++
> >  .../state_trackers/clover/llvm/invocation.cpp        | 20 ++++++++++++++------
> >  2 files changed, 20 insertions(+), 6 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp b/src/gallium/state_trackers/clover/api/kernel.cpp
> > index 05cc392..a3b9735 100644
> > --- a/src/gallium/state_trackers/clover/api/kernel.cpp
> > +++ b/src/gallium/state_trackers/clover/api/kernel.cpp
> > @@ -276,6 +276,9 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel d_kern,
> >     auto block_size = validate_block_size(q, kern, dims,
> >                                           d_grid_size, d_block_size);
> >  
> > +   cl_uint work_dim_val = block_size.size();
> > +   kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
> > +
> >     validate_common(q, kern, deps);
> >  
> >     auto hev = create<hard_event>(
> > @@ -299,6 +302,9 @@ clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern,
> >     auto &kern = obj(d_kern);
> >     auto deps = objs<wait_list_tag>(d_deps, num_deps);
> >  
> > +   cl_uint work_dim_val = 1;
> > +   kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
> > +
> >     validate_common(q, kern, deps);
> >  
> >     auto hev = create<hard_event>(
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index 7bca0d6..a934384 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -315,17 +315,17 @@ namespace {
> >           kernel_func = kernels[i];
> >           kernel_name = kernel_func->getName();
> >  
> > -         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> > -                                      E = kernel_func->arg_end(); I != E; ++I) {
> > -            llvm::Argument &arg = *I;
> >  #if HAVE_LLVM < 0x0302
> > -            llvm::TargetData TD(kernel_func->getParent());
> > +         llvm::TargetData TD(kernel_func->getParent());
> >  #elif HAVE_LLVM < 0x0305
> > -            llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
> > +         llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
> >  #else
> > -            llvm::DataLayout TD(mod);
> > +         llvm::DataLayout TD(mod);
> >  #endif
> >  
> > +         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> > +                                      E = kernel_func->arg_end(); I != E; ++I) {
> > +            llvm::Argument &arg = *I;
> >              llvm::Type *arg_type = arg.getType();
> >              const unsigned arg_store_size = TD.getTypeStoreSize(arg_type);
> >  
> > @@ -384,6 +384,14 @@ namespace {
> >              }
> >           }
> >  
> > +         // Implicit arguments
> > +         // Work dimensions (cl_uint), uint is 32 bit
> > +         llvm::Type *target_type = llvm::Type::getInt32Ty(mod->getContext());
> > +         args.push_back(module::argument(module::argument::scalar,
> > +                        sizeof(cl_uint), TD.getTypeStoreSize(target_type),
> > +                        TD.getABITypeAlignment(target_type),
> > +                        module::argument::zero_ext));
> > +
> >           m.syms.push_back(module::symbol(kernel_name, 0, i, args ));
> >        }
> >  
> > -- 
> > 1.9.3

-- 
Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-clover-save-module-argument-types.patch
Type: text/x-patch
Size: 7904 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140902/463b6c51/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-clover-Add-implicit-arguments-to-kernel.patch
Type: text/x-patch
Size: 3735 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140902/463b6c51/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140902/463b6c51/attachment-0001.sig>


More information about the mesa-dev mailing list