<div dir="ltr"><span style="font-size:12.8000001907349px">> Whether you need to pass the image</span><br style="font-size:12.8000001907349px"><span style="font-size:12.8000001907349px">> dimensions and data type to the kernel as explicit parameters or not</span><br style="font-size:12.8000001907349px"><span style="font-size:12.8000001907349px">> (e.g. because your hardware already supports some sort of image metadata</span><br style="font-size:12.8000001907349px"><span style="font-size:12.8000001907349px">> query opcode) is driver-specific, so it would probably be a better idea</span><br style="font-size:12.8000001907349px"><span style="font-size:12.8000001907349px">> to append these parameters at the end of the input buffer in the r600</span><br style="font-size:12.8000001907349px"><span style="font-size:12.8000001907349px">> pipe driver itself.</span><br><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">The image data type and channel order get converted to a pipe format value in clover before it reaches the driver, so either the driver needs to convert it back to OpenCL constants, or clover has to send the data explicitly to the pipe e.g. by appending it to the kernel parameters. What solution do you think is less problematic? Are there better mechanisms to send the image data type and channel order in OpenCL format to the pipe?</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Furthermore IIRC the pipe has no explicit information on the number of image arguments, so it would have to gather that information to determine the amount of input memory needed by the kernel. Clover has that information easily accessible.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">How bad would it be if image attributes were handled similarly to grid dimension and grid offset? The get_image_* builtins could be lowered to implicit parameter reads, so there is no need to perform the llvm pass.</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 9, 2015 at 11:09 AM, Zoltán Gilián <span dir="ltr"><<a href="mailto:zoltan.gilian@gmail.com" target="_blank">zoltan.gilian@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Ok, thanks for the feedback.</div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jun 8, 2015 at 2:18 PM, Francisco Jerez <span dir="ltr"><<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>Zoltan Gilian <<a href="mailto:zoltan.gilian@gmail.com" target="_blank">zoltan.gilian@gmail.com</a>> writes:<br>
<br>
> Image attributes are passed to the kernel as hidden parameters after the<br>
> image attribute itself. An llvm pass replaces the getter builtins to<br>
> the appropriate parameters.<br>
> ---<br>
>  src/gallium/state_trackers/clover/core/kernel.cpp  |  13 ++<br>
>  src/gallium/state_trackers/clover/core/memory.cpp  |   2 +-<br>
>  .../state_trackers/clover/llvm/invocation.cpp      | 158 ++++++++++++++++++++-<br>
>  3 files changed, 170 insertions(+), 3 deletions(-)<br>
><br>
> diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp<br>
> index 0756f06..4703899 100644<br>
> --- a/src/gallium/state_trackers/clover/core/kernel.cpp<br>
> +++ b/src/gallium/state_trackers/clover/core/kernel.cpp<br>
> @@ -483,6 +483,19 @@ kernel::image_rd_argument::bind(exec_context &ctx,<br>
>     align(ctx.input, marg.target_align);<br>
>     insert(ctx.input, v);<br>
><br>
> +   cl_image_format fmt = img->format();<br>
> +   cl_uint image_attribs[] = {img->width(), img->height(), img->depth(),<br>
> +                              fmt.image_channel_data_type,<br>
> +                              fmt.image_channel_order};<br>
> +   for (int i = 0; i < 5; i++) {<br>
> +      auto v = bytes(image_attribs[i]);<br>
> +<br>
> +      extend(v, module::argument::zero_ext, marg.target_size);<br>
> +      byteswap(v, ctx.q->device().endianness());<br>
> +      align(ctx.input, marg.target_align);<br>
> +      insert(ctx.input, v);<br>
> +   }<br>
> +<br>
</div></div>This seems to be implementing driver-specific policy in a<br>
hardware-independent state tracker.  Whether you need to pass the image<br>
dimensions and data type to the kernel as explicit parameters or not<br>
(e.g. because your hardware already supports some sort of image metadata<br>
query opcode) is driver-specific, so it would probably be a better idea<br>
to append these parameters at the end of the input buffer in the r600<br>
pipe driver itself.<br>
<div><div><br>
>     st = img->resource(*ctx.q).bind_sampler_view(*ctx.q);<br>
>     ctx.sviews.push_back(st);<br>
>  }<br>
> diff --git a/src/gallium/state_trackers/clover/core/memory.cpp b/src/gallium/state_trackers/clover/core/memory.cpp<br>
> index 055336a..b852e68 100644<br>
> --- a/src/gallium/state_trackers/clover/core/memory.cpp<br>
> +++ b/src/gallium/state_trackers/clover/core/memory.cpp<br>
> @@ -189,7 +189,7 @@ image2d::image2d(clover::context &ctx, cl_mem_flags flags,<br>
>                   const cl_image_format *format, size_t width,<br>
>                   size_t height, size_t row_pitch,<br>
>                   void *host_ptr) :<br>
> -   image(ctx, flags, format, width, height, 0,<br>
> +   image(ctx, flags, format, width, height, 1,<br>
>           row_pitch, 0, height * row_pitch, host_ptr) {<br>
>  }<br>
><br>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp<br>
> index 9b91fee..5d5e619 100644<br>
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp<br>
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp<br>
> @@ -33,6 +33,8 @@<br>
>  #include <llvm/IR/DiagnosticInfo.h><br>
>  #include <llvm/IR/DiagnosticPrinter.h><br>
>  #include <llvm/IR/DerivedTypes.h><br>
> +#include <llvm/IR/InstIterator.h><br>
> +#include <llvm/IR/Instructions.h><br>
>  #include <llvm/IR/LLVMContext.h><br>
>  #include <llvm/IR/Module.h><br>
>  #include <llvm/Support/SourceMgr.h><br>
> @@ -80,6 +82,130 @@<br>
>  using namespace clover;<br>
><br>
>  namespace {<br>
> +<br>
> +   /* LLVM pass to resolve get_image_* OpenCL builtins.<br>
> +    * The image attributes (e.g. width or channel order) are passed as hidden<br>
> +    * arguments to the kernel.<br>
> +    * This pass replaces specific function calls with the appropriate hidden<br>
> +    * arguments. The libclc library needs to implements the get_image_*<br>
> +    * builtins as these specific functions to avoid dealing with name mangling<br>
> +    * here.<br>
> +    */<br>
> +   struct OpenCLImageBuiltinPass : public llvm::FunctionPass {<br>
> +      static char ID;<br>
> +<br>
> +      OpenCLImageBuiltinPass(): llvm::FunctionPass(ID) {}<br>
> +      bool runOnFunction(llvm::Function &F) override;<br>
> +<br>
> +      struct ImageAttribArgs {<br>
> +         ImageAttribArgs(): image_arg(0),<br>
> +                            width_arg(0),<br>
> +                            heigth_arg(0),<br>
> +                            depth_arg(0),<br>
> +                            channel_data_type_arg(0),<br>
> +                            channel_order_arg(0) {}<br>
> +         llvm::Argument* image_arg;<br>
> +         llvm::Argument* width_arg;<br>
> +         llvm::Argument* heigth_arg;<br>
> +         llvm::Argument* depth_arg;<br>
> +         llvm::Argument* channel_data_type_arg;<br>
> +         llvm::Argument* channel_order_arg;<br>
> +      };<br>
> +   };<br>
> +<br>
> +   char OpenCLImageBuiltinPass::ID = 0;<br>
> +<br>
> +   bool<br>
> +   OpenCLImageBuiltinPass::runOnFunction(llvm::Function& F)<br>
> +   {<br>
> +      llvm::Module* mod = F.getParent();<br>
> +      llvm::DataLayout TD(mod);<br>
> +      llvm::Type* cl_int_type =<br>
> +         TD.getSmallestLegalIntType(mod->getContext(), sizeof(cl_int));<br>
> +<br>
> +      std::vector<ImageAttribArgs> img_args;<br>
> +      for (auto arg = F.arg_begin(), E = F.arg_end(); arg != E; ++arg) {<br>
> +<br>
> +         llvm::Type *arg_type = arg->getType();<br>
> +         if (!arg_type->isPointerTy()) continue;<br>
> +<br>
> +         llvm::Type *elem_type = arg_type->getPointerElementType();<br>
> +         if (!elem_type->isStructTy()) continue;<br>
> +<br>
> +         const llvm::StringRef &type_name = elem_type->getStructName();<br>
> +         if (!type_name.startswith("opencl.image2d_t")) continue;<br>
> +<br>
> +         auto name_suffix = llvm::Twine(img_args.size());<br>
> +         ImageAttribArgs attrib_args;<br>
> +         attrib_args.image_arg = arg;<br>
> +         attrib_args.width_arg = new llvm::Argument(<br>
> +            cl_int_type, "image_width" + name_suffix);<br>
> +         attrib_args.heigth_arg = new llvm::Argument(<br>
> +            cl_int_type, "image_height" + name_suffix);<br>
> +         attrib_args.depth_arg = new llvm::Argument(<br>
> +            cl_int_type, "image_depth" + name_suffix);<br>
> +         attrib_args.channel_data_type_arg = new llvm::Argument(<br>
> +            cl_int_type, "image_channel_data_type" + name_suffix);<br>
> +         attrib_args.channel_order_arg = new llvm::Argument(<br>
> +            cl_int_type, "image_channel_order" + name_suffix);<br>
> +<br>
> +         auto& args = F.getArgumentList();<br>
> +         args.insertAfter(arg, attrib_args.channel_order_arg);<br>
> +         args.insertAfter(arg, attrib_args.channel_data_type_arg);<br>
> +         args.insertAfter(arg, attrib_args.depth_arg);<br>
> +         args.insertAfter(arg, attrib_args.heigth_arg);<br>
> +         args.insertAfter(arg, attrib_args.width_arg);<br>
> +<br>
> +         img_args.push_back(attrib_args);<br>
> +      }<br>
> +      std::vector<llvm::Instruction*> insts_to_erase;<br>
> +      for (auto I = llvm::inst_begin(F), E = llvm::inst_end(F); I != E; ++I) {<br>
> +         llvm::CallInst* callInst = llvm::dyn_cast<llvm::CallInst>(&*I);<br>
> +         if (!callInst) continue;<br>
> +<br>
> +         for (auto& op: callInst->arg_operands()) {<br>
> +            auto op_arg = op.get();<br>
> +            auto image_arg = std::find_if(<br>
> +               img_args.begin(), img_args.end(),<br>
> +               [op_arg](const ImageAttribArgs& x) {<br>
> +                  return x.image_arg == op_arg;<br>
> +               });<br>
> +            if (image_arg == img_args.end()) continue;<br>
> +<br>
> +            llvm::Function* callee = callInst->getCalledFunction();<br>
> +            if (!callee) continue;<br>
> +<br>
> +            auto callee_name = callee->getName();<br>
> +            if (callee_name.startswith(<br>
> +                  "llvm.opencl.image.get.width")) {<br>
> +               callInst->replaceAllUsesWith(image_arg->width_arg);<br>
> +               insts_to_erase.push_back(callInst);<br>
> +            } else if (callee_name.startswith(<br>
> +                  "llvm.opencl.image.get.height")) {<br>
> +               callInst->replaceAllUsesWith(image_arg->heigth_arg);<br>
> +               insts_to_erase.push_back(callInst);<br>
> +            } else if (callee_name.startswith(<br>
> +                  "llvm.opencl.image.get.depth")) {<br>
> +               callInst->replaceAllUsesWith(image_arg->depth_arg);<br>
> +               insts_to_erase.push_back(callInst);<br>
> +            } else if (callee_name.startswith(<br>
> +                  "llvm.opencl.image.get.channel_data_type")) {<br>
> +               callInst->replaceAllUsesWith(image_arg->channel_data_type_arg);<br>
> +               insts_to_erase.push_back(callInst);<br>
> +            } else if (callee_name.startswith(<br>
> +                  "llvm.opencl.image.get.channel_order")) {<br>
> +               callInst->replaceAllUsesWith(image_arg->channel_order_arg);<br>
> +               insts_to_erase.push_back(callInst);<br>
> +            }<br>
> +         }<br>
> +      }<br>
> +      for (auto inst: insts_to_erase) {<br>
> +         inst->eraseFromParent();<br>
> +      }<br>
> +<br>
> +      return (img_args.size() > 0);<br>
> +   }<br>
> +<br>
>  #if 0<br>
>     void<br>
>     build_binary(const std::string &source, const std::string &target,<br>
> @@ -263,10 +389,21 @@ namespace {<br>
>                                                          sizeof(address_spaces));<br>
><br>
>  #if HAVE_LLVM >= 0x0306<br>
> -      return act.takeModule().release();<br>
> +      llvm::Module * module = act.takeModule().release();<br>
> +#else<br>
> +      llvm::Module * module = act.takeModule();<br>
> +#endif<br>
> +<br>
> +      // Resolve get_image_* builtins.<br>
> +#if HAVE_LLVM >= 0x0307<br>
> +      llvm::legacy::PassManager PM;<br>
>  #else<br>
> -      return act.takeModule();<br>
> +      llvm::PassManager PM;<br>
>  #endif<br>
> +      PM.add(new OpenCLImageBuiltinPass());<br>
> +      PM.run(*module);<br>
> +<br>
> +      return module;<br>
>     }<br>
><br>
>     void<br>
> @@ -375,6 +512,23 @@ namespace {<br>
>           }<br>
><br>
>           if (arg_type->isPointerTy()) {<br>
> +            // XXX: Figure out LLVM->OpenCL address space mappings for each<br>
> +            // target.  I think we need to ask clang what these are.  For now,<br>
> +            // pretend everything is in the global address space.<br>
> +            llvm::Type *elem_type = arg_type->getPointerElementType();<br>
> +            if (elem_type->isStructTy()) {<br>
> +               const llvm::StringRef &type_name = elem_type->getStructName();<br>
> +<br>
> +               if (type_name.startswith("opencl.image2d_t")) {<br>
> +                  args.push_back(module::argument(module::argument::image2d_rd,<br>
> +                                 arg_store_size, target_size, target_align,<br>
> +                                 module::argument::zero_ext));<br>
> +                  // Skip hidden image attribute arguments.<br>
> +                  // XXX: Do this based on some kind of metadata.<br>
> +                  std::advance(I, 5);<br>
> +                  continue;<br>
> +               }<br>
> +            }<br>
>              unsigned address_space = llvm::cast<llvm::PointerType>(arg_type)->getAddressSpace();<br>
>              if (address_space == address_spaces[clang::LangAS::opencl_local<br>
>                                                       - clang::LangAS::Offset]) {<br>
> --<br>
> 2.4.2<br>
><br>
</div></div>> _______________________________________________<br>
> mesa-dev mailing list<br>
> <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
> <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev" target="_blank">http://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>