[Mesa-dev] [PATCH 10/11] clover: Add function for building a clover::module for non-TGSI targets v2

Francisco Jerez currojerez at riseup.net
Wed May 23 05:04:20 PDT 2012


Tom Stellard <tstellar at gmail.com> writes:

> v2:
>   -Separate IR type and LLVM triple
>   -Do the OpenCL C->LLVM IR and linking steps for all PIPE_SHADER_IR
>    types.
> ---
>
> There really isn't much functional change in this patch from v1, the only
> really difference is that the decision to emit LLVM or native machine code
> is made after the linking with libclc and the decision is now based on the
> preferred IR rather than the target.
>
My main concern was that you were using the "preferred IR" cap for two
different purposes at the same time: selecting the IR format itself and
selecting a specific set of parameters that describe the processor it's
going to be running on.

Apparently this is still the case, I thought we had agreed to limit the
"preferred IR" cap to: TGSI, LLVM and NATIVE.  In addition to the
preferred IR you'll need a separate compute cap to determine the
specific target to build for -- I guess I'd use a string cap, its
meaning is going to be LLVM-specific anyway, and it's just a temporary
thing until the clang API is fixed.

> IMHO returning targeted LLVM IR is the best approach for drivers that
> would rather use LLVM IR than TGSI.  It gives the drivers control over
> how they invoke LLVM and provides the lowest barrier of entry to drivers
> who want to use LLVM with clover.  This also is closer to the way LLVM
> is handled by drivers for other state trackers. r600g, radeonsi, and
> llvmpipe already accept LLVM IR from st/mesa (via gallivm) and handle
> compiling it to machine code in the driver.
>
> Other solutions that were discussed were:
>
> 1. Passing target independent LLVM IR to the drivers
>
>   - This isn't currently possible with the Clang API.
>
> 2. Passing native machine code directly to the drivers and requiring LLVM
>    backends to support emitting code in a "Clover Object format"
>
>   - I think the biggest drawback to this idea is that in order for llvmpipe to
>     work this way, it would require modifying the X86 llvm backend (and
>     possibly other CPU backends) so that they can emit the "Clover Object format"
>     Besides technical challenges it is unclear whether a change like this would be
>     accepted by the respective backend maintainers.

Well...  My point there wasn't to have all hardware back-ends use the
"clover object format", but rather to have all of them agree on ANY
object format.  The "clover object format" is just the simplest thing I
could come up with, but there's no reason not to use something more
standard.

And yes, passing machine code directly through the API is something that
some compute APIs support and I think we'll want to do at some point
(this includes r600), so the user can avoid compiling a program
repeatedly by building it off-line.

As for llvmpipe, my plan was to fix it to support the new
compute-related TGSI opcodes.  Otherwise it's going to be limited to the
compute state trackers built on top of clang/LLVM (which is unlikely to
be the case of e.g. DirectCompute).

Some more comments in-line.

>
>  .../state_trackers/clover/core/compiler.hpp        |    2 +
>  src/gallium/state_trackers/clover/core/program.cpp |   13 ++-
>  .../state_trackers/clover/llvm/invocation.cpp      |  171 ++++++++++++++++++--
>  3 files changed, 171 insertions(+), 15 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp b/src/gallium/state_trackers/clover/core/compiler.hpp
> index a3998d5..0ef69d1 100644
> --- a/src/gallium/state_trackers/clover/core/compiler.hpp
> +++ b/src/gallium/state_trackers/clover/core/compiler.hpp
> @@ -25,6 +25,7 @@
>  
>  #include "core/compat.hpp"
>  #include "core/module.hpp"
> +#include "pipe/p_defines.h"
>  
>  namespace clover {
>     class build_error {
> @@ -44,6 +45,7 @@ namespace clover {
>     };
>  
>     module compile_program_llvm(const compat::string &source,
> +                               enum pipe_shader_ir ir,
>                                 const compat::string &target);
>  
>     module compile_program_tgsi(const compat::string &source,
> diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp
> index 5ac9f93..d53cc94 100644
> --- a/src/gallium/state_trackers/clover/core/program.cpp
> +++ b/src/gallium/state_trackers/clover/core/program.cpp
> @@ -47,9 +47,16 @@ _cl_program::build(const std::vector<clover::device *> &devs) {
>  
>     for (auto dev : devs) {
>        try {
> -         auto module = (dev->ir_target() == "tgsi" ?
> -                        compile_program_tgsi(__source, dev->ir_target()) :
> -                        compile_program_llvm(__source, dev->ir_target()));
> +         // XXX: We need to check the input source to determine which
> +         //      compile_program() call to use.  If the input is TGSI we
> +         //      should use compile_program_tgsi, otherwise we should use
> +         //      compile_program_llvm
> +         auto module = (dev->ir_target() == PIPE_SHADER_IR_TGSI ?
> +                        compile_program_tgsi(__source, "") : // XXX Not sure
> +			                                     // what value to
> +							     // use for target.
> +                        compile_program_llvm(__source, dev->ir_target(),
> +			                     dev->llvm_triple()));

I think the current name of "dev->ir_target()" isn't going to make much
sense anymore after you split it in two.  This should be something like:

|          auto module = (dev->ir_format() == PIPE_SHADER_IR_TGSI ?
|                         compile_program_tgsi(__source, dev->ir_target()) :
|                         compile_program_llvm(__source, dev->ir_format(),
|                                              dev->ir_target()));

>           __binaries.insert({ dev, module });
>  
>        } catch (build_error &e) {
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 89e21bf..a1de6e1 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -22,24 +22,33 @@
>  
>  #include "core/compiler.hpp"
>  
> -#if 0
>  #include <clang/Frontend/CompilerInstance.h>
>  #include <clang/Frontend/TextDiagnosticPrinter.h>
>  #include <clang/CodeGen/CodeGenAction.h>
> +#include <llvm/Bitcode/BitstreamWriter.h>
> +#include <llvm/Bitcode/ReaderWriter.h>
> +#include <llvm/DerivedTypes.h>
> +#include <llvm/Linker.h>
>  #include <llvm/LLVMContext.h>
> +#include <llvm/Module.h>
> +#include <llvm/PassManager.h>
>  #include <llvm/Support/TargetSelect.h>
>  #include <llvm/Support/MemoryBuffer.h>
> +#include <llvm/Support/PathV1.h>
> +#include <llvm/Target/TargetData.h>
> +#include <llvm/Transforms/IPO/PassManagerBuilder.h>
> +
> +#include "util/u_memory.h"
>  
>  #include <iostream>
>  #include <iomanip>
>  #include <fstream>
>  #include <cstdio>
> -#endif
>  
>  using namespace clover;
>  
> -#if 0
>  namespace {
> +#if 0
>     void
>     build_binary(const std::string &source, const std::string &target,
>                  const std::string &name) {
> @@ -78,17 +87,155 @@ namespace {
>        compat::istream cs(str);
>        return module::deserialize(cs);
>     }
> -}
> +#endif
> +module
> +build_module_llvm(const std::string &source, const std::string &name,
> +                  enum pipe_shader_ir ir, const std::string &triple) {

This function is huge and seems to take care of three completely
unrelated tasks (compiling, linking, and building the symbol table),
have you considered splitting it into several parts?

> +
> +      /* Compile the kernel */

Please use C++ comment syntax in C++ code.

> +      clang::CompilerInstance c;
> +      module m;
> +      clang::EmitLLVMOnlyAction act(&llvm::getGlobalContext());
> +      std::string log;
> +      llvm::raw_string_ostream s_log(log);
> +
> +#if HAVE_LLVM <= 0x0300
> +      c.getFrontendOpts().Inputs.push_back(
> +         std::make_pair(clang::IK_OpenCL, name));
> +#else
> +      c.getFrontendOpts().Inputs.push_back(
> +         clang::FrontendInputFile(name, clang::IK_OpenCL));
> +#endif

Do we need this compatibility code?  Don't you have to build a patched
version of LLVM from sources for this to work anyway?

> +      c.getFrontendOpts().ProgramAction = clang::frontend::EmitLLVMOnly;
> +      c.getHeaderSearchOpts().UseBuiltinIncludes = true;
> +#if HAVE_LLVM < 0x0300
> +      c.getHeaderSearchOpts().UseStandardIncludes = true;
> +#else
> +      c.getHeaderSearchOpts().UseStandardSystemIncludes = true;
> +#endif
> +      c.getHeaderSearchOpts().ResourceDir = CLANG_RESOURCE_DIR;
> +
> +      /* Add libclc generic search path */
> +      c.getHeaderSearchOpts().AddPath(LIBCLC_PATH "/generic/include/",
> +                                      clang::frontend::Angled,
> +                                      false, false, false);
> +
> +      /* Add libclc include */
> +      c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
> +      /* clc.h requires that this macro be defined: */
> +      c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");
> +
> +      c.getLangOpts().NoBuiltin = true;
> +      c.getTargetOpts().Triple = triple;
> +      c.getInvocation().setLangDefaults(clang::IK_OpenCL);
> +      c.createDiagnostics(0, NULL, new clang::TextDiagnosticPrinter(
> +                             s_log, c.getDiagnosticOpts()));
> +
> +      c.getPreprocessorOpts().addRemappedFile(
> +         "cl_input", llvm::MemoryBuffer::getMemBuffer(source));
> +
> +      /* Compile the code */
> +      if (!c.ExecuteAction(act))
> +         throw build_error(log);
> +
> +      /* Link the kernel with libclc */
> +      llvm::PassManager PM;
> +      llvm::PassManagerBuilder Builder;
> +      bool isNative;
> +      llvm::Module * mod = act.takeModule();

This pointer/reference declaration syntax is inconsistent with the rest
of the code.

> +      llvm::Linker linker("clover", mod);
> +
> +      linker.LinkInFile(llvm::sys::Path(LIBCLC_PATH + triple + "/lib/builtins.bc"), isNative);
> +      mod = linker.releaseModule();
> +
> +      /* Run link time optimizations */
> +      Builder.populateLTOPassManager(PM, false, true);
> +      Builder.OptLevel = 2;
> +      PM.run(*mod);
> +
> +      /* Build the clover::module */
> +      switch (ir) {
> +         case PIPE_SHADER_IR_TGSI:
> +            /*XXX: Handle TGSI */
> +            assert(0);
> +         default: break;
> +      }
> +
> +      unsigned char * prog;
> +      uint32_t prog_sz;
> +
> +#if HAVE_LLVM > 0x0300
> +      llvm::SmallVector<char, 1024> llvm_bitcode;
> +      llvm::raw_svector_ostream bitcode_ostream(llvm_bitcode);
> +#else
> +      std::vector<unsigned char> llvm_bitcode;
> +#endif
> +      llvm::BitstreamWriter writer(llvm_bitcode);
> +
> +#if HAVE_LLVM <= 0x0300
> +      llvm::WriteBitcodeToStream(mod, writer);
> +#else
> +      llvm::WriteBitcodeToFile(mod, bitcode_ostream);
> +      bitcode_ostream.flush();
>  #endif
>  
> +      prog_sz = llvm_bitcode.size() * sizeof(unsigned char);
> +
> +      /* We need to add 4 to the program size, because we will
> +       * be preprending the length of the program to the bitcode string. */
> +      prog = (unsigned char *)MALLOC(prog_sz + 4);
> +      ((uint32_t *)prog)[0] = prog_sz;
> +      memcpy(prog + 4, &llvm_bitcode[0], prog_sz);
> +
It would be nice to have some sort of struct for this.

> +      std::string kernel_name;
> +      compat::vector<module::argument> args;
> +      const llvm::NamedMDNode * kernel_node =
> +                                    mod->getNamedMetadata("opencl.kernels");
> +      /* XXX: Support more than one kernel */
> +      /* XXX: Error if there are no kernels */

Nope, there's no need to error here if there are no kernels defined.

> +      assert(kernel_node->getNumOperands() == 1);
> +
> +      llvm::Function * kernel_func = llvm::dyn_cast<llvm::Function>(
> +                                      kernel_node->getOperand(0)->getOperand(0));
> +      kernel_name = kernel_func->getName();
> +
> +      for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> +                                      E = kernel_func->arg_end(); I != E; ++I) {
> +         llvm::Argument & arg = *I;
> +         llvm::Type * arg_type = arg.getType();
> +         llvm::TargetData TD(kernel_func->getParent());
> +         unsigned arg_size = TD.getTypeStoreSize(arg_type);
> +
> +         if (llvm::isa<llvm::PointerType>(arg_type) and arg.hasByValAttr()) {

"and"?  Please use "&&".

> +            arg_type =
> +               llvm::dyn_cast<llvm::PointerType>(arg_type)->getElementType();
> +         }
> +
> +         if (arg_type->isPointerTy()) {
> +            /* XXX: Figure out LLVM->OpenCL address space mappings for each
> +             * target.  I think we need to ask clang what these are.  For now,
> +             * pretend everything is in the global address space. */

Aren't these supposed to be defined arbitrarily by the standard library
using "address_space" attributes?

> +            unsigned address_space = llvm::cast<llvm::PointerType>(arg_type)->getAddressSpace();
> +            switch (address_space) {
> +               default:
> +                  args.push_back(module::argument(module::argument::global, arg_size));
> +                  break;
> +            }
> +         } else {
> +            args.push_back(module::argument(module::argument::scalar, arg_size));
> +         }
> +      }
> +      m.syms.push_back(module::symbol(kernel_name, 0, 0, args ));
> +      m.secs.push_back(module::section(0, module::section::text, prog_sz + 4,
> +                       compat::vector<char>((char *)prog, prog_sz + 4)));
> +      return m;
> +   }
> +}
> +
>  module
>  clover::compile_program_llvm(const compat::string &source,
> -                             const compat::string &target) {
> -#if 0
> -   build_binary(source, target, "cl_input");
> -   module m = load_binary("cl_input.o");
> -   std::remove("cl_input.o");
> -   return m;
> -#endif
> -   return module();
> +                             enum pipe_shader_ir ir,
> +                             const compat::string &triple) {
> +
> +   return build_module_llvm(source, "cl_input", ir, triple);
>  }
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 229 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20120523/d76579e7/attachment-0001.pgp>


More information about the mesa-dev mailing list