[Mesa-dev] [PATCH 10/11] clover: Add function for building a clover::module for non-TGSI targets v2
Tom Stellard
thomas.stellard at amd.com
Wed May 23 06:55:49 PDT 2012
On Wed, May 23, 2012 at 02:04:20PM +0200, Francisco Jerez wrote:
> Tom Stellard <tstellar at gmail.com> writes:
>
> > v2:
> > -Separate IR type and LLVM triple
> > -Do the OpenCL C->LLVM IR and linking steps for all PIPE_SHADER_IR
> > types.
> > ---
> >
> > There really isn't much functional change in this patch from v1, the only
> > really difference is that the decision to emit LLVM or native machine code
> > is made after the linking with libclc and the decision is now based on the
> > preferred IR rather than the target.
> >
> My main concern was that you were using the "preferred IR" cap for two
> different purposes at the same time: selecting the IR format itself and
> selecting a specific set of parameters that describe the processor it's
> going to be running on.
>
> Apparently this is still the case, I thought we had agreed to limit the
> "preferred IR" cap to: TGSI, LLVM and NATIVE. In addition to the
> preferred IR you'll need a separate compute cap to determine the
> specific target to build for -- I guess I'd use a string cap, its
> meaning is going to be LLVM-specific anyway, and it's just a temporary
> thing until the clang API is fixed.
>
I'll split up these two CAPs.
> > IMHO returning targeted LLVM IR is the best approach for drivers that
> > would rather use LLVM IR than TGSI. It gives the drivers control over
> > how they invoke LLVM and provides the lowest barrier of entry to drivers
> > who want to use LLVM with clover. This also is closer to the way LLVM
> > is handled by drivers for other state trackers. r600g, radeonsi, and
> > llvmpipe already accept LLVM IR from st/mesa (via gallivm) and handle
> > compiling it to machine code in the driver.
> >
> > Other solutions that were discussed were:
> >
> > 1. Passing target independent LLVM IR to the drivers
> >
> > - This isn't currently possible with the Clang API.
> >
> > 2. Passing native machine code directly to the drivers and requiring LLVM
> > backends to support emitting code in a "Clover Object format"
> >
> > - I think the biggest drawback to this idea is that in order for llvmpipe to
> > work this way, it would require modifying the X86 llvm backend (and
> > possibly other CPU backends) so that they can emit the "Clover Object format"
> > Besides technical challenges it is unclear whether a change like this would be
> > accepted by the respective backend maintainers.
>
> Well... My point there wasn't to have all hardware back-ends use the
> "clover object format", but rather to have all of them agree on ANY
> object format. The "clover object format" is just the simplest thing I
> could come up with, but there's no reason not to use something more
> standard.
>
> And yes, passing machine code directly through the API is something that
> some compute APIs support and I think we'll want to do at some point
> (this includes r600), so the user can avoid compiling a program
> repeatedly by building it off-line.
>
> As for llvmpipe, my plan was to fix it to support the new
> compute-related TGSI opcodes. Otherwise it's going to be limited to the
> compute state trackers built on top of clang/LLVM (which is unlikely to
> be the case of e.g. DirectCompute).
>
> Some more comments in-line.
>
> >
> > .../state_trackers/clover/core/compiler.hpp | 2 +
> > src/gallium/state_trackers/clover/core/program.cpp | 13 ++-
> > .../state_trackers/clover/llvm/invocation.cpp | 171 ++++++++++++++++++--
> > 3 files changed, 171 insertions(+), 15 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp b/src/gallium/state_trackers/clover/core/compiler.hpp
> > index a3998d5..0ef69d1 100644
> > --- a/src/gallium/state_trackers/clover/core/compiler.hpp
> > +++ b/src/gallium/state_trackers/clover/core/compiler.hpp
> > @@ -25,6 +25,7 @@
> >
> > #include "core/compat.hpp"
> > #include "core/module.hpp"
> > +#include "pipe/p_defines.h"
> >
> > namespace clover {
> > class build_error {
> > @@ -44,6 +45,7 @@ namespace clover {
> > };
> >
> > module compile_program_llvm(const compat::string &source,
> > + enum pipe_shader_ir ir,
> > const compat::string &target);
> >
> > module compile_program_tgsi(const compat::string &source,
> > diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp
> > index 5ac9f93..d53cc94 100644
> > --- a/src/gallium/state_trackers/clover/core/program.cpp
> > +++ b/src/gallium/state_trackers/clover/core/program.cpp
> > @@ -47,9 +47,16 @@ _cl_program::build(const std::vector<clover::device *> &devs) {
> >
> > for (auto dev : devs) {
> > try {
> > - auto module = (dev->ir_target() == "tgsi" ?
> > - compile_program_tgsi(__source, dev->ir_target()) :
> > - compile_program_llvm(__source, dev->ir_target()));
> > + // XXX: We need to check the input source to determine which
> > + // compile_program() call to use. If the input is TGSI we
> > + // should use compile_program_tgsi, otherwise we should use
> > + // compile_program_llvm
> > + auto module = (dev->ir_target() == PIPE_SHADER_IR_TGSI ?
> > + compile_program_tgsi(__source, "") : // XXX Not sure
> > + // what value to
> > + // use for target.
> > + compile_program_llvm(__source, dev->ir_target(),
> > + dev->llvm_triple()));
>
> I think the current name of "dev->ir_target()" isn't going to make much
> sense anymore after you split it in two. This should be something like:
>
> | auto module = (dev->ir_format() == PIPE_SHADER_IR_TGSI ?
> | compile_program_tgsi(__source, dev->ir_target()) :
> | compile_program_llvm(__source, dev->ir_format(),
> | dev->ir_target()));
>
> > __binaries.insert({ dev, module });
> >
> > } catch (build_error &e) {
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index 89e21bf..a1de6e1 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -22,24 +22,33 @@
> >
> > #include "core/compiler.hpp"
> >
> > -#if 0
> > #include <clang/Frontend/CompilerInstance.h>
> > #include <clang/Frontend/TextDiagnosticPrinter.h>
> > #include <clang/CodeGen/CodeGenAction.h>
> > +#include <llvm/Bitcode/BitstreamWriter.h>
> > +#include <llvm/Bitcode/ReaderWriter.h>
> > +#include <llvm/DerivedTypes.h>
> > +#include <llvm/Linker.h>
> > #include <llvm/LLVMContext.h>
> > +#include <llvm/Module.h>
> > +#include <llvm/PassManager.h>
> > #include <llvm/Support/TargetSelect.h>
> > #include <llvm/Support/MemoryBuffer.h>
> > +#include <llvm/Support/PathV1.h>
> > +#include <llvm/Target/TargetData.h>
> > +#include <llvm/Transforms/IPO/PassManagerBuilder.h>
> > +
> > +#include "util/u_memory.h"
> >
> > #include <iostream>
> > #include <iomanip>
> > #include <fstream>
> > #include <cstdio>
> > -#endif
> >
> > using namespace clover;
> >
> > -#if 0
> > namespace {
> > +#if 0
> > void
> > build_binary(const std::string &source, const std::string &target,
> > const std::string &name) {
> > @@ -78,17 +87,155 @@ namespace {
> > compat::istream cs(str);
> > return module::deserialize(cs);
> > }
> > -}
> > +#endif
> > +module
> > +build_module_llvm(const std::string &source, const std::string &name,
> > + enum pipe_shader_ir ir, const std::string &triple) {
>
> This function is huge and seems to take care of three completely
> unrelated tasks (compiling, linking, and building the symbol table),
> have you considered splitting it into several parts?
>
> > +
> > + /* Compile the kernel */
>
> Please use C++ comment syntax in C++ code.
>
> > + clang::CompilerInstance c;
> > + module m;
> > + clang::EmitLLVMOnlyAction act(&llvm::getGlobalContext());
> > + std::string log;
> > + llvm::raw_string_ostream s_log(log);
> > +
> > +#if HAVE_LLVM <= 0x0300
> > + c.getFrontendOpts().Inputs.push_back(
> > + std::make_pair(clang::IK_OpenCL, name));
> > +#else
> > + c.getFrontendOpts().Inputs.push_back(
> > + clang::FrontendInputFile(name, clang::IK_OpenCL));
> > +#endif
>
> Do we need this compatibility code? Don't you have to build a patched
> version of LLVM from sources for this to work anyway?
>
You only need a patch if you want to use the r600 target. This
compatibility code is necessary if you want to generate code for other
targets. I don't mind dropping it, but it would mean that clover won't
work with LLVM versions older than 3.1
> > + c.getFrontendOpts().ProgramAction = clang::frontend::EmitLLVMOnly;
> > + c.getHeaderSearchOpts().UseBuiltinIncludes = true;
> > +#if HAVE_LLVM < 0x0300
> > + c.getHeaderSearchOpts().UseStandardIncludes = true;
> > +#else
> > + c.getHeaderSearchOpts().UseStandardSystemIncludes = true;
> > +#endif
> > + c.getHeaderSearchOpts().ResourceDir = CLANG_RESOURCE_DIR;
> > +
> > + /* Add libclc generic search path */
> > + c.getHeaderSearchOpts().AddPath(LIBCLC_PATH "/generic/include/",
> > + clang::frontend::Angled,
> > + false, false, false);
> > +
> > + /* Add libclc include */
> > + c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
> > + /* clc.h requires that this macro be defined: */
> > + c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");
> > +
> > + c.getLangOpts().NoBuiltin = true;
> > + c.getTargetOpts().Triple = triple;
> > + c.getInvocation().setLangDefaults(clang::IK_OpenCL);
> > + c.createDiagnostics(0, NULL, new clang::TextDiagnosticPrinter(
> > + s_log, c.getDiagnosticOpts()));
> > +
> > + c.getPreprocessorOpts().addRemappedFile(
> > + "cl_input", llvm::MemoryBuffer::getMemBuffer(source));
> > +
> > + /* Compile the code */
> > + if (!c.ExecuteAction(act))
> > + throw build_error(log);
> > +
> > + /* Link the kernel with libclc */
> > + llvm::PassManager PM;
> > + llvm::PassManagerBuilder Builder;
> > + bool isNative;
> > + llvm::Module * mod = act.takeModule();
>
> This pointer/reference declaration syntax is inconsistent with the rest
> of the code.
>
> > + llvm::Linker linker("clover", mod);
> > +
> > + linker.LinkInFile(llvm::sys::Path(LIBCLC_PATH + triple + "/lib/builtins.bc"), isNative);
> > + mod = linker.releaseModule();
> > +
> > + /* Run link time optimizations */
> > + Builder.populateLTOPassManager(PM, false, true);
> > + Builder.OptLevel = 2;
> > + PM.run(*mod);
> > +
> > + /* Build the clover::module */
> > + switch (ir) {
> > + case PIPE_SHADER_IR_TGSI:
> > + /*XXX: Handle TGSI */
> > + assert(0);
> > + default: break;
> > + }
> > +
> > + unsigned char * prog;
> > + uint32_t prog_sz;
> > +
> > +#if HAVE_LLVM > 0x0300
> > + llvm::SmallVector<char, 1024> llvm_bitcode;
> > + llvm::raw_svector_ostream bitcode_ostream(llvm_bitcode);
> > +#else
> > + std::vector<unsigned char> llvm_bitcode;
> > +#endif
> > + llvm::BitstreamWriter writer(llvm_bitcode);
> > +
> > +#if HAVE_LLVM <= 0x0300
> > + llvm::WriteBitcodeToStream(mod, writer);
> > +#else
> > + llvm::WriteBitcodeToFile(mod, bitcode_ostream);
> > + bitcode_ostream.flush();
> > #endif
> >
> > + prog_sz = llvm_bitcode.size() * sizeof(unsigned char);
> > +
> > + /* We need to add 4 to the program size, because we will
> > + * be preprending the length of the program to the bitcode string. */
> > + prog = (unsigned char *)MALLOC(prog_sz + 4);
> > + ((uint32_t *)prog)[0] = prog_sz;
> > + memcpy(prog + 4, &llvm_bitcode[0], prog_sz);
> > +
> It would be nice to have some sort of struct for this.
>
> > + std::string kernel_name;
> > + compat::vector<module::argument> args;
> > + const llvm::NamedMDNode * kernel_node =
> > + mod->getNamedMetadata("opencl.kernels");
> > + /* XXX: Support more than one kernel */
> > + /* XXX: Error if there are no kernels */
>
> Nope, there's no need to error here if there are no kernels defined.
>
> > + assert(kernel_node->getNumOperands() == 1);
> > +
> > + llvm::Function * kernel_func = llvm::dyn_cast<llvm::Function>(
> > + kernel_node->getOperand(0)->getOperand(0));
> > + kernel_name = kernel_func->getName();
> > +
> > + for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> > + E = kernel_func->arg_end(); I != E; ++I) {
> > + llvm::Argument & arg = *I;
> > + llvm::Type * arg_type = arg.getType();
> > + llvm::TargetData TD(kernel_func->getParent());
> > + unsigned arg_size = TD.getTypeStoreSize(arg_type);
> > +
> > + if (llvm::isa<llvm::PointerType>(arg_type) and arg.hasByValAttr()) {
>
> "and"? Please use "&&".
>
> > + arg_type =
> > + llvm::dyn_cast<llvm::PointerType>(arg_type)->getElementType();
> > + }
> > +
> > + if (arg_type->isPointerTy()) {
> > + /* XXX: Figure out LLVM->OpenCL address space mappings for each
> > + * target. I think we need to ask clang what these are. For now,
> > + * pretend everything is in the global address space. */
>
> Aren't these supposed to be defined arbitrarily by the standard library
> using "address_space" attributes?
>
The OpenCL address space mappings are one of the attributes of a target
in Clang, they aren't defined in the standard library.
> > + unsigned address_space = llvm::cast<llvm::PointerType>(arg_type)->getAddressSpace();
> > + switch (address_space) {
> > + default:
> > + args.push_back(module::argument(module::argument::global, arg_size));
> > + break;
> > + }
> > + } else {
> > + args.push_back(module::argument(module::argument::scalar, arg_size));
> > + }
> > + }
> > + m.syms.push_back(module::symbol(kernel_name, 0, 0, args ));
> > + m.secs.push_back(module::section(0, module::section::text, prog_sz + 4,
> > + compat::vector<char>((char *)prog, prog_sz + 4)));
> > + return m;
> > + }
> > +}
> > +
> > module
> > clover::compile_program_llvm(const compat::string &source,
> > - const compat::string &target) {
> > -#if 0
> > - build_binary(source, target, "cl_input");
> > - module m = load_binary("cl_input.o");
> > - std::remove("cl_input.o");
> > - return m;
> > -#endif
> > - return module();
> > + enum pipe_shader_ir ir,
> > + const compat::string &triple) {
> > +
> > + return build_module_llvm(source, "cl_input", ir, triple);
> > }
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list