[Mesa-dev] [PATCH 05/14] swr: [rasterizer jitter] cleanup supporting different llvm versions

Fri Jun 17 20:36:39 UTC 2016

Hi Tim,

Does this commit allows us to have generic generated files or it's
solely workaround the VPERMD intrinsic argument swap ?

On 17 June 2016 at 20:25, Tim Rowley <timothy.o.rowley at intel.com> wrote:

> @@ -31,8 +31,8 @@
>  #pragma warning(disable: 4800 4146 4244 4267 4355 4996)
>  #endif
>
> -#include "jit_api.h"
>  #include "JitManager.h"
> +#include "jit_api.h"
>  #include "fetch_jit.h"
>
Is this due to the DEBUG redefinition ? One might want to add a
comment, to prevent the next person from moving/breaking things.

> @@ -222,35 +222,6 @@ void JitManager::SetupNewModule()
>      mIsModuleFinalized = false;
>  }
>
> -//////////////////////////////////////////////////////////////////////////
> -/// @brief Create new LLVM module from IR.
> -bool JitManager::SetupModuleFromIR(const uint8_t *pIR)
> -{
> -    std::unique_ptr<MemoryBuffer> pMem = MemoryBuffer::getMemBuffer(StringRef((const char*)pIR), "");
> -
> -    SMDiagnostic Err;
> -    std::unique_ptr<Module> newModule = parseIR(pMem.get()->getMemBufferRef(), Err, mContext);
> -
> -    if (newModule == nullptr)
> -    {
> -        SWR_ASSERT(0, "Parse failed! Check Err for details.");
> -        return false;
> -    }
> -
> -    mpCurrentModule = newModule.get();
> -#if defined(_WIN32)
> -    // Needed for MCJIT on windows
> -    Triple hostTriple(sys::getProcessTriple());
> -    hostTriple.setObjectFormat(Triple::ELF);
> -    newModule->setTargetTriple(hostTriple.getTriple());
> -#endif // _WIN32
> -
> -    mpExec->addModule(std::move(newModule));
> -    mIsModuleFinalized = false;
> -
> -    return true;
> -}
> -
>
This (and the respective prototype) seems to be completely unrelated cleanup.

> -#if HAVE_LLVM == 0x306
> +#if HAVE_LLVM <= 0x306
SWR requires LLVM 3.6 so all of these changes are not needed.

>  // llvm 3.7+ reuses "DEBUG" as an enum value
> +#if defined(DEBUG)
>  #pragma push_macro("DEBUG")
>  #undef DEBUG
> +#define _DEBUG_COLLISSION
> +#endif // DEBUG
>
I believe Jose mentioned a slightly better way to handle this (see
src/gallium/auxiliary/gallivm/lp_bld_misc.cpp). Or that does not apply
here ? Also it would be nice to keep it a separate patch as you did
earlier.

>  #ifndef HAVE_LLVM
> -#define HAVE_LLVM (LLVM_VERSION_MAJOR << 8) || LLVM_VERSION_MINOR
> +#define HAVE_LLVM ((LLVM_VERSION_MAJOR << 8) | LLVM_VERSION_MINOR)
>  #endif
>
Unrelated bugfix, stable material.

> @@ -322,6 +322,32 @@ CallInst *Builder::CALL(Value *Callee, const std::initializer_list<Value*> &args
>      return CALLA(Callee, args);
>  }
>
> +#if HAVE_LLVM > 0x306
> +CallInst *Builder::CALL(Value *Callee, Value* arg)
> +{
> +       std::vector<Value*> args;
> +       args.push_back(arg);
> +       return CALLA(Callee, args);
> +}
> +
> +CallInst *Builder::CALL2(Value *Callee, Value* arg1, Value* arg2)
> +{
> +       std::vector<Value*> args;
> +       args.push_back(arg1);
> +       args.push_back(arg2);
> +       return CALLA(Callee, args);
> +}
> +
> +CallInst *Builder::CALL3(Value *Callee, Value* arg1, Value* arg2, Value* arg3)
> +{
> +       std::vector<Value*> args;
> +       args.push_back(arg1);
> +       args.push_back(arg2);
> +       args.push_back(arg3);
> +       return CALLA(Callee, args);
> +}
> +#endif
> +
These seem unused ?

> @@ -726,8 +752,12 @@ Value *Builder::PERMD(Value* a, Value* idx)
>      // use avx2 permute instruction if available
>      if(JM()->mArch.AVX2())
>      {
> -        // llvm 3.6.0 swapped the order of the args to vpermd
> -        res = VPERMD(idx, a);
> +#if (HAVE_LLVM == 0x306) && (LLVM_VERSION_PATCH == 0)
> +               // llvm 3.6.0 swapped the order of the args to vpermd
> +               res = VPERMD(idx, a);
> +#else
> +               res = VPERMD(a, idx);
> +#endif
Something feels rather fishy here -  atm generator was always creates
one 'version' of the intrinsic and the code directly uses is.

With this patch (taking that I understood the python script correctly):
 - Here we swap the order depending of the LLVM used to compile SWR
 - In the generator (below) we also swap the order.

If the same version of LLVM is used to generate and compile. if not
things just go boom. Considering that this is the gist of the patch,
wouldn't it be nice to have a bit more comprehensive comment be that
here, below and/or in the commit message ?

> @@ -374,8 +374,17 @@ def main():
>      parser.add_argument("--gen_cpp", "-gen_cpp", help="Generate builder_gen.cpp", action="store_true", default=False)
>      parser.add_argument("--gen_x86_h", "-gen_x86_h", help="Generate x86 intrinsics. No input is needed.", action="store_true", default=False)
>      parser.add_argument("--gen_x86_cpp", "-gen_x86_cpp", help="Generate x86 intrinsics. No input is needed.", action="store_true", default=False)
> +    parser.add_argument("--llvm-version", help="What LLVM version to generate for", action="store", default="3.6.0")
> +
>      args = parser.parse_args()
>
> +    if args.llvm_version == "3.6.0":
> +        # Swap args for VPERMD
> +        for i in range(len(intrinsics)):
> +            if intrinsics[i][0] == "VPERMD":
> +                intrinsics[i][2] = [intrinsics[i][2][1], intrinsics[i][2][0]]
> +                break
> +
If one needs this hunk, may I suggest that one drops the default value
and just errors out if the caller of the script hasn't provided a
llvm-version ?

Thanks
Emil