[Mesa-dev] [PATCH 09/10] radeonsi: don't emit AMDGPU intrinsics for RSQ opcodes

Sun Oct 11 09:28:36 PDT 2015

> On Oct 10, 2015, at 6:29 PM, Marek Olšák <maraeo at gmail.com> wrote:
> 
> +/* This requires "unsafe-fp-math" for LLVM to convert it to RSQ. */
> +static void emit_rsq(const struct lp_build_tgsi_action *action,
> +		     struct lp_build_tgsi_context *bld_base,
> +		     struct lp_build_emit_data *emit_data)
> +{
> +	LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> +	LLVMValueRef src = emit_data->args[0];
> +	bool is_f64 = LLVMGetTypeKind(LLVMTypeOf(src)) == LLVMDoubleTypeKind;
> +
> +	LLVMValueRef sqrt =
> +		lp_build_emit_llvm_unary(bld_base,
> +					 is_f64 ? TGSI_OPCODE_DSQRT
> +						: TGSI_OPCODE_SQRT,
> +					 src);
> +
> +	emit_data->output[emit_data->chan] =
> +		LLVMBuildFDiv(builder,
> +			      is_f64 ? bld_base->dbl_bld.one
> +				     : bld_base->base.one,
> +			      sqrt, "");
> +}

You should add the per-instruction fast-math flags for nnan to the instructions built here, instead of relying only on the function attribute (although you will still need the attribute for now to get the codegen effect). I’m also not sure how to do this with the C API; it might need new functions.