[Beignet] [PATCH 1/2] support __gen_ocl_simd_any and __gen_ocl_simd_all
Zhigang Gong
zhigang.gong at linux.intel.com
Mon Apr 21 20:25:41 PDT 2014
Two minor comments, inline below. I fixed them and pushed the patch to the master branch.
Thanks.
On Fri, Apr 18, 2014 at 01:42:16PM +0800, Guo Yejun wrote:
> short __gen_ocl_simd_any(short x):
> if x is non-zero in any of the active threads in the same SIMD,
> the return value for all of these threads is non-zero; otherwise, zero is returned.
>
> short __gen_ocl_simd_all(short x):
> only if x is non-zero in all of the active threads in the same SIMD,
> the return value for all of these threads is non-zero; otherwise, zero is returned.
>
> for example:
> to check whether a particular value exists in a global buffer, use one SIMD
> to do the search in parallel; the whole SIMD can stop the task
> once the value is found. The key kernel code looks like:
>
> for(; ; ) {
> ...
> if (__gen_ocl_simd_any(...))
> break; // the whole SIMD stops searching
> }
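For readers of the archive, the semantics described above can be sketched on the host side. This is only an illustrative model, not Beignet code: the names simd_any_model and simd_all_model and the activeMask parameter are hypothetical, a vector stands in for the per-lane values of x, and 1 stands in for "non-zero".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of the two built-ins (not part of Beignet).
// x[i] is the value that lane i passes in; bit i of activeMask is set
// when lane i is active. Inactive lanes are ignored entirely.

// Returns 1 if x is non-zero in at least one active lane, otherwise 0.
short simd_any_model(const std::vector<short>& x, uint32_t activeMask) {
  assert(x.size() == 8 || x.size() == 16);  // SIMD8 or SIMD16 only
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (((activeMask >> i) & 1u) && x[i] != 0)
      return 1;
  }
  return 0;
}

// Returns 1 only if x is non-zero in every active lane, otherwise 0.
// With no active lanes this model returns 1; that is a modeling choice.
short simd_all_model(const std::vector<short>& x, uint32_t activeMask) {
  assert(x.size() == 8 || x.size() == 16);  // SIMD8 or SIMD16 only
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (((activeMask >> i) & 1u) && x[i] == 0)
      return 0;
  }
  return 1;
}
```

In the search example, every active lane sees the same return value, so all of them take the break in the same iteration.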
>
> Signed-off-by: Guo Yejun <yejun.guo at intel.com>
> ---
> backend/src/backend/gen_insn_selection.cpp | 63 ++++++++++++++++++++++++++++++
> backend/src/ir/instruction.hpp | 4 ++
> backend/src/ir/instruction.hxx | 2 +
> backend/src/llvm/llvm_gen_backend.cpp | 16 ++++++++
> backend/src/llvm/llvm_gen_ocl_function.hxx | 4 ++
> backend/src/ocl_stdlib.tmpl.h | 8 ++++
> 6 files changed, 97 insertions(+)
>
> diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp
> index 72a8549..e7c84d0 100644
> --- a/backend/src/backend/gen_insn_selection.cpp
> +++ b/backend/src/backend/gen_insn_selection.cpp
> @@ -1730,6 +1730,69 @@ namespace gbe
> case ir::OP_SQR: sel.MATH(dst, GEN_MATH_FUNCTION_SQRT, src); break;
> case ir::OP_RSQ: sel.MATH(dst, GEN_MATH_FUNCTION_RSQ, src); break;
> case ir::OP_RCP: sel.MATH(dst, GEN_MATH_FUNCTION_INV, src); break;
> + case ir::OP_SIMD_ANY:
> + {
> + const GenRegister constZero = GenRegister::immuw(0);
> + const GenRegister regOne = GenRegister::uw1grf(ir::ocl::one);
> + const GenRegister flag01 = GenRegister::flag(0, 1);
> +
> + sel.push();
> + int simdWidth = sel.curr.execWidth;
> + sel.curr.predicate = GEN_PREDICATE_NONE;
> + sel.curr.execWidth = 1;
> + sel.curr.noMask = 1;
> + sel.MOV(flag01, constZero);
> +
> + sel.curr.execWidth = simdWidth;
> + sel.curr.noMask = 0;
> +
> + sel.curr.physicalFlag = 1;
No need to set physicalFlag to 1, as 1 is the default value.
> + sel.curr.flag = 0;
> + sel.curr.subFlag = 1;
> + sel.CMP(GEN_CONDITIONAL_NEQ, src, constZero);
> +
> + if (sel.curr.execWidth == 16)
> + sel.curr.predicate = GEN_PREDICATE_ALIGN1_ANY16H;
> + else if (sel.curr.execWidth == 8)
> + sel.curr.predicate = GEN_PREDICATE_ALIGN1_ANY8H;
> + else
> + NOT_IMPLEMENTED;
> + sel.SEL(dst, regOne, constZero);
> + sel.pop();
> + }
> + break;
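A rough model of why the OP_SIMD_ANY case clears the flag before the CMP, as I read the sequence: the masked CMP only updates flag bits of active lanes, so a stale bit belonging to an inactive lane would otherwise feed the ANY8H/ANY16H predicate. The function name and the staleFlag/clearFlagFirst parameters below are hypothetical and exist only for illustration; this is a sketch, not the backend code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the MOV / CMP / predicated SEL sequence.
// execMask: bit i set when lane i is active.
// staleFlag: whatever the flag register held before the sequence.
// clearFlagFirst: models the noMask "MOV flag01, constZero".
short any_via_flag_model(const std::vector<short>& src, uint16_t execMask,
                         uint16_t staleFlag, bool clearFlagFirst) {
  assert(src.size() == 8 || src.size() == 16);  // SIMD8 or SIMD16 only
  const uint16_t widthMask = (src.size() == 16) ? 0xFFFF : 0x00FF;

  uint16_t flag = clearFlagFirst ? 0 : staleFlag;

  // CMP NEQ under the execution mask: only active lanes write their bit.
  for (std::size_t i = 0; i < src.size(); ++i) {
    const uint16_t bit = static_cast<uint16_t>(1u << i);
    if (!(execMask & bit))
      continue;  // inactive lane keeps its old flag bit
    if (src[i] != 0)
      flag = static_cast<uint16_t>(flag | bit);
    else
      flag = static_cast<uint16_t>(flag & ~bit);
  }

  // ANY8H / ANY16H: true for all lanes if any flag bit in the group is set;
  // the SEL then picks one (true) or zero (false).
  return (flag & widthMask) != 0 ? 1 : 0;
}
```

Without the clear, an all-zero input with a leftover bit in an inactive lane would report "any"; with the clear it reports zero, as intended.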
> + case ir::OP_SIMD_ALL:
> + {
> + const GenRegister constZero = GenRegister::immuw(0);
> + const GenRegister regOne = GenRegister::uw1grf(ir::ocl::one);
> + const GenRegister flag01 = GenRegister::flag(0, 1);
> +
> + sel.push();
> + int simdWidth = sel.curr.execWidth;
> + sel.curr.predicate = GEN_PREDICATE_NONE;
> + sel.curr.execWidth = 1;
> + sel.curr.noMask = 1;
> + sel.MOV(flag01, regOne);
> +
> + sel.curr.execWidth = simdWidth;
> + sel.curr.noMask = 0;
> +
> + sel.curr.physicalFlag = 1;
Ditto.
> + sel.curr.flag = 0;
> + sel.curr.subFlag = 1;