[Beignet] [PATCH 3/3] Backend: Optimization of internal math functions

Song, Ruiling ruiling.song at intel.com
Thu May 12 08:14:19 UTC 2016


> 
>      /* sin(Inf or NaN) is NaN */
> -  if (ix>=0x7f800000) return x-x;
> +  if (ix >= 0x7f800000) return x-x;
> 
> -    /* argument reduction needed */
> +  if(x <= pio4)
> +	  return negative * __kernel_sinf(x);
> +  /* argument reduction needed */
I think it would be better to remove this (x <= pio4) branch.
Let's keep the implementation less divergent. What do you think?

>    else {
>        n = __ieee754_rem_pio2f(x,&y);
>        float s = __kernel_sinf(y);
> @@ -612,10 +605,12 @@ OVERLOADABLE float sin(float x) {
>    }
>  }
> 
> -OVERLOADABLE float cos(float x) {
> +OVERLOADABLE float cos(float x)
> +{
>    if (__ocl_math_fastpath_flag)
>      return __gen_ocl_internal_fastpath_cos(x);
> 
> +  const float pio4  =  7.8539812565e-01; /* 0x3f490fda */
>    float y,z=0.0;
>    int n, ix;
>    x = __gen_ocl_fabs(x);
> @@ -624,9 +619,11 @@ OVERLOADABLE float cos(float x) {
>    ix &= 0x7fffffff;
> 
>      /* cos(Inf or NaN) is NaN */
> -  if (ix>=0x7f800000) return x-x;
> +  if (ix >= 0x7f800000) return x-x;
> 
> -    /* argument reduction needed */
> +  if(x <= pio4)
> +	  return __kernel_cosf(x, 0.f);
> +  /* argument reduction needed */

Same as above.
The other parts of the patch look great to me.

Thanks!
Ruiling


