[Beignet] [PATCH 3/3] Backend: Optimization of internal math functions
Song, Ruiling
ruiling.song at intel.com
Thu May 12 08:14:19 UTC 2016
>
> /* sin(Inf or NaN) is NaN */
> - if (ix>=0x7f800000) return x-x;
> + if (ix >= 0x7f800000) return x-x;
>
> - /* argument reduction needed */
> + if(x <= pio4)
> + return negative * __kernel_sinf(x);
> + /* argument reduction needed */
I think it would be better to remove this (x <= pio4) branch, so that the
implementation stays less divergent. What do you think?
> else {
> n = __ieee754_rem_pio2f(x,&y);
> float s = __kernel_sinf(y);
> @@ -612,10 +605,12 @@ OVERLOADABLE float sin(float x) {
> }
> }
>
> -OVERLOADABLE float cos(float x) {
> +OVERLOADABLE float cos(float x)
> +{
> if (__ocl_math_fastpath_flag)
> return __gen_ocl_internal_fastpath_cos(x);
>
> + const float pio4 = 7.8539812565e-01; /* 0x3f490fda */
> float y,z=0.0;
> int n, ix;
> x = __gen_ocl_fabs(x);
> @@ -624,9 +619,11 @@ OVERLOADABLE float cos(float x) {
> ix &= 0x7fffffff;
>
> /* cos(Inf or NaN) is NaN */
> - if (ix>=0x7f800000) return x-x;
> + if (ix >= 0x7f800000) return x-x;
>
> - /* argument reduction needed */
> + if(x <= pio4)
> + return __kernel_cosf(x, 0.f);
> + /* argument reduction needed */
Same as above.
The other parts of the patch look very good to me.
Thanks!
Ruiling