[Beignet] [PATCH] libocl: refine implementation of normalize().

Thu Jan 29 17:56:06 PST 2015

On Fri, Jan 30, 2015 at 02:29:51AM +0000, Song, Ruiling wrote:
> > > y) { return length(x-y); }  OVERLOADABLE float normalize(float x) {
> > > -  union { float f; unsigned u; } u;
> > > -  u.f = x;
> > > -  if(u.u == 0)
> > > -    return 0.f;
> > > -  if(isnan(x))
> > > -    return NAN;
> > > -  return u.u < 0x7fffffff ? 1.f : -1.f;
> > > +  float m = length(x);
> > > +  m = m == 0.0f ? 1.0f : m;
> > > +  return x / m;
> > >  }
> > >  OVERLOADABLE float2 normalize(float2 x) {
> > >    float m = length(x);
> > > -  if(m == 0)
> > > -    return 0;
> > > +  m = m == 0.0f ? 1.0f : m;
> > >    return x / m;
> > >  }
> > 
> > Although this eliminates branching, it introduces one more division when the
> > length is zero. If a test case has many zero vectors, then this patch may bring
> > some performance regression, as division is relatively expensive.
> > 
> > Any thoughts?
> I think we should optimize for the most commonly used scenario. In real applications, most data is non-zero, I think.
> After my change, the asm will be like:

I agree that for non-zero length vectors, this patch looks great.
I'm just a little bit worried about the zero length cases, and want to discuss
whether there is an even better way.

I noticed Rong's comment:
"
 float2 t = m == 0.0f ? x : x/m;
 return t;
"

Actually, this doesn't solve the issue. Because x/m is not a pre-existing value, the
select cannot eliminate the if conditional blocks, and it will generate the same
instructions as the "if (m==0) return 0" case.

So I will accept this patch, since there seems to be no better way.

Thanks for the patch.

>     (      27)  cmp.e(8)        g110<1>:F       g112<8,8,1>:F   6.91076e-310F   { align1 WE_normal 1Q };
>     (      29)  cmp.e(8)        g111<1>:F       g113<8,8,1>:F   6.91076e-310F   { align1 WE_normal 2Q };
>     (      31)  (-f0) sel(16)   g108<1>:F       g112<8,8,1>:F   6.91076e-310F   { align1 WE_normal 1H };
>     (      33)  math fdiv(16)   g106<1>:F       g114<8,8,1>:F   g108<8,8,1>:F   { align1 WE_normal 1H };
> If it is written like
> if (m == 0) return 0;
> the generated asm will be like below: it introduces some if/endif instructions, which will hurt performance for non-zero data.
> The asm below still seems to need optimization, but it is hard to completely remove the if/endif/cmp.le instructions.
> That is why I chose to make the change. Any further comments?
> 
>     (      25)  cmp.e(8)        g110<1>:F       g112<8,8,1>:F   6.90196e-310F   { align1 WE_normal 1Q };
>     (      27)  cmp.e(8)        g111<1>:F       g113<8,8,1>:F   6.90196e-310F   { align1 WE_normal 2Q };
>     (      29)  (+f0) sel(16)   g126<1>:UW      g8.2<0,1,0>:UW  g8<0,1,0>:UW    { align1 WE_normal 1H };
>     (      31)  mov(16)         g108<1>:F       g127.6<0,1,0>:F                 { align1 WE_normal 1H };
>     (      33)  cmp.ne(16)      null:UW         g126<8,8,1>:UW  0x0UW           { align1 WE_normal 1H switch };
>     (      35)  (-f0) if(16) 4                                                  { align1 WE_normal 1H };
>   L1:
>     (      37)  math fdiv(16)   g108<1>:F       g114<8,8,1>:F   g112<8,8,1>:F   { align1 WE_normal 1H };
>     (      39)  endif(16) 2                     null                            { align1 WE_all 1H };
>     (      41)  endif(16) 2                     null                            { align1 WE_normal 1H };
>   L2:
>     (      43)  cmp.le(16)      null:UW         g1<8,8,1>:UW    0x2UW           { align1 WE_all 1H switch };
>     (      45)  (+f0) if(16) 8                                                  { align1 WE_normal 1H };
> > 
> > >  OVERLOADABLE float3 normalize(float3 x) {
> > >    float m = length(x);
> > > -  if(m == 0)
> > > -    return 0;
> > > +  m = m == 0.0f ? 1.0f : m;
> > >    return x / m;
> > >  }
> > >  OVERLOADABLE float4 normalize(float4 x) {
> > >    float m = length(x);
> > > -  if(m == 0)
> > > -    return 0;
> > > +  m = m == 0.0f ? 1.0f : m;
> > >    return x / m;
> > >  }
> > >
> > > --
> > > 1.7.10.4
> > >