[Liboil] exp, log

Stephane Fillod f8cfe at free.fr
Tue Mar 21 14:20:59 PST 2006


On Thu, Mar 16, 2006 at 11:38:40AM -0500, James Bergstra wrote:
> > > The problem which motivates this email is that if you write a loop like
> > > 
> > > for (int i = 0; i < 256; ++i) y[i] = exp(x[i]);
> > > 
> > > and compile it with gcc, you get code that runs much slower than if you make a
> > > call to amd's libacml_mv
> > 
> > ..because amd's libacml_mv is doing what liboil should have done ;-)
> > BTW, I was not aware of the amd's libacml_mv. Can you talk a bit more
> > about it?
> 
> libacml_mv (and libacml) is something like intel's math kernel library.
> libacml_mv has vectorized exp, log, sin, cos, pow.
> The greater part of the library is libacml; it is a complete BLAS
> implementation, has FFT functions, and does random number generation.
> 
> Anyway, the way they do libacml_mv is based on the fact that these
> transcendental functions are expensive to compute.
> So what they did was implement 1, 2, and 4-element versions of each function.
> The 'array' versions just take care of loading 4-elements at a time, calling
> __vrd4_exp and then writing back to memory (possibly skipping over the cache).

Interesting. If I'm not mistaken, libacml_mv is to AMD what IPP is to Intel.
Right?

> > > My question for the list is whether you think liboil might tackle the problem of
> > > vectorized functions like exp, log, cos, sin, sincos, cabs, etc... or whether
> > > these should be left for a compiler?   I believe that the answer should be
> > > "both"... first liboil, and the compiler :)
> > 
> > My answer would be, if your application has a need for it, then go ahead and
> > add it to liboil.
> > 
> > BTW, I have a patch in my tree which adds sin/cos/sincos (fixpoint with float
> > result) speedup. They are much needed in some application (DSP, ..). I can send
> > the ref/ API proposal as a starting point.
> 
> I would like to see it!  

Ok, please find it attached: sincosmultsum.patch
It has sin, cos, interleaved sincos, multsum (non-strided, aka dot product)
in real and complex flavors, and a special sincos combined with a complex
multiply (sincoscomplexmult).

Here's a ChangeLog entry if the patch suits the liboil committers:

2006-03-21  Stephane Fillod  <f8cfe at free.fr>
	* liboil/ref/Makefile.am: new classes
	* liboil/ref/multsum.c: added integer flavors
	* liboil/ref/complexmultsum.c, liboil/ref/complexmultsum_real.c
	  liboil/ref/cos.c, liboil/ref/sin.c, liboil/ref/multsum_ns.c,
	  liboil/ref/sincos_interleaved.c, liboil/ref/sincoscomplexmult.c: new
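
The multsum references accumulate in a wider type and carry an error
term alongside the running sum. A rough standalone sketch of that idea
for the non-strided f32 case follows. The function name is hypothetical,
and unlike the patch this sketch keeps the temporaries in double and
returns the result instead of writing it through a dest pointer.

```c
#include <stddef.h>

/* Hypothetical helper: dot product of two float arrays, accumulated in
 * double with a compensation term (same shape as the errsum logic in
 * the patch, but with double temporaries; that part is an assumption). */
static float multsum_f32_sketch(const float *a, const float *b, size_t n)
{
  double sum = 0.0;
  double errsum = 0.0;
  size_t i;
  for (i = 0; i < n; i++) {
    double x = (double)a[i] * (double)b[i];
    double tmp = sum;           /* sum before this addition */
    sum += x;
    errsum += (tmp - sum) + x;  /* part of x lost to rounding in sum */
  }
  return (float)(sum + errsum);
}
```

The compensation only works if the compiler is not allowed to reorder
floating-point arithmetic, so options such as -ffast-math would defeat it.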

The patch contains only the reference implementations, not the optimized
versions. Those will come later. The speedup over the references can be
tremendous.

> I've already written the functions that I would like to contribute to liboil,
> and they are open for viewing at
> 
> https://savannah.nongnu.org/projects/libmsl/

Nice! I'm curious how its speed compares with other speedups.

To be able to contribute to liboil, you will have to "downgrade"
the license covering your GPL code to liboil's license (a so-called
"2-clause" BSD-style license). For some of my speedups, this is still a
problem for me, and I have yet to find a solution.

> I do exactly as you say. My function computes an arbitrarily long array using
> __vrd4_exp, and then __vrd1_exp as necessary at the end.  I would like to not
> depend at all on libacml_mv, because it is not open... but this can be left for
> later :(

What do you mean by not open? Not open source? Is it at least free (as
in beer)? In that case, it could still be wrapped, as an option where
available. Then liboil is pretty neat for comparing the fitness of the
implementations bred for a class :^)

-- 
Stephane
-------------- next part --------------
Index: liboil/ref/Makefile.am
===================================================================
RCS file: /cvs/liboil/liboil/liboil/ref/Makefile.am,v
retrieving revision 1.11
diff -u -b -B -w -p -r1.11 Makefile.am
--- liboil/ref/Makefile.am	5 Mar 2006 03:58:25 -0000	1.11
+++ liboil/ref/Makefile.am	21 Mar 2006 21:57:46 -0000
@@ -11,10 +11,13 @@ c_sources = \
 	argb_paint.c \
 	ayuv2argb.c \
 	clamp.c \
+	complexmultsum.c \
+	complexmultsum_real.c \
 	composite.c \
 	convert.c \
 	copy.c \
 	copy8x8.c \
+	cos.c \
 	diff8x8.c \
 	diffsquaresum_f64.c \
 	error8x8.c \
@@ -23,6 +26,7 @@ c_sources = \
 	mt19937ar.c \
 	mult8x8_s16.c \
 	multsum.c \
+	multsum_ns.c \
 	recon8x8.c \
 	resample.c \
 	rgb.c \
@@ -30,7 +34,10 @@ c_sources = \
 	sad8x8.c \
 	sad8x8_broken.c \
 	sad8x8avg.c \
+	sin.c \
 	sincos_f64.c \
+	sincoscomplexmult.c \
+	sincos_interleaved.c \
 	splat.c \
 	squaresum_f64.c \
 	sum_f64.c \
Index: liboil/ref/multsum.c
===================================================================
RCS file: /cvs/liboil/liboil/liboil/ref/multsum.c,v
retrieving revision 1.3
diff -u -b -B -w -p -r1.3 multsum.c
--- liboil/ref/multsum.c	16 Dec 2005 07:45:29 -0000	1.3
+++ liboil/ref/multsum.c	21 Mar 2006 20:49:14 -0000
@@ -1,6 +1,7 @@
 /*
  * LIBOIL - Library of Optimized Inner Loops
  * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2005 Stephane Fillod <f8cfe at free.fr>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -30,11 +31,11 @@
 #endif
 
 #include <liboil/liboilfunction.h>
-#include <liboil/simdpack/simdpack.h>
+#include <liboil/liboilclasses.h>
 #include <math.h>
 
 
-#define MULTSUM_DEFINE_REF(type)	\
+#define MULTSUM_DEFINE_REF(type,sum_type)	\
 static void multsum_ ## type ## _ref(	\
     oil_type_ ## type *dest,		\
     oil_type_ ## type *src1, int sstr1,	\
@@ -42,8 +43,8 @@ static void multsum_ ## type ## _ref(	\
     int n)				\
 {					\
   int i;				\
-  double sum = 0;			\
-  double errsum = 0;			\
+  sum_type sum = 0;			\
+  sum_type errsum = 0;			\
   for(i=0;i<n;i++){			\
     oil_type_ ## type x;                    \
     oil_type_ ## type tmp;                  \
@@ -73,7 +74,7 @@ OIL_DEFINE_CLASS (multsum_ ## type, \
  * Multiplies each element in @src1 and @src2 and sums the results
  * over the entire array, and places the sum into @dest.
  */
-MULTSUM_DEFINE_REF(f32);
+MULTSUM_DEFINE_REF(f32,double);
 /**
  * oil_multsum_f64:
  * @dest:
@@ -86,5 +87,44 @@ MULTSUM_DEFINE_REF(f32);
  * Multiplies each element in @src1 and @src2 and sums the results
  * over the entire array, and places the sum into @dest.
  */
-MULTSUM_DEFINE_REF(f64);
+MULTSUM_DEFINE_REF(f64,double);
+/**
+ * oil_multsum_u8:
+ * @dest:
+ * @src1:
+ * @sstr1:
+ * @src2:
+ * @sstr2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_DEFINE_REF(u8,int);
+/**
+ * oil_multsum_s16:
+ * @dest:
+ * @src1:
+ * @sstr1:
+ * @src2:
+ * @sstr2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_DEFINE_REF(s16,int32_t);
+/**
+ * oil_multsum_s32:
+ * @dest:
+ * @src1:
+ * @sstr1:
+ * @src2:
+ * @sstr2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_DEFINE_REF(s32,int64_t);
 
--- /dev/null	2006-03-20 21:40:10.487666816 +0100
+++ liboil/ref/complexmultsum.c	2005-12-17 01:04:10.000000000 +0100
@@ -0,0 +1,136 @@
+/*
+ * LIBOIL - Library of Optimized Inner Loops
+ * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2005 Stephane Fillod <f8cfe at free.fr>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <liboil/liboilfunction.h>
+#include <liboil/liboilclasses.h>
+#include <math.h>
+
+
+#define COMPLEXMULTSUM_DEFINE_REF(type,sum_type)	\
+static void complexmultsum_ ## type ## _ref(	\
+    oil_type_ ## type *dest,		\
+    oil_type_ ## type *src1,		\
+    oil_type_ ## type *src2,		\
+    int n)				\
+{					\
+  int i;				\
+  sum_type sum1 = 0;			\
+  sum_type sum2 = 0;			\
+  sum_type errsum1 = 0;			\
+  sum_type errsum2 = 0;			\
+  for(i=0;i<n;i++){			\
+    oil_type_ ## type a, b, c, d;		\
+    oil_type_ ## type x, y;			\
+    oil_type_ ## type tmp1, tmp2;		\
+    a = src1[2*i + 0];			\
+    b = src1[2*i + 1];			\
+    c = src2[2*i + 0];			\
+    d = src2[2*i + 1];			\
+    tmp1 = sum1;			\
+    tmp2 = sum2;			\
+    x = a * c - b * d;			\
+    y = a * d + b * c;			\
+    sum1 += x;				\
+    sum2 += y;				\
+    errsum1 += (tmp1 - sum1) + x;	\
+    errsum2 += (tmp2 - sum2) + y;	\
+  }					\
+  dest[0] = sum1 + errsum1;		\
+  dest[1] = sum2 + errsum2;		\
+}					\
+OIL_DEFINE_IMPL_REF (complexmultsum_ ## type ## _ref, complexmultsum_ ## type); \
+OIL_DEFINE_CLASS (complexmultsum_ ## type, \
+    "oil_type_" #type " *d_2, "		\
+    "oil_type_" #type " *s1_2xn, "	\
+    "oil_type_" #type " *s2_2xn, "	\
+    "int n")
+
+/**
+ * oil_complexmultsum_f32:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_DEFINE_REF(f32,double);
+
+/**
+ * oil_complexmultsum_f64:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_DEFINE_REF(f64,double);
+
+/**
+ * oil_complexmultsum_u8:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_DEFINE_REF(u8,int);
+
+/**
+ * oil_complexmultsum_s16:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_DEFINE_REF(s16,int32_t);
+
+/**
+ * oil_complexmultsum_s32:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_DEFINE_REF(s32,int64_t);
+
--- /dev/null	2006-03-20 21:40:10.487666816 +0100
+++ liboil/ref/complexmultsum_real.c	2005-12-17 01:04:42.000000000 +0100
@@ -0,0 +1,135 @@
+/*
+ * LIBOIL - Library of Optimized Inner Loops
+ * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2005 Stephane Fillod <f8cfe at free.fr>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <liboil/liboilfunction.h>
+#include <liboil/liboilclasses.h>
+#include <math.h>
+
+
+#define COMPLEXMULTSUM_REAL_DEFINE_REF(type,sum_type)	\
+static void complexmultsum_real_ ## type ## _ref(	\
+    oil_type_ ## type *dest,		\
+    oil_type_ ## type *src1,		\
+    oil_type_ ## type *src2,		\
+    int n)				\
+{					\
+  int i;				\
+  sum_type sum1 = 0;			\
+  sum_type sum2 = 0;			\
+  sum_type errsum1 = 0;			\
+  sum_type errsum2 = 0;			\
+  for(i=0;i<n;i++){			\
+    oil_type_ ## type a, b, c;		\
+    oil_type_ ## type x, y;			\
+    oil_type_ ## type tmp1, tmp2;		\
+    a = src1[2*i + 0];			\
+    b = src1[2*i + 1];			\
+    c = src2[i];			\
+    tmp1 = sum1;			\
+    tmp2 = sum2;			\
+    x = a * c;				\
+    y = b * c;				\
+    sum1 += x;				\
+    sum2 += y;				\
+    errsum1 += (tmp1 - sum1) + x;	\
+    errsum2 += (tmp2 - sum2) + y;	\
+  }					\
+  dest[0] = sum1 + errsum1;		\
+  dest[1] = sum2 + errsum2;		\
+}					\
+OIL_DEFINE_IMPL_REF (complexmultsum_real_ ## type ## _ref, complexmultsum_real_ ## type); \
+OIL_DEFINE_CLASS (complexmultsum_real_ ## type, \
+    "oil_type_" #type " *d_2, "		\
+    "oil_type_" #type " *s1_2xn, "	\
+    "oil_type_" #type " *s2_n, "	\
+    "int n")
+
+/**
+ * oil_complexmultsum_real_f32:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_REAL_DEFINE_REF(f32,double);
+
+/**
+ * oil_complexmultsum_real_f64:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_REAL_DEFINE_REF(f64,double);
+
+/**
+ * oil_complexmultsum_real_u8:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_REAL_DEFINE_REF(u8,int);
+
+/**
+ * oil_complexmultsum_real_s16:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_REAL_DEFINE_REF(s16,int32_t);
+
+/**
+ * oil_complexmultsum_real_s32:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+COMPLEXMULTSUM_REAL_DEFINE_REF(s32,int64_t);
+
--- /dev/null	2006-03-20 21:40:10.487666816 +0100
+++ liboil/ref/cos.c	2006-03-01 01:10:25.000000000 +0100
@@ -0,0 +1,155 @@
+/*
+ * LIBOIL - Library of Optimized Inner Loops
+ * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2005 Stephane Fillod <f8cfe at free.fr>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <liboil/liboilfunction.h>
+#include <liboil/liboilclasses.h>
+#include <math.h>
+
+#define COS_DEFINE_REF(type)	\
+static void cos_ ## type ## _ref(	\
+    oil_type_ ## type *dest,		\
+    const oil_type_ ## type *offset,	\
+    const oil_type_ ## type *interval,	\
+    const double *amplitude,		\
+    int n)				\
+{					\
+  int i;				\
+  for(i=0;i<n;i++){			\
+    oil_type_ ## type x;                    \
+    x = *offset + *interval * i;	\
+    dest[i] = (oil_type_ ## type) (cos(x) * *amplitude);	\
+  }					\
+}					\
+OIL_DEFINE_IMPL_REF (cos_ ## type ## _ref, cos_ ## type); \
+OIL_DEFINE_CLASS (cos_ ## type, \
+    "oil_type_" #type " *d_n, "		\
+    "oil_type_" #type " *s1_1, "		\
+    "oil_type_" #type " *s2_1, "		\
+    "double *s3_1, "			\
+    "int n")
+
+/**
+ * oil_cos_f64:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+COS_DEFINE_REF(f64);
+
+/**
+ * oil_cos_f32:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+COS_DEFINE_REF(f32);
+
+/**
+ * oil_cos_s32:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+COS_DEFINE_REF(s32);
+
+/**
+ * oil_cos_s16:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+COS_DEFINE_REF(s16);
+
+
+/**
+ * oil_vcocos_f32:
+ * @d_n:
+ * @s1_n:
+ * @s2_1:
+ * @s3_1:
+ * @s4_1:
+ * @n:
+ *
+ * Calculates cos(x) scaled by @s4_1 and places the results 
+ * in @d_n. Values for x start at @s2_1 and are incremented
+ * by same index in @s1_n scaled by @s3_1 for each destination
+ * element.
+ */
+static void vcocos_f32_ref(
+    oil_type_f32 *dest,
+    const oil_type_f32 *input,
+    const oil_type_f32 *offset,
+    const oil_type_f32 *k,
+    const oil_type_f32 *amplitude,
+    int n)
+{
+  int i;
+  double x = *offset;
+
+  for(i=0;i<n;i++){
+    dest[i] = (oil_type_f32) (cos(x) * *amplitude);
+    x += input[i] * *k;
+  }
+}
+OIL_DEFINE_IMPL_REF (vcocos_f32_ref, vcocos_f32);
+OIL_DEFINE_CLASS (vcocos_f32,
+    "oil_type_f32 *d_n, "
+    "oil_type_f32 *s1_n, "
+    "oil_type_f32 *s2_1, "
+    "oil_type_f32 *s3_1, "
+    "oil_type_f32 *s4_1, "
+    "int n");
+
--- /dev/null	2006-03-20 21:40:10.487666816 +0100
+++ liboil/ref/multsum_ns.c	2005-12-20 22:16:53.000000000 +0100
@@ -0,0 +1,120 @@
+/*
+ * LIBOIL - Library of Optimized Inner Loops
+ * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2005 Stephane Fillod <f8cfe at free.fr>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <liboil/liboilfunction.h>
+#include <liboil/liboilclasses.h>
+#include <math.h>
+
+
+#define MULTSUM_NS_DEFINE_REF(type,sum_type)	\
+static void multsum_ ## type ## _ns_ref(	\
+    oil_type_ ## type *dest,		\
+    oil_type_ ## type *src1,		\
+    oil_type_ ## type *src2,		\
+    int n)				\
+{					\
+  int i;				\
+  sum_type sum = 0;			\
+  sum_type errsum = 0;			\
+  for(i=0;i<n;i++){			\
+    oil_type_ ## type x;		\
+    oil_type_ ## type tmp;		\
+    x = src1[i] * src2[i];		\
+    tmp = sum;				\
+    sum += x;				\
+    errsum += (tmp - sum) + x;		\
+  }					\
+  *dest = sum + errsum;			\
+}					\
+OIL_DEFINE_IMPL_REF (multsum_ ## type ## _ns_ref, multsum_ ## type ## _ns); \
+OIL_DEFINE_CLASS (multsum_ ## type ## _ns, \
+    "oil_type_" #type " *dest, "	\
+    "oil_type_" #type " *src1, "	\
+    "oil_type_" #type " *src2, "	\
+    "int n")
+
+/**
+ * oil_multsum_f32_ns:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_NS_DEFINE_REF(f32,double);
+/**
+ * oil_multsum_f64_ns:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_NS_DEFINE_REF(f64,double);
+/**
+ * oil_multsum_u8_ns:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_NS_DEFINE_REF(u8,int);
+/**
+ * oil_multsum_s16_ns:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_NS_DEFINE_REF(s16,int32_t);
+/**
+ * oil_multsum_s32_ns:
+ * @dest:
+ * @src1:
+ * @src2:
+ * @n:
+ *
+ * Multiplies each element in @src1 and @src2 and sums the results
+ * over the entire array, and places the sum into @dest.
+ */
+MULTSUM_NS_DEFINE_REF(s32,int64_t);
+
--- /dev/null	2006-03-20 21:40:10.487666816 +0100
+++ liboil/ref/sin.c	2005-12-17 01:05:53.000000000 +0100
@@ -0,0 +1,115 @@
+/*
+ * LIBOIL - Library of Optimized Inner Loops
+ * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2005 Stephane Fillod <f8cfe at free.fr>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <liboil/liboilfunction.h>
+#include <liboil/liboilclasses.h>
+#include <math.h>
+
+#define SIN_DEFINE_REF(type)	\
+static void sin_ ## type ## _ref(	\
+    oil_type_ ## type *dest,		\
+    const oil_type_ ## type *offset,	\
+    const oil_type_ ## type *interval,	\
+    const double *amplitude,		\
+    int n)				\
+{					\
+  int i;				\
+  for(i=0;i<n;i++){			\
+    oil_type_ ## type x;                    \
+    x = *offset + *interval * i;	\
+    dest[i] = (oil_type_ ## type) (sin(x) * *amplitude);	\
+  }					\
+}					\
+OIL_DEFINE_IMPL_REF (sin_ ## type ## _ref, sin_ ## type); \
+OIL_DEFINE_CLASS (sin_ ## type, \
+    "oil_type_" #type " *d_n, "		\
+    "oil_type_" #type " *s1_1, "		\
+    "oil_type_" #type " *s2_1, "		\
+    "double *s3_1, "			\
+    "int n")
+
+/**
+ * oil_sin_f64:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates sin(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+SIN_DEFINE_REF(f64);
+
+/**
+ * oil_sin_f32:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates sin(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+SIN_DEFINE_REF(f32);
+
+/**
+ * oil_sin_s32:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates sin(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+SIN_DEFINE_REF(s32);
+
+/**
+ * oil_sin_s16:
+ * @d_n:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates sin(x) scaled by @s3_1 and places the results 
+ * in @d_n. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each destination element.
+ */
+SIN_DEFINE_REF(s16);
+
--- /dev/null	2006-03-20 21:40:10.487666816 +0100
+++ liboil/ref/sincos_interleaved.c	2005-12-17 01:06:19.000000000 +0100
@@ -0,0 +1,116 @@
+/*
+ * LIBOIL - Library of Optimized Inner Loops
+ * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2005 Stephane Fillod <f8cfe at free.fr>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <liboil/liboilfunction.h>
+#include <liboil/liboilclasses.h>
+#include <math.h>
+
+#define SINCOS_INTERLEAVED_DEFINE_REF(type)	\
+static void sincos_interleaved_ ## type ## _ref(	\
+    oil_type_ ## type *dest,		\
+    const oil_type_ ## type *offset,	\
+    const oil_type_ ## type *interval,	\
+    const double *amplitude,		\
+    int n)				\
+{					\
+  int i;				\
+  for(i=0;i<n;i++){			\
+    oil_type_ ## type x;                    \
+    x = *offset + *interval * i;	\
+    dest[i*2 + 0] = (oil_type_ ## type) (cos(x) * *amplitude);	\
+    dest[i*2 + 1] = (oil_type_ ## type) (sin(x) * *amplitude);	\
+  }					\
+}					\
+OIL_DEFINE_IMPL_REF (sincos_interleaved_ ## type ## _ref, sincos_interleaved_ ## type); \
+OIL_DEFINE_CLASS (sincos_interleaved_ ## type, \
+    "oil_type_" #type " *d_2xn, "		\
+    "oil_type_" #type " *s1_1, "		\
+    "oil_type_" #type " *s2_1, "		\
+    "double *s3_1, "			\
+    "int n")
+
+/**
+ * oil_sincos_interleaved_f64:
+ * @d_2xn:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s3_1 and places the results 
+ * interleaved in @d_2xn. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each couple destination element.
+ */
+SINCOS_INTERLEAVED_DEFINE_REF(f64);
+
+/**
+ * oil_sincos_interleaved_f32:
+ * @d_2xn:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s3_1 and places the results 
+ * interleaved in @d_2xn. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each couple destination element.
+ */
+SINCOS_INTERLEAVED_DEFINE_REF(f32);
+
+/**
+ * oil_sincos_interleaved_s32:
+ * @d_2xn:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s3_1 and places the results 
+ * interleaved in @d_2xn. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each couple destination element.
+ */
+SINCOS_INTERLEAVED_DEFINE_REF(s32);
+
+/**
+ * oil_sincos_interleaved_s16:
+ * @d_2xn:
+ * @s1_1:
+ * @s2_1:
+ * @s3_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s3_1 and places the results 
+ * interleaved in @d_2xn. Values for x start at @s1_1 and are incremented
+ * by @s2_1 for each couple destination element.
+ */
+SINCOS_INTERLEAVED_DEFINE_REF(s16);
+
--- /dev/null	2006-03-20 21:40:10.487666816 +0100
+++ liboil/ref/sincoscomplexmult.c	2006-03-03 19:11:32.000000000 +0100
@@ -0,0 +1,134 @@
+/*
+ * LIBOIL - Library of Optimized Inner Loops
+ * Copyright (c) 2003,2004 David A. Schleef <ds at schleef.org>
+ * Copyright (c) 2006 Stephane Fillod <f8cfe at free.fr>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <liboil/liboilfunction.h>
+#include <liboil/liboilclasses.h>
+#include <math.h>
+
+#define SINCOSCOMPLEXMULT_DEFINE_REF(type)	\
+static void sincoscomplexmult_ ## type ## _ref(	\
+    oil_type_ ## type *dest,		\
+    const oil_type_ ## type *src,	\
+    const oil_type_ ## type *offset,	\
+    const oil_type_ ## type *interval,	\
+    const double *amplitude,		\
+    int n)				\
+{					\
+  int i;				\
+  for(i=0;i<n;i++){			\
+    oil_type_ ## type a, b, c, d;	\
+    oil_type_ ## type x, y;		\
+    oil_type_ ## type t;                \
+    t = *offset + *interval * i;	\
+    a = src[2*i + 0];			\
+    b = src[2*i + 1];			\
+    c = (oil_type_ ## type) (cos(t) * *amplitude);	\
+    d = (oil_type_ ## type) (sin(t) * *amplitude);	\
+    x = a * c - b * d;			\
+    y = a * d + b * c;			\
+    dest[i*2 + 0] = x;			\
+    dest[i*2 + 1] = y;			\
+  }					\
+}					\
+OIL_DEFINE_IMPL_REF (sincoscomplexmult_ ## type ## _ref, sincoscomplexmult_ ## type); \
+OIL_DEFINE_CLASS (sincoscomplexmult_ ## type, \
+    "oil_type_" #type " *d_2xn, "		\
+    "oil_type_" #type " *s1_2xn, "		\
+    "oil_type_" #type " *s2_1, "		\
+    "oil_type_" #type " *s3_1, "		\
+    "double *s4_1, "			\
+    "int n")
+
+/**
+ * oil_sincoscomplexmult_f64:
+ * @d_2xn:
+ * @s1_2xn:
+ * @s2_1:
+ * @s3_1:
+ * @s4_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s4_1 and multiply the formed
+ * complex values with the next index of the interleaved complex from @s1_2xn.
+ * Result is stored in interleaved array @d_2xn. Values for x start at @s2_1 
+ * and are incremented by @s3_1 for each couple destination element.
+ */
+SINCOSCOMPLEXMULT_DEFINE_REF(f64);
+
+/**
+ * oil_sincoscomplexmult_f32:
+ * @d_2xn:
+ * @s1_2xn:
+ * @s2_1:
+ * @s3_1:
+ * @s4_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s4_1 and multiply the formed
+ * complex values with the next index of the interleaved complex from @s1_2xn.
+ * Result is stored in interleaved array @d_2xn. Values for x start at @s2_1 
+ * and are incremented by @s3_1 for each couple destination element.
+ */
+SINCOSCOMPLEXMULT_DEFINE_REF(f32);
+
+/**
+ * oil_sincoscomplexmult_s32:
+ * @d_2xn:
+ * @s1_2xn:
+ * @s2_1:
+ * @s3_1:
+ * @s4_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s4_1 and multiply the formed
+ * complex values with the next index of the interleaved complex from @s1_2xn.
+ * Result is stored in interleaved array @d_2xn. Values for x start at @s2_1 
+ * and are incremented by @s3_1 for each couple destination element.
+ */
+SINCOSCOMPLEXMULT_DEFINE_REF(s32);
+
+/**
+ * oil_sincoscomplexmult_s16:
+ * @d_2xn:
+ * @s1_2xn:
+ * @s2_1:
+ * @s3_1:
+ * @s4_1:
+ * @n:
+ *
+ * Calculates cos(x) and sin(x) scaled by @s4_1 and multiply the formed
+ * complex values with the next index of the interleaved complex from @s1_2xn.
+ * Result is stored in interleaved array @d_2xn. Values for x start at @s2_1 
+ * and are incremented by @s3_1 for each couple destination element.
+ */
+SINCOSCOMPLEXMULT_DEFINE_REF(s16);
+


