[pulseaudio-discuss] [PATCH 1/2] sconv: Change/fix conversion to/from float32

Sun Feb 3 16:20:36 PST 2013

Hello Tanu,

> On Sun, 2013-01-13 at 20:59 +0200, Tanu Kaskinen wrote:
> > On Sun, 2013-01-13 at 14:53 +0100, Peter Meerwald wrote:
> > > > > diff --git a/src/pulsecore/sconv_neon.c b/src/pulsecore/sconv_neon.c
> > > > > index 6fd966d..111b56f 100644
> > > > > --- a/src/pulsecore/sconv_neon.c
> > > > > +++ b/src/pulsecore/sconv_neon.c
> > > > > @@ -36,16 +36,11 @@ static void pa_sconv_s16le_from_f32ne_neon(unsigned n, const float *src, int16_t
> > > > >          "movs       %[n], %[n], lsr #2      \n\t"
> > > > >          "beq        2f                      \n\t"
> > > > >  
> > > > > -        "vdup.f32   q2, %[plusone]          \n\t"
> > > > > -        "vneg.f32   q3, q2                  \n\t"
> > > > > -        "vdup.f32   q4, %[scale]            \n\t"
> > > > > -        "vdup.u32   q5, %[mask]             \n\t"
> > > > > +        "vdup.f32   q1, %[scale]            \n\t"
> > > > >  
> > > > >          "1:                                 \n\t"
> > > > >          "vld1.32    {q0}, [%[src]]!         \n\t"
> > > > > -        "vmin.f32   q0, q0, q2              \n\t" /* clamp */
> > > > > -        "vmax.f32   q0, q0, q3              \n\t"
> > > > > -        "vmul.f32   q0, q0, q4              \n\t" /* scale */
> > > > > +        "vmul.f32   q0, q0, q1              \n\t" /* scale */
> > > > >          "vcvt.s32.f32 q0, q0, #16           \n\t" /* narrow */
> > >  
> > > > You removed clamping - what happens if there's need for clamping? (I'm
> > > > not very good at reading assembly.)
> > > 
> > > vrshrn does the narrowing int32->int16 (with saturation); the comment 
> > > should be moved one line down
> > 
> > The vcvt instruction converts floating-point numbers to fixed-point
> > numbers, with 16 bits in the integer part and 16 bits in the fractional
> > part, so most of the interesting stuff happens already in vcvt. How does
> > vcvt handle the situation where the float doesn't fit in the 16 bits
> > that are reserved for the integer part? Saturation or SIGFPE, or
> > something else? How is NaN handled? The reference[1] that I'm using
> > doesn't say anything about this...
> > 
> > You say that vrshrn does its thing with saturation. Since the integer
> > part of the fixed-point input is already 16-bits, there's not much need
> > for saturation. Only the rounding the fractional part can cause
> > overflow, so do you mean that if the rounding would cause overflow,
> > vrshrn uses truncation instead of rounding? (This is not specified in
> > the reference either.)
> > 
> > [1] http://infocenter.arm.com/help/topic/com.arm.doc.dui0204j/CIHFFGJG.html

> You never answered these questions, and the new patch version contains
> the same code. "vcvt.s32.f32 q0, q0, #16" converts four floats into four
> 16.16 fixed-point numbers. What happens if the input is greater than
> INT16_MAX?

here is some more detail:

vcvt.s32.f32 q0, q0, #16
saturates (this is indeed not documented), so the result is a 16.16
fixed-point number with a 16 bit integer part and a 16 bit fractional
part; an out-of-range input is clamped to 7fffffff or 80000000

the following
vrshrn.s32 d0, q0, #16
shifts 16 bits to the right and rounds according to the shifted-out
fractional part, but does NOT saturate; using it here is an error, the
correct instruction is
vqrshrn.s32 d0, q0, #16
which both rounds and saturates

I'll post a v3
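
For readers without NEON hardware, the two steps can be modeled in
portable scalar C. This is only a sketch, not the patch code: the
function names are made up for illustration, truncation toward zero in
the conversion is inferred from the 00004ccc and 00009999 results in the
test output, and returning 0 for NaN is an assumption that the test does
not exercise.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of "vcvt.s32.f32 qN, qN, #16":
 * float -> signed 16.16 fixed point, saturating to the int32 range. */
static int32_t model_vcvt_q16(float f)
{
    double scaled;

    if (f != f)                       /* NaN; returning 0 is an assumption */
        return 0;
    scaled = (double)f * 65536.0;     /* exact: scaling by a power of two */
    if (scaled >= 2147483648.0)
        return INT32_MAX;             /* saturate high */
    if (scaled <= -2147483648.0)
        return INT32_MIN;             /* saturate low */
    return (int32_t)scaled;           /* the cast truncates toward zero */
}

/* Hypothetical scalar model of one lane of "vqrshrn.s32 dN, qN, #16":
 * add half an LSB of the result, shift right by 16, saturate to int16. */
static int16_t model_vqrshrn16(int32_t x)
{
    /* 64 bits, so the rounding add cannot overflow */
    int64_t t = (int64_t)x + 32768;
    /* floor division; avoids right-shifting a negative value in C */
    int64_t r = (t >= 0) ? t / 65536 : -((-t + 65535) / 65536);

    if (r > INT16_MAX)
        return INT16_MAX;
    if (r < INT16_MIN)
        return INT16_MIN;
    return (int16_t)r;
}
```

The second function shows why the saturating variant matters: for
32767.5 the rounding add carries the value up to 32768, which no longer
fits in an int16 and has to be clamped to 32767.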

the test code below converts several values:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

#ifdef __arm__
#include <arm_neon.h>
#else
#include <xmmintrin.h>
#endif

# on ARM NEON
0.500 0 -- 00008000 1
-0.500 0 -- ffff8000 0
0.300 0 -- 00004ccc 0
0.600 1 -- 00009999 1
2.500 2 -- 00028000 3
3.500 4 -- 00038000 4
32000.500 32000 -- 7d008000 32001
33000.500 33000 -- 7fffffff 32767
-33000.500 -33000 -- 80000000 -32768
32767.500 32768 -- 7fff8000 32767

all values look reasonable; note that results are slightly different
compared to lrintf() or SSE due to different rounding:
NEON always rounds up on 0.5, while lrintf() rounds halfway cases to the
nearest even integer (in the default rounding mode), so there is a
maximum deviation of 1 in some rare cases

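int main(void) {
        /* halfway cases plus values around the int16 limits */
        float values[] = {0.5f, -0.5f, 0.3f, 0.6f, 2.5f, 3.5f, 32000.5f, 33000.5f, -33000.5f, 32767.5f};
        size_t i;

        for (i = 0; i < sizeof(values)/sizeof(values[0]); i++) {
                float f = values[i];
                /* reference result: lrintf() uses the current rounding mode */
                printf("%.3f %ld -- ", f, lrintf(f));
#ifdef __arm__
                float32x4_t x = vdupq_n_f32(f);
                int32x4_t y = vcvtq_n_s32_f32(x, 16);   /* float -> 16.16 fixed point */
                int16x4_t z = vqrshrn_n_s32(y, 16);     /* round, narrow, saturate to int16 */
                printf("%08x %d\n",
                        (unsigned) vgetq_lane_s32(y, 0),
                        vget_lane_s16(z, 0));
#else
                __m128 x = _mm_set_ss(f);
                /* SSE conversion for comparison on non-ARM hosts */
                printf("%d\n", _mm_cvt_ss2si(x));
#endif
        }

        return EXIT_SUCCESS;
}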

thanks, regards, p.

-- 

Peter Meerwald
+43-664-2444418 (mobile)