[Mesa-dev] IROUND, math errors, etc. (was: Re: proposed patch optimized F_TO_I for powerpc platforms)

Fri Jul 31 09:44:32 PDT 2015

On Fri, Jul 31, 2015 at 7:13 AM, Roland Scheidegger <sroland at vmware.com> wrote:
> CC mesa-dev.
>
> This looks good to me. I am starting to wonder though why we don't just
> use lrintf() and let the compiler sort it out (for x86 too).
> Though actually some quick experiments show that:
> - llvm's clang will always use libm lrintf call. Which then will do
> (x86_64) cvtss2si %xmm0,%rax as expected. Meaning the cost is probably
> twice as high as it could be due to the unnecessary library call.
> - gcc will also use the same library call. Unless you specify
> -fno-math-errno (or some more aggressive math optimizing stuff), in
> which case it will do the cvtss2si on its own. Which is fairly stupid,
> because this function doesn't set errno in any case, so it could be used
> independent of -fno-math-errno.
>
> Speaking of -fno-math-errno, why don't we use that in mesa? I know the
> fast math stuff can be problematic, but no one is _ever_ interested in
> math error numbers.
>
> Speaking of which, I'm not really sure why IROUND isn't doing the same.
> Yes it rounds away from zero, but I doubt that matters - would probably
> be better to match whatever rounding is used in hw (GL doesn't seem to
> specify tie-breaker rules for round to nearest afaict).
>
> FWIW IROUND along with even the 64bit sibling IROUND64 (and IROUND_POS)
> is not even really correct in any case. There exist floats where f +
> 0.5f will round up to the next integer incorrectly. e.g. the largest
> float smaller than 0.5f, 0.49999997f: if you add +0.5f, the exact sum
> lies right between the largest float smaller than 1.0f and 1.0f
> itself, and thus the hw will use the tie-breaker rule (round to
> nearest even for your typical hw with typical rounding mode set)
> making this 1.0, thus the rounded integer will be 1, which is just
> plain wrong no matter the round-to-nearest tie breaker rule.
> There are ways to fix it (the obvious one is to add 0.5 as double), but
> I don't think we should even try that, and assume lrintf can do a decent
> job on hw we care about (compiler not doing its job right is a pity but
> might not be too bad even if it uses lib call).

I've actually got a branch to get rid of F_TO_I (and I want to remove
IROUND as well) in favor of libm rounding functions.

I agree that we don't care about errno and traps and such, so I tried
a few things to get the code we want from rintf, etc. I tried marking
a wrapper around rintf with __attribute__((optimize("-ffast-math")))
but just today a gcc developer confirmed that this cannot work because
when the function is inlined it loses the optimization attribute. I'll
do some tests with -fno-math-errno and friends.

I'll finish this branch up very soon.