[Mesa-dev] RFC: tgsi opcodes for 32x32 muls with 64bit results
Marek Olšák
maraeo at gmail.com
Fri May 3 03:31:58 PDT 2013
FWIW, this maps nicely to r600, which also has separate instructions
for the low and high 32 bits. As to what option is better, it really
depends on whether shading languages and OpenCL expose the
instructions directly through functions, or whether they just have
64-bit integers.
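
To make the two options concrete, here is a rough C sketch of what the separate low/high instructions would compute. The helper names (mul_lo, umul_hi, imul_hi) are made up for illustration and are not Mesa or r600 functions, and the signed variant assumes the compiler implements >> on negative values as an arithmetic shift, which GCC and Clang do:

```c
#include <stdint.h>

/* Hypothetical helpers, named for illustration only. */

/* Low 32 bits: identical for signed and unsigned operands. */
static uint32_t mul_lo(uint32_t a, uint32_t b)
{
    return a * b; /* wraps modulo 2^32 */
}

/* High 32 bits of the unsigned 64-bit product. */
static uint32_t umul_hi(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* High 32 bits of the signed 64-bit product.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
static int32_t imul_hi(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}
```

With the same bit pattern 0xFFFFFFFF in both operands, mul_lo gives 1 either way, but umul_hi gives 0xFFFFFFFE while imul_hi (reading the pattern as -1) gives 0. A language with real 64-bit integers would just do the widening multiply once and split the result.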
Marek
On Fri, May 3, 2013 at 1:29 AM, Roland Scheidegger <sroland at vmware.com> wrote:
> Currently, there's no way to get the high bits of a 32x32
> signed/unsigned integer multiplication with tgsi.
> However, all of d3d10, OpenGL, and OpenCL support that, so we need it as
> well.
> There are essentially two ways it could be done:
> - a 2-destination instruction returning both high and low bits (this is
> how it looks in d3d10 and glsl)
> - use the existing umul for the low bits and have another instruction
> for the high bits (this is how it looks in opencl)
>
> There are other possibilities, but these looked like they'd match both
> the APIs and the HW reasonably well (with the exception of things like sse2,
> which would prefer 2x2 32bit inputs and return 2x64bit as one reg...).
>
> Actually it's two new instructions, because unlike for the low bits, it
> matters for the high bits whether the source operands are signed or unsigned.
>
> Personally I'm favoring two separate instructions for low and high bits
> to not have to deal with multi-destination instructions, but if someone
> makes a strong case for one returning both low and high bits I could be
> convinced otherwise. I think, though, that two instructions match most hw
> very well (with the exception of software renderers and possibly intel
> graphics, but then a good backend could certainly recognize this).
>
> So here's what the docs would say about these instructions:
>
>
> .. opcode:: IMUL_HI - Signed Integer Multiply High Bits
>
> The high 32 bits of the 64-bit product of two signed 32-bit integers are returned.
>
> .. math::
>
> dst.x = (src0.x \times src1.x) >> 32
>
> dst.y = (src0.y \times src1.y) >> 32
>
> dst.z = (src0.z \times src1.z) >> 32
>
> dst.w = (src0.w \times src1.w) >> 32
>
>
> .. opcode:: UMUL_HI - Unsigned Integer Multiply High Bits
>
> The high 32 bits of the 64-bit product of two unsigned 32-bit integers are returned.
>
> .. math::
>
> dst.x = (src0.x \times src1.x) >> 32
>
> dst.y = (src0.y \times src1.y) >> 32
>
> dst.z = (src0.z \times src1.z) >> 32
>
> dst.w = (src0.w \times src1.w) >> 32
>
>
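
As a reading aid for the two opcode descriptions above, here is a minimal per-channel sketch in C, assuming the product is formed in 64 bits before the shift. The function names and the plain 4-element arrays are hypothetical stand-ins for TGSI registers, write masks are ignored, and the signed version assumes an arithmetic right shift on negative values:

```c
#include <stdint.h>

/* Hypothetical reference model of UMUL_HI: x, y, z, w are channels 0..3. */
static void umul_hi_vec4(uint32_t dst[4],
                         const uint32_t src0[4],
                         const uint32_t src1[4])
{
    for (int c = 0; c < 4; c++) {
        /* Widen to 64 bits first, then keep the upper half. */
        uint64_t product = (uint64_t)src0[c] * (uint64_t)src1[c];
        dst[c] = (uint32_t)(product >> 32);
    }
}

/* Hypothetical reference model of IMUL_HI.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
static void imul_hi_vec4(int32_t dst[4],
                         const int32_t src0[4],
                         const int32_t src1[4])
{
    for (int c = 0; c < 4; c++) {
        int64_t product = (int64_t)src0[c] * (int64_t)src1[c];
        dst[c] = (int32_t)(product >> 32);
    }
}
```

The only difference between the two is whether the operands are sign-extended or zero-extended before the multiply, which is why the high bits need two opcodes while the low bits can share the existing umul.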
> Roland
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev