[Mesa-dev] [PATCH 03/10] i965/fs: Add vector float immediate infrastructure.

Fri Oct 24 11:40:41 PDT 2014

Matt Turner <mattst88 at gmail.com> writes:

> On Fri, Oct 24, 2014 at 10:47 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>> Matt Turner <mattst88 at gmail.com> writes:
>>
>>> On Fri, Oct 24, 2014 at 1:34 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>>> Matt Turner <mattst88 at gmail.com> writes:
>>>>
>>>>> ---
>>>>>  src/mesa/drivers/dri/i965/brw_fs.cpp           | 20 ++++++++++++++++++++
>>>>>  src/mesa/drivers/dri/i965/brw_fs.h             |  1 +
>>>>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  3 +++
>>>>>  3 files changed, 24 insertions(+)
>>>>>
>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>>>> index 983e8db..ffe0a5b 100644
>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>>>> @@ -44,6 +44,7 @@ extern "C" {
>>>>>  #include "brw_context.h"
>>>>>  #include "brw_eu.h"
>>>>>  #include "brw_wm.h"
>>>>> +#include "brw_packed_float.h"
>>>>>  }
>>>>>  #include "brw_fs.h"
>>>>>  #include "brw_cfg.h"
>>>>> @@ -581,6 +582,18 @@ fs_reg::fs_reg(uint32_t u)
>>>>>     this->width = 1;
>>>>>  }
>>>>>
>>>>> +/** Vector float immediate value constructor. */
>>>>> +fs_reg::fs_reg(uint8_t vf[4])
>>>>> +{
>>>>> +   init();
>>>>> +   this->file = IMM;
>>>>> +   this->type = BRW_REGISTER_TYPE_VF;
>>>>> +   this->fixed_hw_reg.dw1.ud = (vf[0] <<  0) |
>>>>> +                               (vf[1] <<  8) |
>>>>> +                               (vf[2] << 16) |
>>>>> +                               (vf[3] << 24);
>>>>> +}
>>>>> +
>>>>
>>>> IMHO it would be a lot more convenient to have this function take four
>>>> float arguments and carry out the conversion to 8-bit here.  I suspect
>>>> that in most cases the arguments of this constructor (and the same goes
>>>> for src_reg) will be known at compile time so we could just add an
>>>> assertion to check that the conversion was carried out exactly.
>>>
>>> That would work if we're constructing VF immediates from known good
>>> values, but in cases like in an optimization, you're always going to
>>> have to check whether the values are representable in VF. If they're
>>> not, the pass will have to fail.
>>>
>>
>> Yeah, but isn't this interface unnecessarily annoying to use in the
>> cases that fit the "known good values" category? (all uses in your
>> series and in my previous code that makes use of VF) -- It led you write
>> code like this:
>>
>> | +   uint8_t vf[4] = {0x0, 0x60, 0x70, 0x78};
>> |[...]
>> | +   emit(MOV(shift, src_reg(vf)));
>>
>> ...which is pretty hard to decode.  Don't you find this much easier to
>> understand? (even if it means we may have to introduce a second
>> constructor at some point in the future)
>>
>> |   emit(MOV(shift, src_reg(0, 8, 16, 24)));
>
> There are two separate changes you're suggesting -- pass 4 arguments
> rather than an array, and pass integer/float literals rather than VFs.
> The first suggestion seems fine to me. I agree that passing float
> literals and having the constructor do the conversion reads nicer. I'm
> not convinced that it very important though. Just above the first
> line, there's a comment that explains what's going on:
>
>    /* Instead of splitting the 32-bit integer, shifting, and ORing it back
>     * together, we can shift it by <0, 8, 16, 24>. The packed integer immediate
>     * is not suitable to generate the shift values, but we can use the packed
>     * vector float and a type-converting MOV.
>     */
>    uint8_t vf[4] = {0x0, 0x60, 0x70, 0x78};
>
> I might suggest using some defines (i.e., VF_0, VF_8, ...) except that
> I know you actually replaced some code that did that with calls to
> int_to_float8 that are evaluated at compile time, potentially
> recursively (and those were just VF_ZERO/VF_ONE arguments).
>
> Yeah, we could add another constructor that converts arguments for us,
> but it doesn't really seem that important to me...?
>
> What would it do when the things you pass it aren't representable?

I'd vote for adding an 'assert(brw_float_to_vf(arg0) >= 0 && ...)' in
the src_reg constructor.

> Now we're trading one piece of a priori information for another.

The difference is that if we handle the conversion algorithmically and
something goes wrong you get an assertion error with a clear indication
of what went wrong.  If the conversion has to be handled by the user the
likelihood increases that it will be done incorrectly and at the same
time you have to pay a higher price if you mess up because now you get
misrendering instead of an assertion failure which is almost always
more difficult to debug.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20141024/dad0eb0c/attachment.sig>