[Mesa-dev] [PATCH 2/5] i965/fs: Lower 32x32 bit multiplication on BXT.

Francisco Jerez currojerez at riseup.net
Tue Aug 11 02:58:37 PDT 2015


Neil Roberts <neil at linux.intel.com> writes:

> Are you sure this patch is necessary? The documentation for the multiply
> instruction on BDW+ says:
>
> SourceType : 	*D 	
> DestinationType : 	*D 	
> Project : 	EXCLUDE(CHV) 
>
> This to me implies that it should work on BXT because it doesn't say
> EXCLUDE(BXT).
>

In fact both CHV and BXT *can* support 32x32 multiplication, that's not
really the reason that motivated this patch -- The problem is that 32x32
multiplication has QWORD execution type which has several restrictions
on CHV and BXT, the annoying one is:

"CHV, BXT

When source or destination datatype is 64b or operation is integer DWord
multiply, regioning in Align1 must follow these rules:

    Source and Destination horizontal stride must be aligned to the same
    qword.

    Example:
[...]
    // mul (4) r10.0<2>:d r11.0<8;4,2>:d r12.0<8;4,2>:d // Source and Destination stride must be 2 since the execution type is Qword."

So it sounds like we could use the native 32x32 multiply with some
additional effort to pass the arguments with the correct stride, but
it's not clear to me whether the solution would be any more better than
splitting up the multiply into 32x16 multiplies, so doing the same as in
CHV and pre-Gen8 seemed like the most KISS solution for now.

> I made a little test case and ran it on my BXT and it seems to work even
> without this patch. I looked at the assembler source and it is
> definitely doing a MUL instruction with D types for the dst and two
> sources.
>
> https://github.com/bpeel/piglit/blob/test-integer-multiply/tests/general/mult32.c
>
Hmm, I guess the docs could be wrong, but I'm not sure what the
consequences would be of violating the alignment rule, I guess the
failure may be non-deterministic.  What stepping of BXT did you test
this on?

> Regards,
> - Neil
>
> Francisco Jerez <currojerez at riseup.net> writes:
>
>> AFAIK BXT has the same annoying alignment limitation as CHV on the
>> source register regions of 32x32 bit MULs, give it the same treatment.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 244f299..fc9f007 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -3126,9 +3126,9 @@ fs_visitor::lower_integer_multiplication()
>>     bool progress = false;
>>  
>>     /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit operation
>> -    * directly, but Cherryview cannot.
>> +    * directly, but CHV/BXT cannot.
>>      */
>> -   if (devinfo->gen >= 8 && !devinfo->is_cherryview)
>> +   if (devinfo->gen >= 8 && !devinfo->is_cherryview && !devinfo->is_broxton)
>>        return false;
>>  
>>     foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
>> -- 
>> 2.4.6
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150811/3c046541/attachment.sig>


More information about the mesa-dev mailing list