[Mesa-dev] [PATCH 1/2] i965/vec4: Lower integer multiplication after optimizations.
Francisco Jerez
currojerez at riseup.net
Thu Apr 28 23:07:33 UTC 2016
Matt Turner <mattst88 at gmail.com> writes:
> On Mon, Apr 18, 2016 at 5:18 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> On Mon, Apr 18, 2016 at 5:08 PM, Ian Romanick <idr at freedesktop.org> wrote:
>>> On 04/18/2016 04:14 PM, Matt Turner wrote:
>>>> Analogous to commit 1e4e17fbd in the i965/fs backend.
>>>>
>>>> Because the copy propagation pass in the vec4 backend is strictly local,
>>>> we look at the immediate values coming from NIR and emit the multiplies
>>>> we need directly. If the copy propagation pass becomes smarter in the
>>>> future, we can reduce the nir_op_imul case in brw_vec4_nir.cpp to a
>>>> single multiply.
>>>>
>>>> total instructions in shared programs: 7082311 -> 7081953 (-0.01%)
>>>> instructions in affected programs: 59581 -> 59223 (-0.60%)
>>>> helped: 293
>>>>
>>>> total cycles in shared programs: 65765712 -> 65764796 (-0.00%)
>>>> cycles in affected programs: 854112 -> 853196 (-0.11%)
>>>> helped: 154
>>>> HURT: 73
>>>> ---
>>>> src/mesa/drivers/dri/i965/brw_vec4.cpp | 67 ++++++++++++++++++++++++++++++
>>>> src/mesa/drivers/dri/i965/brw_vec4.h | 1 +
>>>> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 48 +++++++++------------
>>>> 3 files changed, 88 insertions(+), 28 deletions(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>>> index b9cf3f6..1644d4d 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>>> @@ -1671,6 +1671,71 @@ vec4_visitor::lower_minmax()
>>>> return progress;
>>>> }
>>>>
>>>> +bool
>>>> +vec4_visitor::lower_integer_multiplication()
>>>> +{
>>>> + bool progress = false;
>>>> +
>>>> + foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
>>>> + const vec4_builder ibld(this, block, inst);
>>>> +
>>>> + if (inst->opcode == BRW_OPCODE_MUL) {
>>>> + if (inst->dst.is_accumulator() ||
>>>> + (inst->src[1].type != BRW_REGISTER_TYPE_D &&
>>>> + inst->src[1].type != BRW_REGISTER_TYPE_UD))
>>>> + continue;
>>>> +
>>>> + /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit
>>>> + * operation directly, but CHV/BXT cannot.
>>>> + */
>>>> + if (devinfo->gen >= 8 &&
>>>> + !devinfo->is_cherryview && !devinfo->is_broxton)
>>>> + continue;
>>>
>>> Shouldn't this whole method just bail if we're Gen >= 8 and !CHV and
>>> !BXT? Or does this structure simplify future changes?
>>
>> Oh, I hadn't noticed.
>>
>> The FS code was originally as you suggest, with the function returning
>> early under those conditions. Curro changed that in commit 2e731264382
>> in order to add lowering support for the multiply-high instruction on
>> all platforms. We may want to do that in the vec4 backend as well.
>>
>> The other thing I need to fix is Cherryview multiplications, where we
>> need to change the type of src1 to UW. I'm not sure if it's better to
>> do that here, or at a lower level. Maybe in brw_MUL itself since
>> that's called in a few places...
>>
>> Depending on whether people think that code should go here or
>> elsewhere, I'll move the block to the beginning of the function.
>
> Ken, Curro: any opinion where we should change src1's type to UW on CHV/BXT?
I guess it would make sense to change the types as soon as the multiply
by UD is lowered to multiplies by UW? It may still make sense to assert
in the generator that the types coming in are supported by the hardware.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160428/52600c2b/attachment.sig>
More information about the mesa-dev
mailing list