[Mesa-dev] [PATCH 1/2] nir: add nir_opt_move_load_ubo() optimization pass

Tue Mar 13 13:07:31 UTC 2018

On 03/12/2018 05:21 PM, Marek Olšák wrote:
> On Mon, Mar 12, 2018 at 6:18 AM, Samuel Pitoiset
> <samuel.pitoiset at gmail.com> wrote:
>>
>>
>> On 03/11/2018 04:41 PM, Marek Olšák wrote:
>>>
>>> On Thu, Mar 8, 2018 at 5:48 PM, Ian Romanick <idr at freedesktop.org> wrote:
>>>>
>>>> On 03/08/2018 06:50 AM, Samuel Pitoiset wrote:
>>>>>
>>>>> This pass moves load UBO operations just before their first use,
>>>>> loosely based on nir_opt_move_comparisons.
>>>>
>>>>
>>>> If I'm reading this correctly, it moves UBO loads closer to the first
>>>> use in the same block.  My assumption is the benefit in the next patch
>>>> occurs because live ranges are smaller.  It seems like this could also
>>>> hurt performance since it may be harder for the scheduler to hide the
>>>> latency of the load when register pressure is not an issue.  Have you
>>>> measured performance of running apps to see if this is an issue?
>>>>
>>>> I'm mostly asking because Jason had a series for global code motion that
>>>> does, in some cases, the opposite of this patch by moving UBO loads up
>>>> to earlier blocks.
>>>
>>>
>>> The pass is OK for LLVM, because LLVM does CSE across basic blocks,
>>> and it also does instruction scheduling within a block.
>>>
>>> radeonsi/tgsi does the same thing: it loads uniforms from memory at
>>> every use. It sounds inefficient, but we found out that it's the best
>>> thing to do with LLVM. LLVM can move loads away from the use, but it
>>> doesn't move loads close to the use.
>>
>>
>> Exactly, RadeonSI does something similar. Though the shader-db result posted
>> by Timothy doesn't look very good for NIR, what do you think?
>>
> 
> The results are OK.

Okay, can someone review the series then?

> 
> Marek
>