[Mesa-dev] [Freedreno] [RFC 0/4] freedreno: Move some compiler offset computations to NIR

Eduardo Lima Mitev elima at igalia.com
Mon Jan 28 08:32:01 UTC 2019


On 1/26/19 12:42 AM, Rob Clark wrote:
> On Fri, Jan 25, 2019 at 10:48 AM Eduardo Lima Mitev <elima at igalia.com> wrote:
>>
>> There are a bunch of instructions emitted in ir3_compiler_nir related to
>> offset computations for IO opcodes (ssbo, image, etc). This small series
>> explores the possibility of moving these instructions to NIR, where we
>> have higher chances of optimizing them.
>>
>> The series introduces a new, freedreno specific NIR pass,
>> 'ir3_nir_lower_sampler_io' (final name not set). The pass is executed
>> early in ir3_optimize_nir(), and the goal is to centralize all these
>> computations there, hoping that later NIR passes will produce better
>> code than what is currently emitted.
> 
> I can think of a few other things that would be interesting to lower
> to driver specific nir opcodes (imul and various lowering for tex
> instructions come to mind.. but probably also ubo and ssbo address
> calculation.. maybe it could even make sense for some of the single
> src alu instructions that translate into multiple ir3 instructions,
> not sure)..
> 

Yes, the plan is to abstract to NIR whatever brings us a benefit in
instruction stats. There is also the question of just simplifying the
backend compiler, provided we don't hurt produced code.

> Are you thinking about having separate passes for each?  I guess at
> least for alu instructions we might be able to use nir_algebraic so
> having things split up might be easier.
> 

I haven't thought too much about this yet, but it seems to make sense
having at least 2 passes, one for I/O and one for ALUs.

>> So far, we have just implemented byte-offset computation for image store
>> and atomics. This seemed like a good first target given the amount of
>> instructions being emitted for it by the backend.
>>
>> This is an RFC series because there are a few open questions, but we
>> wanted to gather feedback early, in case this effort turns out not to
>> be worth it. We also hope that somebody else will try it against other
>> shaders and on other gens (we have only tried this on a5xx).
>>
>> * We have so far been unable to see any improvement in generated code
>> (nor any penalty). shader-db has not been especially useful. Few
>> shaders there exercise image store or image atomic ops, and of those
>> that do, most require higher versions of GLSL than what freedreno
>> supports, so they get skipped. The few that actually run don't
>> show any meaningful difference.
> 
> I guess it would be easy enough to construct shaders that would
> benefit from this, but maybe that is cheating..
> 
> Possibly UBO and SSBO is a better target, I guess there you might be
> more likely to see patterns of access of successive elements (ie.
> foo[idx], foo[idx+1], etc)?
> 

I took a first stab at SSBO load/store/atomic, where the offset is
divided by 4 in the backend, but was bitten by IR3_STGB requiring both
the original byte-offset and the dword-offset (in src1 and src2
respectively). So trivially emitting a nir_shr on the offset didn't buy
us anything. Revisiting this is in my backlog: the idea is to turn the
offset into a 2-component reg so we can hold both the original
byte-offset and the offset divided by 4.
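
To make the 2-component idea concrete, here is a rough toy model in
Python. It is only a sketch, not Mesa code: the names SsboOffset and
lower_ssbo_offset are hypothetical, and it assumes the offset is an
unsigned byte count, so the divide by 4 is a plain shift right by 2.

```python
# Toy model of the 2-component offset idea; this is NOT Mesa code.
# SsboOffset and lower_ssbo_offset are hypothetical names.
from typing import NamedTuple


class SsboOffset(NamedTuple):
    byte_offset: int   # original offset, in bytes
    dword_offset: int  # same offset, in 32-bit words


def lower_ssbo_offset(byte_offset: int) -> SsboOffset:
    """Return both forms of the offset, as a 2-component value would."""
    if byte_offset < 0:
        raise ValueError("offset must be non-negative")
    # Assumption: unsigned offset, so dividing by 4 is a shift right by 2.
    return SsboOffset(byte_offset, byte_offset >> 2)


if __name__ == "__main__":
    off = lower_ssbo_offset(24)
    print(off.byte_offset, off.dword_offset)
```

With both components computed in one place, a consumer that needs the
two values (as described above for IR3_STGB) would read them from the
same value instead of recomputing the shift.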

> Anyways, since we don't try to do (and I'd rather not do) any sort of
> CSE post nir->ir3 I think starting to introduce more ir3 specific
> nir->nir lowering seems like a thing we need, so I'm pretty happy that
> someone is looking at this :-)
> 

Thanks, that's encouraging. Let's see how far we can get :).

Eduardo

> BR,
> -R
> 
>> Other shaders picked from test suites are simple enough not to
>> produce any difference in code either.
>>
>> There is still on-going work looking for cases where the pass helps
>> instruction stats, whether by writing custom shaders or by porting
>> complex shaders from shader-db to run on GLES 310.
>>
>> There is, though, an open question here as to whether moving backend
>> code to NIR is a benefit in and of itself.
>>
>> * The series adds a nir_op_imad opcode that didn't exist before, and
>> that is perhaps not generally useful even for freedreno outside this
>> pass, because it maps to IR3_MAD_S24, which is probably not suitable
>> for generic integer multiply-add.
>>
>> * The pass currently has 2 alternative code-paths to emit the
>> multiplication by the bytes-per-pixel of an image format. In one
>> case, since this value can be obtained at compile time, it is
>> emitted as an immediate by nir_imul_imm. The other alternative is
>> emitting an nir_imul with an SSA value that will map to
>> image_dims[0] at shader runtime.
>>
>> The former case is uglier but produces better code (a single SHL
>> instruction), whereas the latter involves a generic imul, for which
>> the backend emits a lot of code to cover for overflow.
>>
>> The doubt here is whether we should introduce a (lower precision)
>> version of imul that maps directly to IR3_IMUL_S.
>>
>>
>> A live (WIP) tree of the series can be found at:
>> <https://gitlab.freedesktop.org/elima/mesa/commits/wip/fd-compiler-io>
>>
>> We plan to continue moving computations to the pass if we see
>> good opportunities.
>>
>> Feedback very welcome,
>>
>> cheers,
>> Eduardo
>>
>> Eduardo Lima Mitev (4):
>>   nir: Add a new intrinsic 'load_image_stride'
>>   nir: Add a new ALU nir_op_imad
>>   ir3/nir: Add a new pass 'ir3_nir_lower_sampler_io'
>>   ir3: Use ir3_nir_lower_sampler_io pass
>>
>>  src/compiler/nir/nir_intrinsics.py           |   2 +
>>  src/compiler/nir/nir_opcodes.py              |   1 +
>>  src/freedreno/Makefile.sources               |   1 +
>>  src/freedreno/ir3/ir3_compiler_nir.c         |  61 ++--
>>  src/freedreno/ir3/ir3_nir.c                  |   1 +
>>  src/freedreno/ir3/ir3_nir.h                  |   1 +
>>  src/freedreno/ir3/ir3_nir_lower_sampler_io.c | 349 +++++++++++++++++++
>>  7 files changed, 383 insertions(+), 33 deletions(-)
>>  create mode 100644 src/freedreno/ir3/ir3_nir_lower_sampler_io.c
>>
>> --
>> 2.20.1
>>
>> _______________________________________________
>> Freedreno mailing list
>> Freedreno at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/freedreno
