[Mesa-dev] [PATCH 04/15] i965/fs: Get rid of reladdr

Francisco Jerez currojerez at riseup.net
Thu Dec 17 09:04:42 PST 2015


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Dec 14, 2015 6:24 AM, "Francisco Jerez" <currojerez at riseup.net> wrote:
>>
>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>
>> > On Dec 11, 2015 5:44 AM, "Francisco Jerez" <currojerez at riseup.net>
> wrote:
>> >>
>> >> Jason Ekstrand <jason at jlekstrand.net> writes:
>> >>
>> >> > On Dec 10, 2015 6:58 AM, "Francisco Jerez" <currojerez at riseup.net>
>> > wrote:
>> >> >>
>> >> >> Jason Ekstrand <jason at jlekstrand.net> writes:
>> >> >>
>> >> >> > We aren't using it anymore.
>> >> >>
>> >> >> It seems useful to me to be able to represent indirect access as
> part
>> > of
>> >> >> any instruction source or destination register.
>> >> >>
>> >> >> The following:
>> >> >>
>> >> >> | mov_indirect g0, g1, a0
>> >> >> | foo g2, g0
>> >> >>
>> >> >> and the converse case with indirect destination offset (which you
> don't
>> >> >> seem to represent currently) can be implemented by the hardware more
>> >> >> efficiently using a single instruction in certain cases.  The
> current
>> > IR
>> >> >> is able to represent what the hardware can do, but supporting the
>> >> >> MOV_INDIRECT instruction only would force us to keep the indirection
>> >> >> separate from the instruction that uses it, so it seems like a less
>> >> >> expressive representation to me than the current approach, unless
>> > you're
>> >> >> willing to add _INDIRECT variants of most hardware opcodes.
>> >> >
>> >> > Yes and, mostly, no.  Yes, you can put an indirect on almost anything
>> > but
>> >> > it has substantial restrictions:
>> >> >
>> >> Yes, I'm aware of the restrictions of indirect addressing on Gen
>> >> hardware.  The fact that reladdr can represent addressing modes which
>> >> aren't directly implemented in the hardware doesn't invalidate the
>> >> abstraction, nor implies that optimization and translation passes must
>> >> be made aware of such restrictions.  It just means that our abstraction
>> >> is a superset of the hardware indirect addressing scheme, which is fine
>> >> given a lowering pass to convert indirect addressing into a legal form.
>> >>
>> >> Your proposal instead goes the opposite direction and replaces the
>> >> preexisting abstraction with a new abstraction which can only represent
>> >> a tiny subset of the functionality of hardware instructions, which I
>> >> don't think is acceptable for a low-level IR.
>> >>
>> >> > 1) Destination indirects must be uniform (I'm only 95% sure this is
> the
>> >> > case)
>> >> >
>> >> > 2) We only have 8 address subregisters on HSW and prior and 16 on BDW
>> > which
>> >> > means:
>> >> >
>> >>
>> >> That's the kind of thing that SIMD lowering would be able to take care
>> >> of easily.
>> >
>> > Yes, and we use it for MOV_INDIRECT.
>> >
>> Right, and there's no reason why it couldn't be used for reladdr in the
>> same way to handle the above restrictions.
>>
>> >> >     a) All indirects must be uniform OR
>> >> >
>> >> >     b) We must be on Broadwell, have only two indirects, and split
> the
>> >> > instruction OR
>> >> >
>> >> >     c) We can only have one indirect (and still split the
> instruction on
>> >> > HSW)
>> >> >
>> >> > So, yes, "everthing can be uniform", but not in real life.
>> >>
>> >> > The reladdr abstraction, while maybe ok for a high-level IR, doesn't
>> >> > accurately represent the hardware at all because it can't represent
>> >> > any of the restrictions without piles of helper functions and checks.
>> >>
>> >> The reladdr abstraction can represent the full flexibility of the
>> >> hardware, while MOV_INDIRECT cannot, that makes reladdr a more accurate
>> >> low-level representation of the program.  For that reason I'd argue the
>> >> exact opposite: MOV_INDIRECT would be a great abstraction and a welcome
>> >> simplification for a high-level IR (e.g. NIR), in which nothing is lost
>> >> by sacrificing the expressiveness of individual instructions in favour
>> >> of instruction composition, because the back-end can always combine
>> >> multiple high-level instructions into a single low-level instruction
>> >> where the hardware ISA allows, as long as the low-level IR is able to
>> >> represent the full functionality of hardware instructions (IOW the
>> >> mapping between low-level IR instructions and hardware instructions is
>> >> surjective), which is further from being the case after this series.
>> >
>> > Flexibility and expressiveness isn't always a good thing.
>>
>> Expressiveness is a good thing as long as it's required to represent the
>> capabilities of the hardware.
>
> As long as it's required to represent those capabilities that we care
> about, yes.  However, the hardware can do many things that either aren't
> useful for us or take so much effort for the little benefit you gain, that
> it's not worth it.  I'm arguing that saving a single MOV in the few places
> that we do in directs isn't worth the complexity.
>
>> > Yes, reladdr is more "expressive", but it's not getting used because
>> > that expressiveness makes it a pain.
>>
>> AFAIK nobody has bothered to implement it completely down to the
>> back-end generator so it's not a wonder that it's not extensively used.
>> I'd like to see some evidence that it would actually be a pain
>> (e.g. some proof of concept implementation) before we give up and commit
>> to an inherently more limited model.  If you already have such a proof
>> of concept, please share it.  If you don't and are fully convinced that
>> it will be a pain I suggest you hold up this series for a while and let
>> me write the proof of concept -- Doing something with the expectation
>> that it's going to be a pain is the best way to make sure it actually
>> becomes a pain.  Worst-case scenario you're right and it does become a
>> pain, at which point I come back and review this series.  Best-case
>> scenario you're wrong and we don't give up useful functionality of the
>> IR based on an expectation.
>
> Do I have a proof-of-concept in code, no.  However, I've run through it in
> my head and I have a pretty good idea what it would look like.

Ah, that's what I guessed.  I don't think that your previous claim that
supporting indirect addressing based on the current reladdr approach
would require rewriting optimization passes to be recursive demonstrates
a particularly clear notion of what it could look like.

> You are free to go off and do it if you don't believe me, but I don't
> really want to hold things up while you do.  If you go off and
> demonstrate that reladdr really is better, its not that hard to
> revert.
>
Whether it will be hard to revert or not depends on how much of the code
added in the meantime assumes that reladdr is no longer a thing, and you
cannot give any guarantees about that.  I don't have any obligation to
spend time substantiating or refuting your assumption, and I wont if
your plan is to make things unnecessarily difficult for me in the
meantime.

> One other side-note: One of the substantial advantages to this series is
> the removal of the uniform size arrays.  If you do decide to make a
> different prototype, please make that a major design point.  When we get
> uniforms in from SPIR-V they may come in std140 packed and with just an
> offset and maybe range.  The size arrays are implicitly tied to variables.
> I want them gone.
>
>> > You could say it is getting used, but it's current use is just an
>> > intermediate to something less expressive than MOV_INDIRECT.  In that
>> > sense, we are expressing *more* with MOV_INDIRECT than we were before
>> > since we are also expressing in directing on registers and not just
>> > pull constants.
>> >
>> Yes, but MOV_INDIRECT is a dead end in the long-term, and the same
>> functionality you implement here could have been implemented based on
>> the preexisting model.
>>
>> > Also, while reladdr is "expressive", it doesn't actually represent the
>> > hardware.  If we wanted that, we would expose the address register file
>> > (which I did consider doing).  However, that comes with it's own set of
>> > issues.
>> >
>> It represents the hardware more accurately than MOV_INDIRECT in the
>> following sense: It allows us to represent a wide subset of the indirect
>> addressing capabilities of the hardware down to the generator, on any
>> instruction, source or destination we can possibly address indirectly.
>> MOV_INDIRECT on the other hand allows us to represent a MOV instruction
>> with indirectly-addressed source *only*.
>>
>> >> > The MOV_INDIRECT instruction, on the other hand, is something we can
>> >> > always do because it falls in the "one indirect" case above.  The
>> >> > reladdr abstraction also suffers from the recursive reladdr problem
>> >> > which MOV_INDIRECT neatly side-steps by design.
>> >>
>> >> I don't think the reladdr "problem" requires any fundamentally more
>> >> difficult logic in optimization passes than what you need in order to
>> >> handle MOV_INDIRECT instructions reasonably.
>> >
>> > No, it just requires rewriting them because iterating over sources just
>> > became recursive.
>> >
>> That's quite an overstatement, most optimization passes only need to
>> care about the top-level register of a source or destination, because
>> wherever a reladdr is present it's not statically determined so you're
>> unlikely to be able to optimize through it anyway regardless of whether
>> you recurse or not.  Cases where you actually want to iterate over each
>> register used by an instruction are a minority and don't require
>> fundamentally changing the optimization pass, the iteration can be
>> factored out as in the straightforward for_each_use() helper I
>> originally implemented to solve a different problem but which happens to
>> handle arbitrarily-nested reladdr as one would expect.
>>
>> >> > We can't even use reladdr past initial NIR -> FS stage because none
> of
>> >> > the optimization passes know what to do with it (and no, I don't want
>> >> > to add reladdr support to any of them).
>> >> >
>> >> > Yes, removing reladdr makes the IR "less expressive", but we're not
>> > really
>> >> > using that expressiveness nor is reladdr the abstraction we want.
> If we
>> >> > want to start taking more advantage of relative addressing in the
>> > future,
>> >> > we can come up with a better abstraction then.
>> >> >
>> >> > --Jason
>> >> >
>> >> >> > ---
>> >> >> >  src/mesa/drivers/dri/i965/brw_fs.cpp  | 7 +------
>> >> >> >  src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 +----
>> >> >> >  2 files changed, 2 insertions(+), 10 deletions(-)
>> >> >> >
>> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> >> > index 7cc03c5..786c5fb 100644
>> >> >> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> >> > @@ -433,7 +433,6 @@ fs_reg::fs_reg(struct ::brw_reg reg) :
>> >> >> >  {
>> >> >> >     this->reg_offset = 0;
>> >> >> >     this->subreg_offset = 0;
>> >> >> > -   this->reladdr = NULL;
>> >> >> >     this->stride = 1;
>> >> >> >     if (this->file == IMM &&
>> >> >> >         (this->type != BRW_REGISTER_TYPE_V &&
>> >> >> > @@ -448,7 +447,6 @@ fs_reg::equals(const fs_reg &r) const
>> >> >> >  {
>> >> >> >     return (this->backend_reg::equals(r) &&
>> >> >> >             subreg_offset == r.subreg_offset &&
>> >> >> > -           !reladdr && !r.reladdr &&
>> >> >> >             stride == r.stride);
>> >> >> >  }
>> >> >> >
>> >> >> > @@ -4716,9 +4714,7 @@
>> > fs_visitor::dump_instruction(backend_instruction
>> >> > *be_inst, FILE *file)
>> >> >> >           break;
>> >> >> >        case UNIFORM:
>> >> >> >           fprintf(file, "u%d", inst->src[i].nr +
>> >> > inst->src[i].reg_offset);
>> >> >> > -         if (inst->src[i].reladdr) {
>> >> >> > -            fprintf(file, "+reladdr");
>> >> >> > -         } else if (inst->src[i].subreg_offset) {
>> >> >> > +         if (inst->src[i].subreg_offset) {
>> >> >> >              fprintf(file, "+%d.%d", inst->src[i].reg_offset,
>> >> >> >                      inst->src[i].subreg_offset);
>> >> >> >           }
>> >> >> > @@ -4829,7 +4825,6 @@
>> >> > fs_visitor::get_instruction_generating_reg(fs_inst *start,
>> >> >> >  {
>> >> >> >     if (end == start ||
>> >> >> >         end->is_partial_write() ||
>> >> >> > -       reg.reladdr ||
>> >> >> >         !reg.equals(end->dst)) {
>> >> >> >        return NULL;
>> >> >> >     } else {
>> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> > b/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> >> > index c3eec2e..e4f20f4 100644
>> >> >> > --- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> >> > @@ -58,8 +58,6 @@ public:
>> >> >> >      */
>> >> >> >     int subreg_offset;
>> >> >> >
>> >> >> > -   fs_reg *reladdr;
>> >> >> > -
>> >> >> >     /** Register region horizontal stride */
>> >> >> >     uint8_t stride;
>> >> >> >  };
>> >> >> > @@ -136,8 +134,7 @@ component(fs_reg reg, unsigned idx)
>> >> >> >  static inline bool
>> >> >> >  is_uniform(const fs_reg &reg)
>> >> >> >  {
>> >> >> > -   return (reg.stride == 0 || reg.is_null()) &&
>> >> >> > -          (!reg.reladdr || is_uniform(*reg.reladdr));
>> >> >> > +   return (reg.stride == 0 || reg.is_null());
>> >> >> >  }
>> >> >> >
>> >> >> >  /**
>> >> >> > --
>> >> >> > 2.5.0.400.gff86faf
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > mesa-dev mailing list
>> >> >> > mesa-dev at lists.freedesktop.org
>> >> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151217/721c8ed8/attachment.sig>


More information about the mesa-dev mailing list