[Mesa-dev] [PATCH 04/15] i965/fs: Get rid of reladdr
Jason Ekstrand
jason at jlekstrand.net
Thu Dec 17 10:04:21 PST 2015
On Dec 17, 2015 9:10 AM, "Francisco Jerez" <currojerez at riseup.net> wrote:
>
> Jason Ekstrand <jason at jlekstrand.net> writes:
>
> > On Dec 14, 2015 6:24 AM, "Francisco Jerez" <currojerez at riseup.net>
wrote:
> >>
> >> Jason Ekstrand <jason at jlekstrand.net> writes:
> >>
> >> > On Dec 11, 2015 5:44 AM, "Francisco Jerez" <currojerez at riseup.net>
> > wrote:
> >> >>
> >> >> Jason Ekstrand <jason at jlekstrand.net> writes:
> >> >>
> >> >> > On Dec 10, 2015 6:58 AM, "Francisco Jerez" <currojerez at riseup.net>
> >> > wrote:
> >> >> >>
> >> >> >> Jason Ekstrand <jason at jlekstrand.net> writes:
> >> >> >>
> >> >> >> > We aren't using it anymore.
> >> >> >>
> >> >> >> It seems useful to me to be able to represent indirect access as
> > part
> >> > of
> >> >> >> any instruction source or destination register.
> >> >> >>
> >> >> >> The following:
> >> >> >>
> >> >> >> | mov_indirect g0, g1, a0
> >> >> >> | foo g2, g0
> >> >> >>
> >> >> >> and the converse case with indirect destination offset (which you
> > don't
> >> >> >> seem to represent currently) can be implemented by the hardware
more
> >> >> >> efficiently using a single instruction in certain cases. The
> > current
> >> > IR
> >> >> >> is able to represent what the hardware can do, but supporting the
> >> >> >> MOV_INDIRECT instruction only would force us to keep the
indirection
> >> >> >> separate from the instruction that uses it, so it seems like a
less
> >> >> >> expressive representation to me than the current approach, unless
> >> > you're
> >> >> >> willing to add _INDIRECT variants of most hardware opcodes.
> >> >> >
> >> >> > Yes and, mostly, no. Yes, you can put an indirect on almost
anything
> >> > but
> >> >> > it has substantial restrictions:
> >> >> >
> >> >> Yes, I'm aware of the restrictions of indirect addressing on Gen
> >> >> hardware. The fact that reladdr can represent addressing modes
which
> >> >> aren't directly implemented in the hardware doesn't invalidate the
> >> >> abstraction, nor implies that optimization and translation passes
must
> >> >> be made aware of such restrictions. It just means that our
abstraction
> >> >> is a superset of the hardware indirect addressing scheme, which is
fine
> >> >> given a lowering pass to convert indirect addressing into a legal
form.
> >> >>
> >> >> Your proposal instead goes the opposite direction and replaces the
> >> >> preexisting abstraction with a new abstraction which can only
represent
> >> >> a tiny subset of the functionality of hardware instructions, which I
> >> >> don't think is acceptable for a low-level IR.
> >> >>
> >> >> > 1) Destination indirects must be uniform (I'm only 95% sure this
is
> > the
> >> >> > case)
> >> >> >
> >> >> > 2) We only have 8 address subregisters on HSW and prior and 16 on
BDW
> >> > which
> >> >> > means:
> >> >> >
> >> >>
> >> >> That's the kind of thing that SIMD lowering would be able to take
care
> >> >> of easily.
> >> >
> >> > Yes, and we use it for MOV_INDIRECT.
> >> >
> >> Right, and there's no reason why it couldn't be used for reladdr in the
> >> same way to handle the above restrictions.
> >>
> >> >> > a) All indirects must be uniform OR
> >> >> >
> >> >> > b) We must be on Broadwell, have only two indirects, and split
> > the
> >> >> > instruction OR
> >> >> >
> >> >> > c) We can only have one indirect (and still split the
> > instruction on
> >> >> > HSW)
> >> >> >
> >> >> > So, yes, "everthing can be uniform", but not in real life.
> >> >>
> >> >> > The reladdr abstraction, while maybe ok for a high-level IR,
doesn't
> >> >> > accurately represent the hardware at all because it can't
represent
> >> >> > any of the restrictions without piles of helper functions and
checks.
> >> >>
> >> >> The reladdr abstraction can represent the full flexibility of the
> >> >> hardware, while MOV_INDIRECT cannot, that makes reladdr a more
accurate
> >> >> low-level representation of the program. For that reason I'd argue
the
> >> >> exact opposite: MOV_INDIRECT would be a great abstraction and a
welcome
> >> >> simplification for a high-level IR (e.g. NIR), in which nothing is
lost
> >> >> by sacrificing the expressiveness of individual instructions in
favour
> >> >> of instruction composition, because the back-end can always combine
> >> >> multiple high-level instructions into a single low-level instruction
> >> >> where the hardware ISA allows, as long as the low-level IR is able
to
> >> >> represent the full functionality of hardware instructions (IOW the
> >> >> mapping between low-level IR instructions and hardware instructions
is
> >> >> surjective), which is further from being the case after this series.
> >> >
> >> > Flexibility and expressiveness isn't always a good thing.
> >>
> >> Expressiveness is a good thing as long as it's required to represent
the
> >> capabilities of the hardware.
> >
> > As long as it's required to represent those capabilities that we care
> > about, yes. However, the hardware can do many things that either aren't
> > useful for us or take so much effort for the little benefit you gain,
that
> > it's not worth it. I'm arguing that saving a single MOV in the few
places
> > that we do in directs isn't worth the complexity.
> >
> >> > Yes, reladdr is more "expressive", but it's not getting used because
> >> > that expressiveness makes it a pain.
> >>
> >> AFAIK nobody has bothered to implement it completely down to the
> >> back-end generator so it's not a wonder that it's not extensively used.
> >> I'd like to see some evidence that it would actually be a pain
> >> (e.g. some proof of concept implementation) before we give up and
commit
> >> to an inherently more limited model. If you already have such a proof
> >> of concept, please share it. If you don't and are fully convinced that
> >> it will be a pain I suggest you hold up this series for a while and let
> >> me write the proof of concept -- Doing something with the expectation
> >> that it's going to be a pain is the best way to make sure it actually
> >> becomes a pain. Worst-case scenario you're right and it does become a
> >> pain, at which point I come back and review this series. Best-case
> >> scenario you're wrong and we don't give up useful functionality of the
> >> IR based on an expectation.
> >
> > Do I have a proof-of-concept in code, no. However, I've run through it
in
> > my head and I have a pretty good idea what it would look like.
>
> Ah, that's what I guessed. I don't think that your previous claim that
> supporting indirect addressing based on the current reladdr approach
> would require rewriting optimization passes to be recursive demonstrates
> a particularly clear notion of what it could look like.
That's fair. I'm sorry if that want particularly well communicated. I
don't always make a distinction between things in just throwing out there
avg things I've given substantial thought.
In the case of uniforms, I've been thinking about and poking at how to best
do it for a while. Thinking about relative addressing has been part of
that. Initially, I did try to use reladdr but I realized fairly quickly
that it was going to turn into a mess, so I went towards a different
solution.
> > You are free to go off and do it if you don't believe me, but I don't
> > really want to hold things up while you do. If you go off and
> > demonstrate that reladdr really is better, its not that hard to
> > revert.
> >
> Whether it will be hard to revert or not depends on how much of the code
> added in the meantime assumes that reladdr is no longer a thing, and you
> cannot give any guarantees about that. I don't have any obligation to
> spend time substantiating or refuting your assumption, and I wont if
> your plan is to make things unnecessarily difficult for me in the
> meantime.
You've seen how big this patch is. Almost all of the current code assumes
reladdr isn't a thing. Other than these patches, I doubt we'll add that
much more. That said, Ken had already used MOV_INDIRECT to do indirect
inputs for tess and maybe geometry.
--Jason
> > One other side-note: One of the substantial advantages to this series is
> > the removal of the uniform size arrays. If you do decide to make a
> > different prototype, please make that a major design point. When we get
> > uniforms in from SPIR-V they may come in std140 packed and with just an
> > offset and maybe range. The size arrays are implicitly tied to
variables.
> > I want them gone.
> >
> >> > You could say it is getting used, but it's current use is just an
> >> > intermediate to something less expressive than MOV_INDIRECT. In that
> >> > sense, we are expressing *more* with MOV_INDIRECT than we were before
> >> > since we are also expressing in directing on registers and not just
> >> > pull constants.
> >> >
> >> Yes, but MOV_INDIRECT is a dead end in the long-term, and the same
> >> functionality you implement here could have been implemented based on
> >> the preexisting model.
> >>
> >> > Also, while reladdr is "expressive", it doesn't actually represent
the
> >> > hardware. If we wanted that, we would expose the address register
file
> >> > (which I did consider doing). However, that comes with it's own set
of
> >> > issues.
> >> >
> >> It represents the hardware more accurately than MOV_INDIRECT in the
> >> following sense: It allows us to represent a wide subset of the
indirect
> >> addressing capabilities of the hardware down to the generator, on any
> >> instruction, source or destination we can possibly address indirectly.
> >> MOV_INDIRECT on the other hand allows us to represent a MOV instruction
> >> with indirectly-addressed source *only*.
> >>
> >> >> > The MOV_INDIRECT instruction, on the other hand, is something we
can
> >> >> > always do because it falls in the "one indirect" case above. The
> >> >> > reladdr abstraction also suffers from the recursive reladdr
problem
> >> >> > which MOV_INDIRECT neatly side-steps by design.
> >> >>
> >> >> I don't think the reladdr "problem" requires any fundamentally more
> >> >> difficult logic in optimization passes than what you need in order
to
> >> >> handle MOV_INDIRECT instructions reasonably.
> >> >
> >> > No, it just requires rewriting them because iterating over sources
just
> >> > became recursive.
> >> >
> >> That's quite an overstatement, most optimization passes only need to
> >> care about the top-level register of a source or destination, because
> >> wherever a reladdr is present it's not statically determined so you're
> >> unlikely to be able to optimize through it anyway regardless of whether
> >> you recurse or not. Cases where you actually want to iterate over each
> >> register used by an instruction are a minority and don't require
> >> fundamentally changing the optimization pass, the iteration can be
> >> factored out as in the straightforward for_each_use() helper I
> >> originally implemented to solve a different problem but which happens
to
> >> handle arbitrarily-nested reladdr as one would expect.
> >>
> >> >> > We can't even use reladdr past initial NIR -> FS stage because
none
> > of
> >> >> > the optimization passes know what to do with it (and no, I don't
want
> >> >> > to add reladdr support to any of them).
> >> >> >
> >> >> > Yes, removing reladdr makes the IR "less expressive", but we're
not
> >> > really
> >> >> > using that expressiveness nor is reladdr the abstraction we want.
> > If we
> >> >> > want to start taking more advantage of relative addressing in the
> >> > future,
> >> >> > we can come up with a better abstraction then.
> >> >> >
> >> >> > --Jason
> >> >> >
> >> >> >> > ---
> >> >> >> > src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +------
> >> >> >> > src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 +----
> >> >> >> > 2 files changed, 2 insertions(+), 10 deletions(-)
> >> >> >> >
> >> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> >> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> >> >> > index 7cc03c5..786c5fb 100644
> >> >> >> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> >> >> > @@ -433,7 +433,6 @@ fs_reg::fs_reg(struct ::brw_reg reg) :
> >> >> >> > {
> >> >> >> > this->reg_offset = 0;
> >> >> >> > this->subreg_offset = 0;
> >> >> >> > - this->reladdr = NULL;
> >> >> >> > this->stride = 1;
> >> >> >> > if (this->file == IMM &&
> >> >> >> > (this->type != BRW_REGISTER_TYPE_V &&
> >> >> >> > @@ -448,7 +447,6 @@ fs_reg::equals(const fs_reg &r) const
> >> >> >> > {
> >> >> >> > return (this->backend_reg::equals(r) &&
> >> >> >> > subreg_offset == r.subreg_offset &&
> >> >> >> > - !reladdr && !r.reladdr &&
> >> >> >> > stride == r.stride);
> >> >> >> > }
> >> >> >> >
> >> >> >> > @@ -4716,9 +4714,7 @@
> >> > fs_visitor::dump_instruction(backend_instruction
> >> >> > *be_inst, FILE *file)
> >> >> >> > break;
> >> >> >> > case UNIFORM:
> >> >> >> > fprintf(file, "u%d", inst->src[i].nr +
> >> >> > inst->src[i].reg_offset);
> >> >> >> > - if (inst->src[i].reladdr) {
> >> >> >> > - fprintf(file, "+reladdr");
> >> >> >> > - } else if (inst->src[i].subreg_offset) {
> >> >> >> > + if (inst->src[i].subreg_offset) {
> >> >> >> > fprintf(file, "+%d.%d", inst->src[i].reg_offset,
> >> >> >> > inst->src[i].subreg_offset);
> >> >> >> > }
> >> >> >> > @@ -4829,7 +4825,6 @@
> >> >> > fs_visitor::get_instruction_generating_reg(fs_inst *start,
> >> >> >> > {
> >> >> >> > if (end == start ||
> >> >> >> > end->is_partial_write() ||
> >> >> >> > - reg.reladdr ||
> >> >> >> > !reg.equals(end->dst)) {
> >> >> >> > return NULL;
> >> >> >> > } else {
> >> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
> >> >> > b/src/mesa/drivers/dri/i965/brw_ir_fs.h
> >> >> >> > index c3eec2e..e4f20f4 100644
> >> >> >> > --- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
> >> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
> >> >> >> > @@ -58,8 +58,6 @@ public:
> >> >> >> > */
> >> >> >> > int subreg_offset;
> >> >> >> >
> >> >> >> > - fs_reg *reladdr;
> >> >> >> > -
> >> >> >> > /** Register region horizontal stride */
> >> >> >> > uint8_t stride;
> >> >> >> > };
> >> >> >> > @@ -136,8 +134,7 @@ component(fs_reg reg, unsigned idx)
> >> >> >> > static inline bool
> >> >> >> > is_uniform(const fs_reg ®)
> >> >> >> > {
> >> >> >> > - return (reg.stride == 0 || reg.is_null()) &&
> >> >> >> > - (!reg.reladdr || is_uniform(*reg.reladdr));
> >> >> >> > + return (reg.stride == 0 || reg.is_null());
> >> >> >> > }
> >> >> >> >
> >> >> >> > /**
> >> >> >> > --
> >> >> >> > 2.5.0.400.gff86faf
> >> >> >> >
> >> >> >> > _______________________________________________
> >> >> >> > mesa-dev mailing list
> >> >> >> > mesa-dev at lists.freedesktop.org
> >> >> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151217/d090558d/attachment-0001.html>
More information about the mesa-dev
mailing list