[Mesa-dev] [PATCH] i965/fs: Extend the live ranges of VGRFs which leave loops
Jason Ekstrand
jason at jlekstrand.net
Tue Oct 10 20:10:01 UTC 2017
On Tue, Oct 10, 2017 at 9:16 AM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> I'm a little nervous about this, because really, the only solution to
> this problem is to ignore all non-WE_all definitions of all variables
> in liveness analysis. For example, in something like:
>
> vec4 color2 = ...
> if (...) {
> color2 = texture();
> }
>
> texture() can also overwrite inactive channels of color2. We happen to
> get this right because we turn live ranges into live intervals without
> holes, but I can't come up with a good reason why that would save us
> in all cases except the one in this patch -- which makes me worry that
> we'll find yet another case where there's a similar problem. I think
> it would be clearer if we what I said above, i.e. ignore all
> non-WE_all definitions, which will make things much worse, but then
> apply Curro's patch which will return things to pretty much how they
> were before, except this case will be fixed and maybe some other cases
> we haven't thought of.
>
What you're suggesting may actually be less code and is arguably better in
terms of being more straightforward. However, I think intervals plus this
patch is equivalent. Curro's patch + always-partial will cause us to start
the live range at the IP where it first *may* be defined and we keep the
behavior of ending the live range at the last IP where some reachable
instruction may use it. With my patch + Curro's, we start the live range
at the IP where it is first defined which will always all places it *may*
be defined unless there is a back edge. If there is a back-edge, I pull
the live range up across the back edge.
That said, I think I agree with you that my solution treats it as a special
case and not a general problem. However, I'm relucant to just change
liveness analysis to assume partial writes because we use it for more than
computing variable interference. In particular, we use it for dead code
elimination and the concept of partial/complete writes is crucial there.
I've got another patch cooking which I'll send out soon which should make
you happier with it.
--Jason
>
> On Thu, Oct 5, 2017 at 2:52 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> > No Shader-db changes.
> >
> > Cc: mesa-stable at lists.freedesktop.org
> > ---
> > src/intel/compiler/brw_fs_live_variables.cpp | 55
> ++++++++++++++++++++++++++++
> > 1 file changed, 55 insertions(+)
> >
> > diff --git a/src/intel/compiler/brw_fs_live_variables.cpp
> b/src/intel/compiler/brw_fs_live_variables.cpp
> > index c449672..380060d 100644
> > --- a/src/intel/compiler/brw_fs_live_variables.cpp
> > +++ b/src/intel/compiler/brw_fs_live_variables.cpp
> > @@ -223,6 +223,61 @@ fs_live_variables::compute_start_end()
> > }
> > }
> > }
> > +
> > + /* Due to the explicit way the SIMD data is handled on GEN, we need
> to be a
> > + * bit more careful with live ranges and loops. Consider the
> following
> > + * example:
> > + *
> > + * vec4 color2;
> > + * while (1) {
> > + * vec4 color = texture();
> > + * if (...) {
> > + * color2 = color * 2;
> > + * break;
> > + * }
> > + * }
> > + * gl_FragColor = color2;
> > + *
> > + * In this case, the definition of color2 dominates the use because
> the
> > + * loop only has the one exit. This means that the live range
> interval for
> > + * color2 goes from the statement in the if to it's use below the
> loop.
> > + * Now suppose that the texture operation has a header register that
> gets
> > + * assigned one of the registers used for color2. If the loop
> condition is
> > + * non-uniform and some of the threads will take the and others will
> > + * continue. In this case, the next pass through the loop, the
> WE_all
> > + * setup of the header register will stomp the disabled channels of
> color2
> > + * and corrupt the value.
> > + *
> > + * This same problem can occur if you have a mix of 64, 32, and
> 16-bit
> > + * registers because the channels do not line up or if you have a
> SIMD16
> > + * program and the first half of one value overlaps the second half
> of the
> > + * other.
> > + *
> > + * To solve this problem, we take any VGRFs whose live ranges cross
> the
> > + * while instruction of a loop and extend their live ranges to the
> top of
> > + * the loop. This more accurately models the hardware because the
> value in
> > + * the VGRF needs to be carried through subsequent loop iterations
> in order
> > + * to remain valid when we finally do break.
> > + */
> > + foreach_block (block, cfg) {
> > + if (block->end()->opcode != BRW_OPCODE_WHILE)
> > + continue;
> > +
> > + /* This is a WHILE instrution. Find the DO block. */
> > + bblock_t *do_block = NULL;
> > + foreach_list_typed(bblock_link, child_link, link,
> &block->children) {
> > + if (child_link->block->start_ip < block->end_ip) {
> > + assert(do_block == NULL);
> > + do_block = child_link->block;
> > + }
> > + }
> > + assert(do_block);
> > +
> > + for (int i = 0; i < num_vars; i++) {
> > + if (start[i] < block->end_ip && end[i] > block->end_ip)
> > + start[i] = MIN2(start[i], do_block->start_ip);
> > + }
> > + }
> > }
> >
> > fs_live_variables::fs_live_variables(fs_visitor *v, const cfg_t *cfg)
> > --
> > 2.5.0.400.gff86faf
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20171010/80a8bcad/attachment-0001.html>
More information about the mesa-dev
mailing list