[Mesa-dev] [PATCH 00/41] Welcome back Matt!

Tue Sep 23 12:06:49 PDT 2014

On Tue, Sep 23, 2014 at 11:43 AM, Matt Turner <mattst88 at gmail.com> wrote:

> On Sat, Sep 20, 2014 at 10:22 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> > This series does a bunch of refactoring of the i965 fs backend IR to add
> > concepts of register width and instruction execution size.  There's more
> > to be done yet, but this gets us most of the way there.  It also removes
> > the assumption that scalar values are always 1 register in SIMD8 and 2
> > registers in SIMD16.  In particular, we get the following:
> >
> >  1) No more assumption about everything being 1 register.  This allows us
> >     to allocate odd numbers of registers in SIMD16 which is needed for
> >     some payloads.  Also, it should make implementing fp64 much easier
> >     because we can now sanely handle registers of size 2 in SIMD8 and size
> >     4 in SIMD16.  There's a little more work to be done there, but this
> >     should take care of a lot of it.
> >
> >  2) We can now do other instruction widths with relative ease.  The
> >     compiler now detects, based on register widths, the execution size of
> >     the instruction and passes it down to the generator.  One example of
> >     this is the patches in this series for UNTYPED_ATOMIC and
> >     UNTYPED_SURFACE_READ where part of setting up the payload is to do an
> >     8-wide move to fill a register with 0 and then a 1-wide move to set
> >     one particular component.  We can now simply do this at the fs level
> >     and it will get translated down to the correct assembly and properly
> >     handled by the compiler optimizations.  There is more work to be done
> >     here at the generator level, but this series is already long enough.
> >
> >  3) Thanks to the above-mentioned things, we can easily do send from GRF
> >     for FB writes.  One of the major blockers here before was that the
> >     beginning of the FB write message was anywhere between 0 and 4
> >     registers regardless of whether you are in SIMD8 or SIMD16.  Due to
> >     the implicit register doubling in SIMD16, it would have been a real
> >     pain to implement this properly.  Now, it's trivial.
> >
> > I could go on about other changes, but those are the major ones.
>
> This all sounds great, Jason. I'm really happy with how you split the
> patches out. It made reviewing this amazingly easy.
>
> I've made a relatively quick review, and it looks good to me. (I'm
> operating under the assumption that what you said below about piglit
> passing is indicative of there not being bugs :)

Heh... If only that were true...  I seem to have killed ILK and I'm not
sure why.  I'm going to look into that today and see if I can get it
patched up.

> I've sent a bunch of
> smallish comments. Nothing major. I guess patch 41 is in flux though
> based on Connor's review.
>

I think Ken's comment demonstrates that it's not an issue.  I'm convinced
at any rate.  Not sure if I can count on Connor getting back to me right
away either. :-)

> I suppose the next steps are to sort out the preliminary 12 patch
> series (which I think you said you were going to revise?). I think the
> ball's in your court there.
>

Yeah, I need to come up with something better for compact_virtual_grfs and
I think there was another thing or two to clean up.  Also, the
split_virtual_grfs patch still needs a good review.  You looked at it and
gave me a non-comment.

>
> So, you can apply my R-b to the first 40 patches with comments
> addressed. When this series goes in, I think we should preserve the
> commit info of the squash patches (i.e., actually squash instead of
> fixup), like in commit 79d77b38.
>
> > The requisite Shader DB results:
> >
> > total instructions in shared programs: 4999994 -> 4971746 (-0.56%)
> > instructions in affected programs:     959392 -> 931144 (-2.94%)
> > GAINED:                                138
> > LOST:                                  71
>
> Any ideas about the lost programs?
>

I think some (maybe 12 or so) are because of an optimization pass that's
failing (there were a few programs that grew by 1 or 2 instructions).  The
other 60 or so seem to be because freedom in register allocation isn't
always a good thing.  Prior to doing send-from-GRF for FB writes and this
last rebase, I had only 4 programs in all of shader-db that had any
difference in the number of instructions, so I think I was generating
basically identical programs except for register allocation.  However, I
gained some programs and lost about 65.  I think this is a case of those
programs being right on the edge of what our allocator could do and
tweaking things pushed them over the edge. :-(
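
For anyone checking the numbers: the percentages in the shader-db summary
quoted above follow directly from the raw instruction counts.  A minimal
sketch in Python (the helper name pct_change is made up for illustration
and is not part of shader-db):

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Instruction counts copied from the shader-db summary quoted above.
total = pct_change(4999994, 4971746)
affected = pct_change(959392, 931144)

print(f"total:    {total:.2f}%")
print(f"affected: {affected:.2f}%")
```

Both lines shrink by the same 28248 instructions, which is what you would
expect since only the affected programs changed.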

--Jason