[Mesa-dev] [PATCH 00/41] Welcome back Matt!
Jason Ekstrand
jason at jlekstrand.net
Tue Sep 23 12:06:49 PDT 2014
On Tue, Sep 23, 2014 at 11:43 AM, Matt Turner <mattst88 at gmail.com> wrote:
> On Sat, Sep 20, 2014 at 10:22 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> > This series does a bunch of refactoring of the i965 fs backend IR to add
> > concepts of register width and instruction execution size. There's more
> > to be done yet, but this gets us most of the way there. It also removes
> > the assumption that scalar values are always 1 register in SIMD8 and 2
> > registers in SIMD16. In particular, we get the following:
> >
> > 1) No more assumption about everything being 1 register. This allows us
> > to allocate odd numbers of registers in SIMD16, which is needed for
> > some payloads. Also, it should make implementing fp64 much easier
> > because we can now sanely allocate registers of size 2 in SIMD8 and
> > size 4 in SIMD16. There's a little more work to be done there, but this
> > should take care of a lot of it.
> >
> > 2) We can now do other instruction widths with relative ease. The
> > compiler now detects, based on register widths, the execution size of
> > the instruction and passes it down to the generator. One example of
> > this is the patches in this series for UNTYPED_ATOMIC and
> > UNTYPED_SURFACE_READ, where part of setting up the payload is to do an
> > 8-wide move to fill a register with 0 and then a 1-wide move to set one
> > particular component. We can now simply do this at the fs level and it
> > will get translated down to the correct assembly and properly handled
> > by the compiler optimizations. There is more work to be done here at
> > the generator level, but this series is already long enough.
> > 3) Thanks to the above-mentioned things, we can easily do send from GRF
> > for FB writes. One of the major blockers here before was that the
> > beginning of the FB write message was anywhere between 0 and 4
> > registers regardless of whether you are in SIMD8 or SIMD16. Due to the
> > implicit register doubling in SIMD16, it would have been a real pain to
> > implement this properly. Now, it's trivial.
> >
> > I could go on about other changes, but those are the major ones.
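To make the register-size point in (1) concrete, here is a minimal sketch of
the arithmetic, assuming a 32-byte (256-bit) GRF. The helper name
regs_needed() and the constant GRF_BYTES are hypothetical and are not taken
from the patches; the real backend tracks this on its register types.

```cpp
// Hypothetical illustration only; none of these names come from Mesa.
// Assumption: one GRF holds 32 bytes (256 bits).
constexpr unsigned GRF_BYTES = 32;

// Number of whole GRFs one scalar value occupies when it is replicated
// across `simd_width` channels, each `type_bytes` wide. Rounds up, so a
// partial register still costs a full one.
constexpr unsigned regs_needed(unsigned simd_width, unsigned type_bytes)
{
    return (simd_width * type_bytes + GRF_BYTES - 1) / GRF_BYTES;
}

// 32-bit values: the old implicit "1 in SIMD8, 2 in SIMD16" rule.
static_assert(regs_needed(8, 4) == 1, "float, SIMD8");
static_assert(regs_needed(16, 4) == 2, "float, SIMD16");

// 64-bit values: the "size 2 in SIMD8 and size 4 in SIMD16" case.
static_assert(regs_needed(8, 8) == 2, "fp64, SIMD8");
static_assert(regs_needed(16, 8) == 4, "fp64, SIMD16");
```

Once the width is carried on the register rather than implied by the dispatch
mode, sizes like these fall out of one formula instead of a special case per
SIMD mode.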
>
> This all sounds great, Jason. I'm really happy with how you split the
> patches out. It made reviewing this amazingly easy.
>
> I've made a relatively quick review, and it looks good to me. (I'm
> operating under the assumption that what you said below about piglit
> passing is indicative of there not being bugs :)
Heh... If only that were true... I seem to have killed ILK and I'm not
sure why. I'm going to look into that today and see if I can get it
patched up.
> I've sent a bunch of
> smallish comments. Nothing major. I guess patch 41 is in flux though
> based on Connor's review.
>
I think Ken's comment demonstrates that it's not an issue. I'm convinced
at any rate. Not sure if I can count on Connor getting back to me right
away either. :-)
> I suppose the next steps are to sort out the preliminary 12 patch
> series (which I think you said you were going to revise?). I think the
> ball's in your court there.
>
Yeah, I need to come up with something better for compact_virtual_grfs and
I think there was another thing or two to clean up. Also, the
split_virtual_grfs patch still needs a good review. You looked at it and
gave me a non-comment.
>
> So, you can apply my R-b to the first 40 patches with comments
> addressed. When this series goes in, I think we should preserve the
> commit info of the squash patches (i.e., actually squash instead of
> fixup), like in commit 79d77b38.
>
> > The requisite Shader DB results:
> >
> > total instructions in shared programs: 4999994 -> 4971746 (-0.56%)
> > instructions in affected programs: 959392 -> 931144 (-2.94%)
> > GAINED: 138
> > LOST: 71
>
> Any ideas about the lost programs?
>
I think some (maybe 12 or so) are because of an optimization pass that's
failing (there were a few programs that grew by 1 or 2 instructions). The
other 60 or so seem to be because freedom in register allocation isn't
always a good thing. Prior to doing send-from-GRF for FB writes and this
last rebase, I had only 4 programs in all of shader-db that had any
difference in the number of instructions, so I think I was generating
basically identical programs except for register allocation. However, I
gained some programs and lost about 65. I think this is a case of those
programs being right on the edge of what our allocator could do and
tweaking things pushed them over the edge. :-(
--Jason