<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez <span dir="ltr"><<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br> <br> > I'm still a little pensive. But<br> ><br> > Reviewed-by: Jason Ekstrand <<a href="mailto:jason.ekstrand@intel.com">jason.ekstrand@intel.com</a>><br> ><br> </span>Thanks.<br> <span class=""><br> > Now for a little aside. I have come to the conclusion that I made a grave<br> > mistake when I did the LOAD_PAYLOAD stuff. In retrospect, I should have<br> > just subclassed fs_inst for load_payload. The problem is that we need to<br> > snag a bunch of information for the sources when we create the<br> > load_payload. In particular, we need to know the width of the source so<br> > that we know how much space it consumes in the payload and we need to know<br> > the information required to properly re-create the mov such as<br> > force_sechalf and force_writemask_all. Really, in order to do things<br> > properly, we need to gather this information *before* we do any<br> > optimizations. The nasty pile of code that you're editing together with<br> > the "effective_width" parameter is a lame attempt to capture/reconstruct<br> > this information. Really, we should just subclass, capture the information<br> > up-front, and do it properly.<br> ><br> </span>Yes, absolutely, this lowering pass is a real mess. There are four more<br> unreviewed patches in the mailing list fixing bugs of the metadata<br> guessing of lower_load_payload() [1][2][3][4], you may want to take a<br> look at them. There are more bugs I'm aware of but it didn't seem<br> practical to fix them.<br></blockquote><div><br></div><div>Yeah, Matt pointed me at those. I'll give them a look later today.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> That said, I don't think it would be worth subclassing fs_inst.<br> <br> My suggestion would have been to keep it simple and lower LOAD_PAYLOAD<br> into a series of MOVs with force_writemask_all enabled in all cases,<br> simply rely on the optimizer to eliminate those where possible. Then<br> get rid of the metadata and effective_width guessing. Instead agree on<br> immediates and uniforms being exec_size-wide by convention<br> (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),<br> then prevent constant propagation from propagating an immediate load<br> into a LOAD_PAYLOAD if it would lead to a change in the semantics of the<br> program, and maybe just run constant propagation after lowering so we<br> can be sure those cases are cleaned up properly where register coalesce<br> isn't enough.<br></blockquote><div><br></div><div>First off, simply setting force_writemask_all isn't an option. Consider the following scenario:<br><br></div><div>a = foo;<br>if (bar) {<br></div><div> b = 7;<br></div><div>} else {<br></div><div> use(a);<br>}<br></div><div>use(b);<br></div><div><br></div><div>If "use(a)" is the last use of the variable a, then the live ranges of a and b don't actually over-lap and we can assign a and b to the same register. However, if force_writemask_all is set on b, it will destroy the value in a before its last use. Right now, we don't actually do this because our liveness analysis pass flattens everything so it will think that b and a over-lap even though they don't. However, if we ever decide to make the liveness information more accurate, having a pile of force_writemask_all instructions will be a major problem. So, while it is *technically* safe for now, it's a really bad idea in the long-term.<br><br></div><div>Regarding the other suggestion of just requiring width == exec_size for immediates and uniforms, that seems pretty reasonable to me. I'd like to know what it will do to optimizations, but it seems ok initially. I'm still a bigger fan of just subclassing and stashing everything we need to know up-front. If we do it right, the only things that will actually need to know about the subclass are the function for creating a LOAD_PAYLOAD and lower_load_payloads.<br><br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br> [1] <a href="http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html" target="_blank">http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html</a><br> [2] <a href="http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html" target="_blank">http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html</a><br> [3] <a href="http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html" target="_blank">http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html</a><br> [4] <a href="http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html" target="_blank">http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html</a><br> <div class="HOEnZb"><div class="h5"><br> <br> > --Jason<br> ><br> > On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br> > wrote:<br> ><br> >><br> >><br> >> On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez <<a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>><br> >> wrote:<br> >><br> >>> Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br> >>><br> >>> > On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez <<br> >>> <a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>><br> >>> > wrote:<br> >>> ><br> >>> >> Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br> >>> >><br> >>> >> > On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez <<br> >>> <a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>><br> >>> >> > wrote:<br> >>> >> ><br> >>> >> >> Hey Matt,<br> >>> >> >><br> >>> >> >> Matt Turner <<a href="mailto:mattst88@gmail.com">mattst88@gmail.com</a>> writes:<br> >>> >> >><br> >>> >> >> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <<br> >>> >> <a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>><br> >>> >> >> wrote:<br> >>> >> >> >> MRFs cannot be read from anyway so they cannot possibly be a<br> >>> valid<br> >>> >> >> >> source of LOAD_PAYLOAD.<br> >>> >> >> >> ---<br> >>> >> >> ><br> >>> >> >> > The function only seems to test inst->dst.file == MRF. I don't<br> >>> see any<br> >>> >> >> > code for handling MRF sources. What am I missing?<br> >>> >> >><br> >>> >> >> That test is for "handling" MRF sources -- More precisely, it's<br> >>> >> >> collecting the writemask and half flags for MRF writes, which can<br> >>> only<br> >>> >> >> possibly be useful if we're going to use them later on to read<br> >>> something<br> >>> >> >> out of an MRF into a payload, which we shouldn't be doing in the<br> >>> first<br> >>> >> >> place.<br> >>> >> >><br> >>> >> >> Aside from simplifying the function somewhat, that allows us to<br> >>> drop the<br> >>> >> >> 16 register gap reserved for MRFs at register offset zero, what will<br> >>> >> >> allow us to drop the vgrf_to_reg[] offset calculation completely<br> >>> (also<br> >>> >> >> in split_virtual_grfs()) in a future patch (not sent for review<br> >>> yet).<br> >>> >> >><br> >>> >> ><br> >>> >> > No, we do read from MRF's sort-of... Send messages have an implicit<br> >>> >> "read"<br> >>> >> > from an MRF.<br> >>> >><br> >>> >> Heh, and that's pretty much the only way you "read" from it.<br> >>> >><br> >>> >> > This was written precicely so that we could use LOAD_PAYLOAD<br> >>> >> > to build MRF payloads. We do on pre-GEN6.<br> >>> >> ><br> >>> >> I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD<br> >>> >> *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal<br> >>> >> anyway.<br> >>> >><br> >>> ><br> >>> > And no one is using it that way. All of the metadata checks you are<br> >>> > deleting are checks on the *destination*.<br> >>> ><br> >>><br> >>> Didn't you write this code yourself? The only use for the collected<br> >>> metadata is initializing the instruction flags of the MOVs subsequent<br> >>> LOAD_PAYLOAD instructions are lowered to, based on the metadata already<br> >>> collected for its source registers, which can never be MRFs, so the<br> >>> metadata you collect from MRF writes is never actually used.<br> >>><br> >><br> >> Right... I misred something initially. Yes, we should never be tracking<br> >> MRF's as a source of a LOAD_PAYLOAD. I'll give it a better look a bit<br> >> later, but it looks better.<br> >><br> </div></div></blockquote></div><br></div></div>