[Bug 92760] Add FP64 support to the i965 shader backends

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sat Dec 5 08:45:29 PST 2015


--- Comment #25 from Iago Toral <itoral at igalia.com> ---
(In reply to Connor Abbott from comment #24)
> (In reply to Iago Toral from comment #23)
> > We spent some time working on fixing code that needs the execsize hack. For
> > now we decided to only fix the cases that are absolutely required, which
> > involve instructions with a dst.width of 4 (we don't currently generate
> > instructions on doubles with a width < 4). This makes the number of changes
> > required for gen7/8 pretty small (fixing this for widths < 4 would require a
> > huge amount of changes everywhere...). This keeps non-fp64 piglit tests on
> > gen7/8 and fp64 tests on gen8 happy with a small number of changes.
> > Basically with this we need to keep the exec_size hack as is for gen < 7 and
> > tweak it for gen7/8 so it only acts on instructions with a dst.width < 4.
> > 
> > I would not try to remove the execsize hack entirely (that is cover the
> > cases where dst.witdh < 4), because that would mean a really huge number of
> > things throughout the driver and across all hw gens.
> > 
> > We could try to avoid special-casing gen7/8 by fixing instructions with a
> > dst.width of 4 in previous gens too, but this will require a lot more
> > patches, so I am not sure that it is worth it. Specifically:
> > 
> > * In gen6 there are instances of instructions injected at the vec4_visitor
> > level that need the exec_size hack to set the execution size to 4. The
> > problem here is that vec4_instruction does not provide means to alter the
> > execution size of the instruction, so that support would have to be added
> > (or, alternatively, write generator opcodes that do that as needed). We
> > think fixing gen6 would involve at least 5 patches, so it is not too bad.
> In this case, I think the right thing to do would be to push the decision of
> whether to do exec_size 4 or 8 based on dst.width into the vec4 generator --
> essentially, move the hack up one level.
> > 
> > * In gen5 the problem is that the strip-and-fans and clipping modules are
> > full with code that mixes width4 and width8 instructions, so it requires to
> > tweak execsizes in a fair amount of places. I have patches to do this that
> > in fact change the default execution size for the strip and fans to 4 and
> > fix the cases that need 8, since that seemed more reasonable to limit the
> > number of necessary changes. This involves something like ~10 patches, but
> > the changes look a bit scarier to me (piglit seems to be happy though).
> > 
> > * We don't have gen4 or gen9 hardware available, so we would need help
> > fixing these paths. I would expect a few more patches required for gen4 at
> > least.
> > 
> > So the summary is that if we don't want to special-case the execsize for
> > gen7/8, we will need a significant number of changes, help from Intel to
> > track down issues in gen4 and gen9 and cross fingers that piglit actually
> > exercises all the paths we need to tweak, so all things considered I wonder
> > if we should just special case the execsize hack for gen7/8 for now and
> > forget about other gens for now, since we are not planning fp64 for them yet
> > anyway.
> > 
> > What do you think?
> Ugh, I didn't realize we were using this so much... nice job looking into
> it! Indeed, it seems like special-casing this for gen7/gen8 is the easiest
> thing that at least gets us moving in the right direction. I would like to
> get others' opinions on this, though. I think you should send out an RFC
> that does the minimum necessary to solve the problem (i.e. only fix the
> width-4 case on gen7/8), explaining the problem and what you came across
> when trying to solve it (basically this comment).

Sure, I'll do that.

> I was thinking about this, and there is another way to fix the problem.
> Since we don't actually use vstride or width in the destination, we could
> avoid dividing the width and vstride by 2 for destination registers.
> However, I don't think we want to go in this direction, since this is a step
> toward conflating dst.width and exec_size when, at least in FS, there is
> already a clearly-defined notion of what the exec_size should be that's part
> of the instruction. Jason has already done a lot of work towards not
> inferring exec_size from dst.width (although for somewhat different reasons,
> and at a higher level), and we should try to get rid of the assumption at a
> lower level too.

Yeah, I agree. 

Thanks Connor!

You are receiving this mail because:
You are the QA Contact for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/intel-3d-bugs/attachments/20151205/61b40f08/attachment.html>

More information about the intel-3d-bugs mailing list