[Bug 92760] Add FP64 support to the i965 shader backends

Fri Nov 6 06:31:54 PST 2015

https://bugs.freedesktop.org/show_bug.cgi?id=92760

--- Comment #7 from Connor Abbott <cwabbott0 at gmail.com> ---
(In reply to Iago Toral from comment #5)
> (In reply to Connor Abbott from comment #4)
> > I've pushed my latest changes to the i965-fp64-v3 branch. It's based on the
> > patch series I sent out a little while ago to make glsl-to-nir use the
> > visitor. It's not cleaned-up, but it's got a decent amount of things working.
> 
> Thanks Connor! Our idea was to start by going through all the commits in
> this branch to get familiar with the current state of things. Right now I
> think we need to understand better how the hardware deals with doubles
> before pursuing anything specific.
> 
> Also, since we are going to go through all your patches I think we should
> also take that opportunity to clean up the series and make it more
> manageable, or were you planning to work on further clean-ups / fixes
> yourself in the next few days?

No, go ahead... I'm going to be quite busy until next week, so I won't be able
to do much.

> 
> > Sam and Iago: for the most part, the core NIR changes are pretty stable. The
> > one thing that Jason mentioned needed to be done there (the nir_algebraic
> > bitwidth verifier) is something that would be best for Jason or me to do,
> > since we've discussed it extensively in-person and know what it needs to
> > look like. It's also not essential to get it working (it's basically just a
> > tool to validate at compile-time that certain asserts/validator failures
> > won't happen when rewriting expressions with different bit-widths). What do
> > you want to do there?
> 
> Yeah, that's fine by me. If you know exactly what you want to do there it
> probably makes more sense that you take it on. As you say, it is not
> something we need in order to make progress on other things anyway.
> 
> > Now, I think the main thing remaining to get the existing tests to pass
> > (besides implementing ftrunc and friends) is to make our handling of
> > splitting dvec3's and dvec4's into dvec2's and doubles more robust. Right
> > now, it mostly works, but falls over whenever we hit a case where we need to
> > create a vec3. There are two solutions I can see for this:
> 
> Ok. My understanding is that this issue is specific to the vec4 backend
> only, right?

Yes, that's correct.

> 
> > 1. Special-case lower_vec_to_movs to skip over vec3/vec4 with doubles, and
> > handle vec3/vec4 specially in the i965 vec4 backend.
> > 
> > 2. Add the ability to handle math on dvec3's and dvec4's, but with a
> > writemask that's never larger than 2 (i.e. we never have to emit more than
> > one thing). lower_vec_to_movs may coalesce movs away, so we would have to
> > handle this in the backend in order to handle what NIR currently gives us.
> 
> So if I understand #2 right you mean, for example, splitting a vec4 addition
> in NIR into 4 additions that operate on individual components of the vec4
> operands each (and I guess write the results into different regs as well).

Not quite. The vec4 backend needs to split things into vec2's. As I explained,
I wrote a NIR pass that does that for us, but we're possibly still left with
things like

dvec4 foo;
foo.xz = bar.xz + baz.zw;

and I meant that we could just make the vec4 backend deal with something like
that directly, by moving the sources to temporary dvec2's, doing the operation,
and then emitting two mov's to the upper and lower parts of the destination.
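
To make that concrete, here is a rough sketch in plain C that models the
sequence for the example above. It is only an illustration and not code from
the branch: the dvec2/dvec4 structs and the gather2, add2 and add_masked
helpers are hypothetical names, and it assumes the widest double operation
available is dvec2-wide.

```c
/* Hypothetical types for illustration only; not Mesa's actual IR. */
typedef struct { double c[4]; } dvec4;
typedef struct { double c[2]; } dvec2;

/* Copy two arbitrary components of a dvec4 into a temporary dvec2,
 * standing in for a swizzled MOV into a temporary register. */
static dvec2 gather2(const dvec4 *src, int s0, int s1)
{
    dvec2 t = { { src->c[s0], src->c[s1] } };
    return t;
}

/* A single dvec2-wide add (the assumed widest double operation). */
static dvec2 add2(dvec2 a, dvec2 b)
{
    dvec2 r = { { a.c[0] + b.c[0], a.c[1] + b.c[1] } };
    return r;
}

/* dst.<d0,d1> = bar.<a0,a1> + baz.<b0,b1>
 * Components of dst outside the two-component writemask are untouched. */
static void add_masked(dvec4 *dst, int d0, int d1,
                       const dvec4 *bar, int a0, int a1,
                       const dvec4 *baz, int b0, int b1)
{
    dvec2 ta = gather2(bar, a0, a1);  /* move first source to a temp dvec2  */
    dvec2 tb = gather2(baz, b0, b1);  /* move second source to a temp dvec2 */
    dvec2 r  = add2(ta, tb);          /* do the operation once, dvec2-wide  */
    dst->c[d0] = r.c[0];              /* first mov into the destination     */
    dst->c[d1] = r.c[1];              /* second mov into the destination    */
}
```

With component indices x=0, y=1, z=2, w=3, the example foo.xz = bar.xz + baz.zw
corresponds to add_masked(&foo, 0, 2, &bar, 0, 2, &baz, 2, 3).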

> 
> > #1 seems like a better solution to me, but unfortunately we'd have to modify
> > lower_vec_to_movs in a somewhat ugly, i965-specific way.
> 
> I suppose that if we want to go with #1 we could make that behavior
> configurable so that NIR users can decide if that suits them or not.
> 
> > There's another problem with the vec4 backend and exec_masks, but I want to
> > send another email to see what other people who know about i965 have to say.
> 
> Yeah, I saw that in the mailing list. Thanks for starting that discussion.

Yeah. curro discovered some really weird fp64 bugs with writemasking in Align16
mode, which make it seem like the best solution might be to give up and
scalarize 64-bit things instead.
