[Bug 92760] Add FP64 support to the i965 shader backends

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Wed Nov 4 00:32:19 PST 2015


https://bugs.freedesktop.org/show_bug.cgi?id=92760

--- Comment #5 from Iago Toral <itoral at igalia.com> ---
(In reply to Connor Abbott from comment #4)
> I've pushed my latest changes to the i965-fp64-v3 branch. It's based on the
> patch series I sent out a little while ago to make glsl-to-nir use the
> visitor. It's not cleaned-up, but it's got a decent amount of things working.

Thanks Connor! Our idea was to start by going through all the commits in this
branch to get familiar with the current state of things. Right now I think we
need to understand better how the hardware deals with doubles before pursuing
anything specific.

Also, since we are going to go through all your patches, I think we should
take that opportunity to clean up the series and make it more manageable. Or
were you planning to work on further clean-ups / fixes yourself in the next
few days?

> Sam and Iago: for the most part, the core NIR changes are pretty stable. The
> one thing that Jason mentioned needed to be done there (the nir_algebraic
> bitwidth verifier) is something that would be best for Jason or me to do,
> since we've discussed it extensively in-person and know what it needs to
> look like. It's also not essential to get it working (it's basically just a
> tool to validate at compile-time that certain asserts/validator failures
> won't happen when rewriting expressions with different bit-widths). What do
> you want to do there?

Yeah, that's fine by me. If you know exactly what you want to do there, it
probably makes more sense for you to take it on. As you say, it is not something
we need in order to make progress on other things anyway.

> Now, I think the main thing remaining to get the existing tests to pass
> (besides implementing ftrunc and friends) is to make our handling of
> splitting dvec3's and dvec4's into dvec2's and doubles more robust. Right
> now, it mostly works, but falls over whenever we hit a case where we need to
> create a vec3. There are two solutions I can see for this:

Ok. My understanding is that this issue is specific to the vec4 backend only,
right?

> 1. Special-case lower_vec_to_movs to skip over vec3/vec4 with doubles, and
> handle vec3/vec4 specially in the i965 vec4 backend.
> 
> 2. Add the ability to handle math on dvec3's and dvec4's, but with a
> writemask that's never larger than 2 (i.e. we never have to emit more than
> one thing). lower_vec_to_movs may coalesce movs away, so we would have to
> handle this in the backend in order to handle what NIR currently gives us.

So if I understand #2 right, you mean, for example, splitting a vec4 addition
in NIR into 4 additions, each operating on a single component of the vec4
operands (and, I guess, writing the results into different regs as well).
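
To make that reading concrete, here is a rough standalone sketch in Python. It
is not Mesa or NIR code, and every name in it (split_writemask, masked_add,
lowered_dvec4_add) is hypothetical. It only models the idea of replacing one
dvec4 add with several adds whose writemasks cover a limited number of
components, assuming bit i of the mask selects component i.

```python
# Standalone illustration only; none of these names exist in Mesa or NIR.

def split_writemask(mask, max_components=2):
    """Split a 4-bit writemask (bit i = component i) into chunks that
    each enable at most max_components components."""
    chunks = []
    current = 0
    count = 0
    for i in range(4):
        if mask & (1 << i):
            current |= 1 << i
            count += 1
            if count == max_components:
                chunks.append(current)
                current, count = 0, 0
    if current:
        chunks.append(current)
    return chunks


def masked_add(dst, a, b, mask):
    """Model one add that writes only the components enabled in mask."""
    return [a[i] + b[i] if mask & (1 << i) else dst[i] for i in range(4)]


def lowered_dvec4_add(a, b, mask=0b1111, max_components=2):
    """Perform a dvec4 add as a series of adds with narrow writemasks."""
    dst = [0.0] * 4
    for chunk in split_writemask(mask, max_components):
        dst = masked_add(dst, a, b, chunk)
    return dst
```

With max_components=1 this gives the four per-component additions described
above; with max_components=2 it gives two additions, which is closer to the
"writemask that's never larger than 2" wording in the quoted proposal.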

> #1 seems like a better solution to me, but unfortunately we'd have to modify
> lower_vec_to_movs in a somewhat ugly, i965-specific way.

I suppose that if we want to go with #1 we could make that behavior
configurable, so that NIR users can decide whether it suits them or not.

> There's another problem with the vec4 backend and exec_masks, but I want to
> send another email to see what other people who know about i965 have to say.

Yeah, I saw that on the mailing list. Thanks for starting that discussion.
