[Mesa-dev] [PATCH 09/21] i965/fs: Rework representation of fragment output locations in NIR.

Tue Aug 2 17:55:16 UTC 2016

On Sunday, July 31, 2016 12:22:40 PM PDT Francisco Jerez wrote:
> Kenneth Graunke <kenneth at whitecape.org> writes:
> 
> > On Friday, July 22, 2016 8:59:03 PM PDT Francisco Jerez wrote:
> >> The problem with the current approach is that driver output locations
> >> are represented as a linear offset within the nir_outputs array, which
> >> makes it rather difficult for the back-end to figure out what color
> >> output and index some nir_intrinsic_load/store_output was meant for,
> >> because the offset of a given output within the nir_output array is
> >> dependent on the type and size of all previously allocated outputs.
> >> Instead this defines the driver location of an output to be the pair
> >> formed by its GLSL-assigned location and index (I've borrowed the
> >> bitfield macros from brw_defines.h in order to represent the pair of
> >> integers as a single scalar value that can be assigned to
> >> nir_variable_data::driver_location).  nir_assign_var_locations is no
> >> longer useful for fragment outputs.
> >> 
> >> Because fragment outputs are now allocated independently rather than
> >> within the nir_outputs array, the get_frag_output() helper becomes
> >> necessary in order to obtain the right temporary register for a given
> >> location-index pair.
> >> 
> >> The type_size helper passed to nir_lower_io is now type_size_dvec4
> >> rather than type_size_vec4_times_4 so that output array offsets are
> >> provided in terms of whole array elements rather than in terms of
> >> scalar components (dvec4 is the largest vector type supported by the
> >> GLSL so this will cause all individual fragment outputs to have a size
> >> of one regardless of the type).
> >
> > I don't think you should need type_size_dvec4 - double-precision
> > fragment shader outputs don't exist.  GL_ARB_gpu_shader_fp64 says:
> >
> >     (3) Should double-precision fragment shader outputs be supported?
> >
> >       RESOLVED:  Not in this extension.  Note that we don't have
> >       double-precision framebuffer formats to accept such values.
> >
> > Why not just use type_size_vec4?  With that changed, this patch would
> > get my Reviewed-by.
> >
> 
> Yeah, I'm aware of that restriction (c.f. PATCH 7), but the point here
> was to get output offsets counted in whole array elements regardless of
> the type, type_size_dvec4 is strictly better for that purpose than
> type_size_vec4, because dvec4 is the largest GLSL vector type, and they
> are otherwise equivalent when used on types smaller than a dvec4.  A
> type_size_vectors helper may make sense instead though.

Ah.  I was concerned about the inconsistency in this patch - you
use type_size_dvec4 for lowering, but allocate the registers with
type_size_vec4_times_4, which both count doubles differently.  It
works because there are no doubles, of course, but it seemed like
using type_size_vec4 for both would be more consistent.

Then again, in the next patch, you drop the type_size_vec4_times_4
call in favor of allocating VGRFs of size 4 directly.

I think I see what you mean, though.  I'd always thought about
FRAG_RESULT_DATA* as vec4-based.  But it's not, exactly...array
indexes correspond to render target indexes...so we want each
array element to be 1.  We could use a type size function that
counts 1 per array element, and ignores structs completely.  But
that's basically what type_size_dvec4 does, and it already exists.

I suppose in the presence of theoretical double outputs, we'd still
want to count by 1.  We'd just want to allocate registers of size 8
if the intrinsic's source's bit size is 64 instead of 32.

You win :)

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160802/e235bc51/attachment.sig>