[Mesa-dev] [RFC 2/2] i965: add support for image AoA
Timothy Arceri
t_arceri at yahoo.com.au
Mon Aug 17 05:40:32 PDT 2015
On Sun, 2015-08-16 at 13:15 +0300, Francisco Jerez wrote:
> Timothy Arceri <t_arceri at yahoo.com.au> writes:
>
> > On Sat, 2015-08-15 at 17:33 +0300, Francisco Jerez wrote:
> > > Timothy Arceri <t_arceri at yahoo.com.au> writes:
> > >
> > > > On Wed, 2015-08-12 at 19:39 +1000, Timothy Arceri wrote:
> > > > > Cc: Francisco Jerez <currojerez at riseup.net>
> > > > > ---
> > > > > This patch works for direct indexing of images, but I'm having some
> > > > > trouble
> > > > > getting indirect to work.
> > > > >
> > > > > For example for:
> > > > > tests/spec/arb_arrays_of_arrays/execution/image_store/basic
> > > > > -imageStore
> > > > > -non-const-uniform-index.shader_test
> > > > >
> > > > > Which has and image writeonly uniform image2D tex[2][2]
> > > > >
> > > > > Indirect indexing will work for tex[0][0] and text[0][1] but not
> > > > > for
> > > > > tex[1][0] and tex[1][1] they seem to always end up refering to the
> > > > > image in 0.
> > > >
> > > > Just to add some more to this, I'm pretty sure my code is generating
> > > > the
> > > > correct offsets. If I hardcode the img_offset offset to 72 to get the
> > > > uniform
> > > > value of tex[1][0] I get the value I expected, but if I set
> > > > image.reladdr
> > > > to a
> > > > register that contains 72 I don't what I expect.
> > > >
> > > > If I change the array to be a single dimension e.g. tex[4] and I
> > > > hardcode
> > > > the
> > > > offset as described above then it works as expected for both
> > > > scenarios, it
> > > > also works if I split the offset across img_offset and image.reladdr,
> > > > there is
> > > > something going on with image.reladdr for multi-dimensional arrays
> > > > that I
> > > > can'
> > > > t quite put my finger on.
> > > >
> > > > Any hints appreciated.
> > > >
> > > Odd, can you attach an assembly dump?
> > >
> > > Thanks.
> >
> > I wasn't sure what would be the most helpful so I've attached a few
> > different
> > dumps.
> >
> > image_dump = 1D array indirect piglit test, without this patch
> > (Result=pass)
> > image_dump2 = 2D array indirect piglit test, with this patch (Result=fail)
> > image_dump3 = 1D array indirect piglit test, with this patch (Result=pass)
> >
> > image_dump4 = 1D array indirect piglit test, hardcoded register with 72
> > offset
> > (Result=pass)
> > image_dump5 = 2D array indirect piglit test, hardcoded register with 72
> > offset
> > (Result=fail)
> >
> > image_dump4 vs image_dump5 is interesting because the output matches which
> > is
> > what I would have expected, but the result differs. Then with the offset
> > below
> > it seems to work as expected suggesting everything else is setup
> > correctly.
> >
> > image_dump6 = 1D array indirect piglit test, hardcoded 72 offset in
> > img_offset
> > (Result=pass)
> > image_dump7 = 2D array indirect piglit test, hardcoded 72 offset in
> > img_offset
> > (Result=pass)
> >
>
> Yeah... The assembly output looks correct to me in the cases that fail.
> AFAICT what all the failing cases have in common is that the array of
> image uniforms is demoted to pull constants (see demote_pull_constants()
> and move_uniform_array_access_to_pull_constants()),
> so most likely
> something goes wrong while doing that for multidimensional arrays. I
> have the suspicion that this is the source of the problem:
>
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -226,6 +226,7 @@ fs_visitor::nir_setup_uniform(nir_variable *var)
> > * our name.
> > */
> > unsigned index = var->data.driver_location;
> > + bool set_image_location = true;
> > for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
> > struct gl_uniform_storage *storage = &shader_prog
> > ->UniformStorage[u];
> >
> > @@ -244,7 +245,13 @@ fs_visitor::nir_setup_uniform(nir_variable *var)
> > * because their size is driver-specific, so we need to allocate
> > * space for them here at the end of the parameter array.
> > */
> > - var->data.driver_location = uniforms;
> > + if (set_image_location) {
> > + /* For arrays of arrays we only want to set this once at the
> > base
> > + * location.
> > + */
> > + var->data.driver_location = uniforms;
> > + set_image_location = false;
> > + }
> > param_size[uniforms] =
> > BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
>
> The assumption that this code was previously making was that a given
> image array would be stored in a single gl_uniform_storage entry,
> otherwise param_size[var->data.driver_location] won't match the real
> size of the array and move_uniform_array_access_to_pull_constants() will
> have no idea how large the array really is, so it will only be able to
> move part of the array to the pull constant buffer and later on the
> indirect array access will read past the end of the initialized area of
> the constant buffer.
Thanks for the reply. I'm still getting to know the i965 backend it would have
taken me a while to get to this point, make a lot more sense now. The code
below does the trick all of the indirect tests now pass.
Thanks so much for your help :)
>
> I think something like this would do what you want:
>
> > if (var->type->without_array()->is_image()) {
> > /* Images don't get a valid location assigned by nir_lower_io()
> > * because their size is driver-specific, so we need to allocate
> > * space for them here at the end of the parameter array.
> > */
> > var->data.driver_location = uniforms;
> > }
> > [...]
> > for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
> > if (storage->type->is_image()) {
> > param_size[var->data.driver_location] +=
> > BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
> > setup_image_uniform_values(storage);
> > } else {
> > [...]
> > }
> > }
>
> > [...]
More information about the mesa-dev
mailing list