[Mesa-dev] [RFC 2/2] i965: add support for image AoA

Francisco Jerez currojerez at riseup.net
Sun Aug 16 03:15:13 PDT 2015


Timothy Arceri <t_arceri at yahoo.com.au> writes:

> On Sat, 2015-08-15 at 17:33 +0300, Francisco Jerez wrote:
>> Timothy Arceri <t_arceri at yahoo.com.au> writes:
>> 
>> > On Wed, 2015-08-12 at 19:39 +1000, Timothy Arceri wrote:
>> > > Cc: Francisco Jerez <currojerez at riseup.net>
>> > > ---
>> > >  This patch works for direct indexing of images, but I'm having some 
>> > > trouble
>> > >  getting indirect to work.
>> > > 
>> > >  For example for:
>> > >  tests/spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore
>> > >  -non-const-uniform-index.shader_test
>> > > 
>> > >  Which has and image writeonly uniform image2D tex[2][2]
>> > > 
>> > >  Indirect indexing will work for tex[0][0] and text[0][1] but not for
>> > >  tex[1][0] and tex[1][1] they seem to always end up refering to the
>> > >  image in 0.
>> > 
>> > Just to add some more to this, I'm pretty sure my code is generating the
>> > correct offsets. If I hardcode the img_offset offset to 72 to get the 
>> > uniform
>> > value of tex[1][0] I get the value I expected, but if I set image.reladdr 
>> > to a
>> > register that contains 72 I don't what I expect.
>> > 
>> > If I change the array to be a single dimension e.g. tex[4] and I hardcode 
>> > the
>> > offset as described above then it works as expected for both scenarios, it
>> > also works if I split the offset across img_offset and image.reladdr, 
>> > there is
>> > something going on with image.reladdr for multi-dimensional arrays that I 
>> > can'
>> > t quite put my finger on.
>> > 
>> > Any hints appreciated.
>> > 
>> Odd, can you attach an assembly dump?
>> 
>> Thanks.
>
> I wasn't sure what would be the most helpful so I've attached a few different
> dumps.
>
> image_dump = 1D array indirect piglit test, without this patch (Result=pass)
> image_dump2 = 2D array indirect piglit test, with this patch (Result=fail)
> image_dump3 = 1D array indirect piglit test, with this patch (Result=pass)
>
> image_dump4 = 1D array indirect piglit test, hardcoded register with 72 offset
>   (Result=pass)
> image_dump5 = 2D array indirect piglit test, hardcoded register with 72 offset
> (Result=fail)
>
> image_dump4 vs image_dump5 is interesting because the output matches which is
> what I would have expected, but the result differs. Then with the offset below
> it seems to work as expected suggesting everything else is setup correctly.
>
> image_dump6 = 1D array indirect piglit test, hardcoded 72 offset in img_offset
>    (Result=pass)
> image_dump7 = 2D array indirect piglit test, hardcoded 72 offset in img_offset
>    (Result=pass)
>  

Yeah... The assembly output looks correct to me in the cases that fail.
AFAICT what all the failing cases have in common is that the array of
image uniforms is demoted to pull constants (see demote_pull_constants()
and move_uniform_array_access_to_pull_constants()), so most likely
something goes wrong while doing that for multidimensional arrays.  I
have the suspicion that this is the source of the problem:

| --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
| +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
| @@ -226,6 +226,7 @@ fs_visitor::nir_setup_uniform(nir_variable *var)
|        * our name.
|        */
|     unsigned index = var->data.driver_location;
| +   bool set_image_location = true;
|     for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
|        struct gl_uniform_storage *storage = &shader_prog->UniformStorage[u];
|
| @@ -244,7 +245,13 @@ fs_visitor::nir_setup_uniform(nir_variable *var)
|            * because their size is driver-specific, so we need to allocate
|            * space for them here at the end of the parameter array.
|            */
| -         var->data.driver_location = uniforms;
| +         if (set_image_location) {
| +            /* For arrays of arrays we only want to set this once at the base
| +             * location.
| +             */
| +            var->data.driver_location = uniforms;
| +            set_image_location = false;
| +         }
|           param_size[uniforms] =
|              BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);

The assumption that this code was previously making was that a given
image array would be stored in a single gl_uniform_storage entry,
otherwise param_size[var->data.driver_location] won't match the real
size of the array and move_uniform_array_access_to_pull_constants() will
have no idea how large the array really is, so it will only be able to
move part of the array to the pull constant buffer and later on the
indirect array access will read past the end of the initialized area of
the constant buffer.

I think something like this would do what you want:

|     if (var->type->without_array()->is_image()) {
|        /* Images don't get a valid location assigned by nir_lower_io()
|         * because their size is driver-specific, so we need to allocate
|         * space for them here at the end of the parameter array.
|         */
|        var->data.driver_location = uniforms;
|     }
|[...]
|     for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
|        if (storage->type->is_image()) {
|           param_size[var->data.driver_location] +=
|              BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
|           setup_image_uniform_values(storage);
|        } else {
|[...]
|        }              
|     }

>[...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150816/cbae44d9/attachment-0001.sig>


More information about the mesa-dev mailing list