[Mesa-dev] [PATCH 10/20] i965/fs: Implement image load, store and atomic.

Francisco Jerez currojerez at riseup.net
Fri Jul 31 06:15:15 PDT 2015


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Thu, Jul 23, 2015 at 4:38 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>
>>> This all looks correct as far as I can tell.  However, I'm very
>>> concerned about the number of checks such as
>>> has_matching_typed_format() that are built-in to the compiler (via
>>> surface_builder) where we then go on to do something that is highly
>>> dependant on state setup doing the exact same check (not via
>>> surface_builder) and doing the right thing.  Another example would be
>>> the interplay with has_split_bit_layout.  How do we know these won't
>>> get out of sync?
>>>
>> They cannot get out of sync because the policy which surface format to
>> use for a given GLSL format is shared between the compiler and the
>> state-setup code, see brw_lower_mesa_image_format() in "i965: Implement
>> surface state set-up for shader images.".
>
> I came back to this today and looked at it again.  I think my main
> objection is that all of these metadata functions work in terms of a
> devinfo and a format and then immediately call
> brw_lower_mesa_image_format.  Why don't they work instead in terms
> comparing the lowered format to the nominal format?  Then it would be
> a lot more obvious that you're doing a transformation from one format
> to another.  As is, it's a bunch of magic "what do I have to do next"
> queries 

I don't think any of these functions implies what to do next, they all
encode clearly defined device-specific properties of the formats:

 - Does the device support any format with the same bit layout?
   (has_supported_bit_layout())

 - Does the device support any format with the same bit layout *and*
   encoding? (is_conversion_trivial())

 - Does the device support a typed format of the same size?
   (has_matching_typed_format())

 - Does the device split the pixel data into a number of discontiguous
   segments? (has_split_bit_layout())

 - Does the device return garbage in the unused bits when reading from
   the given format? (has_undefined_high_bits())

 - Do all fields of the format have the same size? (This is not even
   device-specific)

 - Is the format represented as a signed integer in memory? (Ditto)

You may have noticed that except for the last two the natural subject
and object of all questions are "device" and "format" respectively --
I'm not sure I see how rephrasing them to involve the lowered format
(that in fact a number of them don't care about) would make any of these
questions easier to understand.

> which, while in the same namespace, are in a different file.

I happen to have cleaned that up to be in the same file a short while
ago -- They're no longer used by anything else outside of
brw_fs_surface_builder.cpp:
http://cgit.freedesktop.org/~currojerez/mesa/commit/?h=image-load-store-lower&id=6b83876a343cabb89083212d20d5b3721b10c200

> It would also be more efficient because you would only call
> brw_lower_mesa_image_format once.

A more elegant way to "fix" that would be to mark
brw_lower_mesa_image_format() as pure, or enable LTO -- It's not like it
would make any measurable difference in practice though.

>
> Does that make more sense?
> --Jason
>
>> Thanks.
>>
>>> On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>>> v2: Drop VEC4 suport.
>>>> v3: Rebase.
>>>> ---
>>>>  .../drivers/dri/i965/brw_fs_surface_builder.cpp    | 216 +++++++++++++++++++++
>>>>  src/mesa/drivers/dri/i965/brw_fs_surface_builder.h |  17 ++
>>>>  2 files changed, 233 insertions(+)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
>>>> index ea1c4aa..46b449f 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
>>>> @@ -587,3 +587,219 @@ namespace {
>>>>        }
>>>>     }
>>>>  }
>>>> +
>>>> +namespace brw {
>>>> +   namespace image_access {
>>>> +      /**
>>>> +       * Load a vector from a surface of the given format and dimensionality
>>>> +       * at the given coordinates.
>>>> +       */
>>>> +      fs_reg
>>>> +      emit_image_load(const fs_builder &bld,
>>>> +                      const fs_reg &image, const fs_reg &addr,
>>>> +                      mesa_format format, unsigned dims)
>>>> +      {
>>>> +         using namespace image_format_info;
>>>> +         using namespace image_format_conversion;
>>>> +         using namespace image_validity;
>>>> +         using namespace image_coordinates;
>>>> +         using namespace surface_access;
>>>> +         const brw_device_info *devinfo = bld.shader->devinfo;
>>>> +         const mesa_format lower_format =
>>>> +            brw_lower_mesa_image_format(devinfo, format);
>>>> +         fs_reg tmp;
>>>> +
>>>> +         if (has_matching_typed_format(devinfo, format)) {
>>>> +            /* Hopefully we get here most of the time... */
>>>> +            tmp = emit_typed_read(bld, image, addr, dims,
>>>> +                                  _mesa_format_num_components(lower_format));
>>>> +         } else {
>>>> +            /* Untyped surface reads return 32 bits of the surface per
>>>> +             * component, without any sort of unpacking or type conversion,
>>>> +             */
>>>> +            const unsigned size = _mesa_get_format_bytes(format) / 4;
>>>> +
>>>> +            /* they don't properly handle out of bounds access, so we have to
>>>> +             * check manually if the coordinates are valid and predicate the
>>>> +             * surface read on the result,
>>>> +             */
>>>> +            const brw_predicate pred =
>>>> +               emit_bounds_check(bld, image, addr, dims);
>>>> +
>>>> +            /* and they don't know about surface coordinates, we need to
>>>> +             * convert them to a raw memory offset.
>>>> +             */
>>>> +            const fs_reg laddr = emit_address_calculation(bld, image, addr, dims);
>>>> +
>>>> +            tmp = emit_untyped_read(bld, image, laddr, 1, size, pred);
>>>> +
>>>> +            /* An out of bounds surface access should give zero as result. */
>>>> +            for (unsigned c = 0; c < 4; ++c)
>>>> +               set_predicate(pred, bld.SEL(offset(tmp, bld, c),
>>>> +                                           offset(tmp, bld, c), fs_reg(0)));
>>>> +         }
>>>> +
>>>> +         /* Set the register type to D instead of UD if the data type is
>>>> +          * represented as a signed integer in memory so that sign extension
>>>> +          * is handled correctly by unpack.
>>>> +          */
>>>> +         if (needs_sign_extension(format))
>>>> +            tmp = retype(tmp, BRW_REGISTER_TYPE_D);
>>>> +
>>>> +         if (!has_supported_bit_layout(devinfo, format)) {
>>>> +            /* Unpack individual vector components from the bitfield if the
>>>> +             * hardware is unable to do it for us.
>>>> +             */
>>>> +            if (has_split_bit_layout(devinfo, format))
>>>> +               tmp = emit_pack(bld, tmp, get_bit_shifts(lower_format),
>>>> +                               get_bit_widths(lower_format));
>>>> +            else
>>>> +               tmp = emit_unpack(bld, tmp, get_bit_shifts(format),
>>>> +                                 get_bit_widths(format));
>>>> +
>>>> +         } else if ((needs_sign_extension(format) &&
>>>> +                     !is_conversion_trivial(devinfo, format)) ||
>>>> +                    has_undefined_high_bits(devinfo, format)) {
>>>> +            /* Perform a trivial unpack even though the bit layout matches in
>>>> +             * order to get the most significant bits of each component
>>>> +             * initialized properly.
>>>> +             */
>>>> +            tmp = emit_unpack(bld, tmp, color_u(0, 32, 64, 96),
>>>> +                              get_bit_widths(format));
>>>> +         }
>>>> +
>>>> +         if (!_mesa_is_format_integer(format)) {
>>>> +            if (is_conversion_trivial(devinfo, format)) {
>>>> +               /* Just need to cast the vector to the target type. */
>>>> +               tmp = retype(tmp, BRW_REGISTER_TYPE_F);
>>>> +            } else {
>>>> +               /* Do the right sort of type conversion to float. */
>>>> +               if (_mesa_get_format_datatype(format) == GL_FLOAT)
>>>> +                  tmp = emit_convert_from_float(
>>>> +                     bld, tmp, get_bit_widths(format));
>>>> +               else
>>>> +                  tmp = emit_convert_from_scaled(
>>>> +                     bld, tmp, get_bit_widths(format),
>>>> +                     _mesa_is_format_signed(format));
>>>> +            }
>>>> +         }
>>>> +
>>>> +         /* Initialize missing components of the result. */
>>>> +         return emit_pad(bld, tmp, get_bit_widths(format));
>>>> +      }
>>>> +
>>>> +      /**
>>>> +       * Store a vector in a surface of the given format and dimensionality at
>>>> +       * the given coordinates.
>>>> +       */
>>>> +      void
>>>> +      emit_image_store(const fs_builder &bld, const fs_reg &image,
>>>> +                       const fs_reg &addr, const fs_reg &src,
>>>> +                       mesa_format format, unsigned dims)
>>>> +      {
>>>> +         using namespace image_format_info;
>>>> +         using namespace image_format_conversion;
>>>> +         using namespace image_validity;
>>>> +         using namespace image_coordinates;
>>>> +         using namespace surface_access;
>>>> +         const brw_device_info *devinfo = bld.shader->devinfo;
>>>> +
>>>> +         if (format == MESA_FORMAT_NONE) {
>>>> +            /* We don't know what the format is, but that's fine because it
>>>> +             * implies write-only access, and typed surface writes are always
>>>> +             * able to take care of type conversion and packing for us.
>>>> +             */
>>>> +            emit_typed_write(bld, image, addr, src, dims, 4);
>>>> +
>>>> +         } else {
>>>> +            const mesa_format lower_format =
>>>> +               brw_lower_mesa_image_format(devinfo, format);
>>>> +            fs_reg tmp = src;
>>>> +
>>>> +            if (!is_conversion_trivial(devinfo, format)) {
>>>> +               /* Do the right sort of type conversion. */
>>>> +               if (_mesa_get_format_datatype(format) == GL_FLOAT)
>>>> +                  tmp = emit_convert_to_float(bld, tmp, get_bit_widths(format));
>>>> +
>>>> +               else if (_mesa_is_format_integer(format))
>>>> +                  tmp = emit_convert_to_integer(bld, tmp, get_bit_widths(format),
>>>> +                                                _mesa_is_format_signed(format));
>>>> +
>>>> +               else
>>>> +                  tmp = emit_convert_to_scaled(bld, tmp, get_bit_widths(format),
>>>> +                                               _mesa_is_format_signed(format));
>>>> +            }
>>>> +
>>>> +            /* We're down to bit manipulation at this point. */
>>>> +            tmp = retype(tmp, BRW_REGISTER_TYPE_UD);
>>>> +
>>>> +            if (!has_supported_bit_layout(devinfo, format)) {
>>>> +               /* Pack the vector components into a bitfield if the hardware
>>>> +                * is unable to do it for us.
>>>> +                */
>>>> +               if (has_split_bit_layout(devinfo, format))
>>>> +                  tmp = emit_unpack(bld, tmp, get_bit_shifts(lower_format),
>>>> +                                    get_bit_widths(lower_format));
>>>> +
>>>> +               else
>>>> +                  tmp = emit_pack(bld, tmp, get_bit_shifts(format),
>>>> +                                  get_bit_widths(format));
>>>> +            }
>>>> +
>>>> +            if (has_matching_typed_format(devinfo, format)) {
>>>> +               /* Hopefully we get here most of the time... */
>>>> +               emit_typed_write(bld, image, addr, tmp, dims,
>>>> +                                _mesa_format_num_components(lower_format));
>>>> +
>>>> +            } else {
>>>> +               /* Untyped surface writes store 32 bits of the surface per
>>>> +                * component, without any sort of packing or type conversion,
>>>> +                */
>>>> +               const unsigned size = _mesa_get_format_bytes(format) / 4;
>>>> +
>>>> +               /* they don't properly handle out of bounds access, so we have
>>>> +                * to check manually if the coordinates are valid and predicate
>>>> +                * the surface write on the result,
>>>> +                */
>>>> +               const brw_predicate pred =
>>>> +                  emit_bounds_check(bld, image, addr, dims);
>>>> +
>>>> +               /* and, phew, they don't know about surface coordinates, we
>>>> +                * need to convert them to a raw memory offset.
>>>> +                */
>>>> +               const fs_reg laddr = emit_address_calculation(bld, image, addr, dims);
>>>> +
>>>> +               emit_untyped_write(bld, image, laddr, tmp, 1, size, pred);
>>>> +            }
>>>> +         }
>>>> +      }
>>>> +
>>>> +      /**
>>>> +       * Perform an atomic read-modify-write operation in a surface of the
>>>> +       * given dimensionality at the given coordinates.  Main building block
>>>> +       * of the imageAtomic GLSL built-ins.
>>>> +       */
>>>> +      fs_reg
>>>> +      emit_image_atomic(const fs_builder &bld,
>>>> +                        const fs_reg &image, const fs_reg &addr,
>>>> +                        const fs_reg &src0, const fs_reg &src1,
>>>> +                        unsigned dims, unsigned rsize, unsigned op)
>>>> +      {
>>>> +         using namespace image_validity;
>>>> +         using namespace image_coordinates;
>>>> +         using namespace surface_access;
>>>> +         /* Avoid performing an atomic operation on an unbound surface. */
>>>> +         const brw_predicate pred = emit_surface_check(bld, image);
>>>> +
>>>> +         /* Thankfully we can do without untyped atomics here. */
>>>> +         const fs_reg tmp = emit_typed_atomic(bld, image, addr, src0, src1,
>>>> +                                              dims, rsize, op, pred);
>>>> +
>>>> +         /* An unbound surface access should give zero as result. */
>>>> +         if (rsize)
>>>> +            set_predicate(pred, bld.SEL(tmp, tmp, fs_reg(0)));
>>>> +
>>>> +         return tmp;
>>>> +      }
>>>> +   }
>>>> +}
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h
>>>> index 5424be6..cad8c27 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h
>>>> @@ -213,5 +213,22 @@ namespace brw {
>>>>                   _mesa_get_format_datatype(format) == GL_INT);
>>>>        }
>>>>     }
>>>> +
>>>> +   namespace image_access {
>>>> +      fs_reg
>>>> +      emit_image_load(const fs_builder &bld,
>>>> +                      const fs_reg &image, const fs_reg &addr,
>>>> +                      mesa_format format, unsigned dims);
>>>> +
>>>> +      void
>>>> +      emit_image_store(const fs_builder &bld, const fs_reg &image,
>>>> +                       const fs_reg &addr, const fs_reg &src,
>>>> +                       mesa_format format, unsigned dims);
>>>> +      fs_reg
>>>> +      emit_image_atomic(const fs_builder &bld,
>>>> +                        const fs_reg &image, const fs_reg &addr,
>>>> +                        const fs_reg &src0, const fs_reg &src1,
>>>> +                        unsigned dims, unsigned rsize, unsigned op);
>>>> +   }
>>>>  }
>>>>  #endif
>>>> --
>>>> 2.4.3
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150731/36cd4851/attachment-0001.sig>


More information about the mesa-dev mailing list