[Mesa-dev] mediump support: future work

Tue May 5 00:08:59 UTC 2020

16-bit varyings only make sense if they are packed, i.e. we need to fit two
16-bit vec4 varyings into one vec4 slot to save memory for IO. Without that,
AMD (and most others?) won't benefit from 16-bit IO much.
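
To make the packing concrete, here is a minimal sketch of fitting two 16-bit
vec4 values into the four 32-bit words of one vec4 slot. The function names
and the component layout (first vector in words 0-1, second in words 2-3) are
assumptions for illustration only, not what any Mesa driver actually does.

```python
import struct

def pack_two_half_vec4(a, b):
    """Pack two 4-component fp16 vectors into four 32-bit words.

    Hypothetical layout: a.xy -> word 0, a.zw -> word 1,
                         b.xy -> word 2, b.zw -> word 3.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both inputs must have exactly 4 components")
    # '<8e' is eight little-endian IEEE 754 half floats (16 bytes total).
    raw = struct.pack('<8e', *a, *b)
    # Reinterpret the same 16 bytes as four 32-bit words (one vec4 slot).
    return list(struct.unpack('<4I', raw))

def unpack_two_half_vec4(words):
    """Inverse of pack_two_half_vec4."""
    raw = struct.pack('<4I', *words)
    vals = struct.unpack('<8e', raw)
    return list(vals[:4]), list(vals[4:])

words = pack_two_half_vec4([1.0, 0.5, -2.0, 0.25], [4.0, 8.0, 0.0, -1.0])
print([hex(w) for w in words])
```

Unpacked, the same two varyings would occupy two vec4 slots (32 bytes)
instead of one (16 bytes).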

16-bit uniforms would help everybody, because there is potential for
uniform packing, saving memory (and cache lines).
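
As a rough illustration of the savings, the sketch below counts bytes and
cache lines for a tightly packed block of scalar uniforms. The 64-byte cache
line and the absence of any alignment or padding rules are assumptions; real
layouts may pad more.

```python
def uniform_footprint(num_scalars, bits_per_scalar, cache_line_bytes=64):
    """Return (total bytes, cache lines touched) for tightly packed scalars.

    Assumes no padding between scalars and a hypothetical cache line size.
    """
    total_bytes = num_scalars * bits_per_scalar // 8
    # Ceiling division: a partially used line still counts as one line.
    lines = -(-total_bytes // cache_line_bytes)
    return total_bytes, lines

print(uniform_footprint(48, 32))  # 48 scalars stored as 32-bit
print(uniform_footprint(48, 16))  # the same 48 scalars stored as 16-bit
```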

The other items are just for eliminating conversion instructions. We must
have more vectorized 16-bit vec2 instructions than "conversion
instructions + vec2 packing instructions" for mediump to pay off. We also
don't get decreased register usage if we are not vectorized, so mediump is
a tough sell at the moment.
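
The break-even condition can be written as a toy cost model. It assumes each
vectorized vec2 instruction replaces two scalar instructions (saving one) and
that each conversion or packing instruction costs one; real instruction costs
differ per GPU, so treat this only as a sketch of the inequality.

```python
def mediump_net_savings(vec2_ops, conversions, packs):
    """Instructions saved by vec2 ops minus the overhead added for them.

    Toy model: every vec2 op saves one instruction, every conversion or
    packing instruction costs one.
    """
    return vec2_ops - (conversions + packs)

def mediump_pays_off(vec2_ops, conversions, packs):
    """True only if vectorized ops outnumber conversions plus packing."""
    return mediump_net_savings(vec2_ops, conversions, packs) > 0

print(mediump_pays_off(10, 3, 2))  # many vec2 ops, little overhead
print(mediump_pays_off(2, 2, 1))   # overhead outweighs the vec2 ops
```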

Marek

On Mon, May 4, 2020 at 7:03 PM Rob Clark <robdclark at gmail.com> wrote:

> On Mon, May 4, 2020 at 11:44 AM Marek Olšák <maraeo at gmail.com> wrote:
> >
> > Hi,
> >
> > This is the status of mediump support in Mesa. The features listed are
> > what AMD GPUs can do. "Yes" means Mesa currently supports it.
> >
> > Feature                                               FP16 support  Int16 support
> > ----------------------------------------------------  ------------  -------------
> > ALU                                                   Yes           No
> > Uniforms                                              No            No
> > VS in                                                 No            No
> > VS out / FS in                                        No            No
> > FS out                                                No            No
> > TCS, TES, GS out / in                                 No            No
> > Sampler coordinates (only coord, derivs, lod, bias;   No            ---
> >   not offset and compare)
> > Image coordinates                                     ---           No
> > Return value from samplers (incl. sampler buffers)    Yes           No
> > Return value from image loads (incl. image buffers)   No            No
> > Data source for image stores (incl. image buffers)    No            No
> > If 16-bit sampler/image instructions are surrounded   No            No
> >   by conversions, promote them to 32 bits
> >
> > Please let me know if you don't see the table correctly.
> >
> > I'd like to know if I can enable some of them using the existing FP16
> > CAP. The only drivers supporting FP16 are currently Freedreno and Panfrost.
> >
>
> I think in general it should be ok.
>
> I think for ir3 we want 32b inputs/outputs for geom stages
> (vs/hs/ds/gs).  For frag outs we use nir_lower_mediump_outputs.. maybe
> this is a good approach to continue, to use a simple nir lowering pass
> for cases where a shader stage can directly take 16b input/output.
> For frag inputs we fold the narrowing conversion into the varying
> fetch instruction in the backend.
>
> int16 would be pretty useful, for loop counters especially.. these can
> have a long live-range and currently wastefully occupy a full 32b reg.
>
> Uniforms we haven't cared too much about, since we can (usually) read
> a 32b uniform as a 16b and fold that directly into alu instructions..
> we handle that in the backend.
>
> Pushing mediump support further would be great, and we can definitely
> help if it ends up needing changes in the freedreno backend.  The deqp
> coverage in CI should give us pretty good confidence about whether or
> not we are breaking things in the ir3 backend.
>
> BR,
> -R
>