[Mesa-dev] [PATCH 00/10] glsl: Implement varying packing.
stereotype441 at gmail.com
Wed Dec 12 13:23:33 PST 2012
On 12 December 2012 12:53, Marek Olšák <maraeo at gmail.com> wrote:
> On Wed, Dec 12, 2012 at 9:21 PM, Eric Anholt <eric at anholt.net> wrote:
> > Marek Olšák <maraeo at gmail.com> writes:
> >> On Wed, Dec 12, 2012 at 5:06 PM, Paul Berry <stereotype441 at gmail.com> wrote:
> >>> On 11 December 2012 23:49, Aras Pranckevicius <aras at unity3d.com> wrote:
> >>>> Not sure if relevant for Mesa, but e.g. on PowerVR SGX it's really
> >>>> bad to pack two vec2 texture coordinates into a single vec4. That's
> >>>> because a var.xy texture read can be "prefetched", whereas a var.zw
> >>>> texture read is not prefetched (essentially treated as a dependent
> >>>> texture read), and causes stalls in the shader execution.
> >>> Interesting--I had not thought of that possibility. On i965 all
> >>> reads have to be done explicitly by the fragment shader (there is no
> >>> prefetching IIRC), so this penalty doesn't apply. Does anyone know if
> >>> a penalty like this exists in any of Mesa's other back-ends? If so,
> >>> that suggests some good experiments to try. I'm open to revising my
> >>> opinion if someone measures a significant performance degradation,
> >>> particularly with a real-world app.
> >> R300 and R400 support 4 texture indirections (as defined by
> >> ARB_fragment_program). Adding ALU instructions before the first TEX
> >> instruction increases the number of texture indirections by 1, which
> >> might make some shaders not be executable on the hardware at all.
> >> I think this optimization should be disabled on drivers where the
> >> texture indirection limit is too low.
> > And are swizzles of texcoords required to be separate MOVs beforehand
> > (like on i915)?
> Yes, swizzles aren't supported by the TEX instruction and must be
> lowered. And the lowering sucks, because the only supported 3D source
> operand swizzles are .xxx, .yyy, .zzz, .www, .yzw, .zxy, .wzy, .111,
> .000, and 0.HHH (H=0.5), so the swizzle can occupy up to 3 MOV
> instructions. The 4th channel is handled by a separate scalar
> instruction, which is independent of the 3D instruction. (R300 can
> execute one 3D and one scalar instruction simultaneously)
Ok, unless I hear objections, I'll rework the patch series so that the
driver can opt out of varying packing (e.g. by setting
Const.DisableVaryingPacking or some such). I'll add an assertion to verify
that drivers that opt out of varying packing don't support transform
feedback (so that we don't have to go to extra work to support transform
feedback of both packed and unpacked varyings).
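
To make the proposal concrete, here is a minimal sketch of the opt-out
check, not code from the patch series. The struct, the
SupportsTransformFeedback field and the function name are hypothetical
stand-ins; only the DisableVaryingPacking name comes from the suggestion
above, and even that may change in v2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver constants kept in the GL context. */
struct sketch_constants {
   bool DisableVaryingPacking;      /* driver opts out of varying packing */
   bool SupportsTransformFeedback;  /* hypothetical capability flag */
};

/* Returns true if the linker should pack varyings for this driver. */
static bool
sketch_should_pack_varyings(const struct sketch_constants *consts)
{
   if (consts->DisableVaryingPacking) {
      /* Opting out is only allowed without transform feedback, so the
       * linker never has to capture unpacked varyings.
       */
      assert(!consts->SupportsTransformFeedback);
      return false;
   }
   return true;
}
```

Under these assumptions, a driver with a low texture indirection limit
would set DisableVaryingPacking and leave transform feedback unsupported,
while every other driver keeps the default and gets packed varyings.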
I don't expect the rework to change too many things, so feel free to
review the patch series as-is and I'll fold your review into v2 when I get
to it (probably in the next day or two).