[Mesa-dev] ipers performance regression (Was: u_upload_mgr changes)

Mon Jan 4 06:03:06 PST 2016

On Mon, Jan 4, 2016 at 2:09 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
> On 04/01/16 12:25, Jose Fonseca wrote:
>>
>> On 21/12/15 22:35, Marek Olšák wrote:
>>>
>>> Hi,
>>>
>>> This patch series adds more flexibility to u_upload_mgr. First, it
>>> adds the ability to specify the alignment per suballocation. The idea
>>> is that several users can use the same upload buffer, but each may
>>> need a different alignment. Finally, it allows specifying PIPE_USAGE,
>>> which usually affects memory placement.
>>>
>>> Please review.
>>>
>>> Marek
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>
>> Hi Marek,
>>
>> This patch series, or the commit
>>
>>
>>
>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=36c93a6fae275614b6004ec5ab085774d527e1bc
>
>
> I bisected and it turns out that 36c93a6fae275614b6004ec5ab085774d527e1bc is
> the bad commit, and not the u_upload_mgr series.
>
> In particular the issue steams from the hunk:
> +
> +   /* We uploaded modified constants, need to invalidate them. */
> +   st->dirty.mesa |= _NEW_PROGRAM_CONSTANTS;
>
> I suspect that drawing the text via glBitmap suddently became a huge hotspot
> becuase of this.  If one disables the help text (pressing h) there's no
> performance regression (frame rate is the same).
>
>
>
> The need for the constants is explained above:
>
>   /* As an optimization, Mesa's fragment programs will sometimes get the
>       * primary color from a statevar/constant rather than a varying
> variable.
>       * when that's the case, we need to ensure that we use the 'color'
>       * parameter and not the current attribute color (which may have
> changed
>       * through glRasterPos and state validation.
>       * So, we force the proper color here.  Not elegant, but it works.
>       */
>      {
>         GLfloat colorSave[4];
>         COPY_4V(colorSave, ctx->Current.Attrib[VERT_ATTRIB_COLOR0]);
>         COPY_4V(ctx->Current.Attrib[VERT_ATTRIB_COLOR0], color);
>         st_upload_constants(st, st->fp->Base.Base.Parameters,
>                             PIPE_SHADER_FRAGMENT);
>         COPY_4V(ctx->Current.Attrib[VERT_ATTRIB_COLOR0], colorSave);
>      }
>
> I wonder if putting colors in constants as opposed to vertex buffer with
> zero stride or an INT_MAX instancing divisor is really a good idea.

It's a good idea from the GPU perspective.

Using a vertex buffer is:
- one load instruction per thread in vertex shaders (64 per thread
group, though the cache should hide 63 of them in theory)
- the vertex shader output must be written (64 per thread group), so
on-chip memory must be allocated, which reduces parallelism just like
register usage reduces parallelism
- the pixel shader must load and interpolate the input (interpolation
= 3 on-chip vec4 loads per thread, 2 MAD instructions per thread, all
multiplied by 64)

Using a constant buffer is:
- 1 load instruction per entire pixel shader thread group on GCN,
because the constant load executes on a separate scalar unit and not
on SIMD units. This is best for performance and low power.

> Anyway, if this only affects llvmpipe, no biggie, but if this change makes
> text rendering via glBitmap much slower for all HW drivers, then we should
> probably revisit this.

IIRC, ipers is CPU-bound, so any GPU improvement won't be seen here,
but any additional overhead in st/mesa will hurt. Also, do we have
know any real apps using glBitmap?

Marek