[Mesa-dev] [PATCH 2/5] panfrost: Allocate shaders in their own BOs

Alyssa Rosenzweig alyssa at rosenzweig.io
Mon Aug 5 16:55:46 UTC 2019


> +        for (unsigned c = 0; c < 4; ++c) {
> +                struct panfrost_blend_rt *rt = &blend->rt[c];
> +                _mesa_hash_table_u64_clear(rt->shaders, panfrost_delete_blend_shader);
> +        }

What's the implication  of the clear call? Does that recursively free
the blend CSOs due to ralloc infrastructure?

> +                memcpy(final.shader.bo->cpu, shader->bo->cpu, shader->size);

I'm not totally comfortable with reading off CPU-writeable GPU-mapped
memory. We do it for scoreboarding too (which is probably a bug). But as
discussed, this may have performance and/or security implications that
we don't really understand. Maybe we might be better off having a purely
CPU copy of shaders we'll need to patch?

> +                    entry->flags == flags) {

This is definitely necessary, +1

I'm wondering if we should be optimizing this somehow but we can sail
that boat when we get there, I guess.

> @@ -2035,6 +2041,7 @@ panfrost_bind_shader_state(
>          enum pipe_shader_type type)
>  {
>          struct panfrost_context *ctx = pan_context(pctx);
> +        struct panfrost_job *job = panfrost_get_job_for_fbo(ctx);
>  
>          ctx->shader[type] = hwcso;
>  
> @@ -2098,6 +2105,8 @@ panfrost_bind_shader_state(
>  
>                  shader_state->compiled = true;
>          }
> +
> +        panfrost_job_add_bo(job, shader_state->bo);

This doesn't strike me as quite right. Conceptually, a shader state
could be bound on one frame and then just reused for 2000 frames in a
row, no? I guess that's not something you'll hit in CTS / normal apps,
but it still strikes me as something we need to pay attention to.

I think we want the add_bo() call to be part of emit_for_draw().

> +        /* To maximize BO cache usage, don't allocate tiny BOs */
> +        size = MAX2(size, 4096);

FWIW I think we round up to 4k anyway elsewhere? Also, how does this
interact with shaders -- if you have a bunch of tiny shaders of 128
bytes each, you'll be wasting 3968 bytes per shader, no? At least the
old strategy  (with the 16MB slab) allowed subdivision with byte
granularity.


More information about the mesa-dev mailing list