[Mesa-dev] [PATCH v2 2/5] panfrost: Allocate shaders in their own BOs

Fri Aug 9 17:53:51 UTC 2019

On Fri, Aug 9, 2019 at 3:31 AM Tomeu Vizoso <tomeu.vizoso at collabora.com> wrote:
>
> On Thu, 8 Aug 2019 at 16:19, Rob Herring <robh at kernel.org> wrote:
> >
> > On Wed, Aug 7, 2019 at 11:23 PM Tomeu Vizoso <tomeu.vizoso at collabora.com> wrote:
> > >
> > > On Thu, 8 Aug 2019 at 00:47, Rob Herring <robh at kernel.org> wrote:
> > > >
> > > > On Wed, Aug 7, 2019 at 2:37 AM Tomeu Vizoso <tomeu.vizoso at collabora.com> wrote:
> > > > >
> > > > > Instead of all shaders being stored in a single BO, have each shader in
> > > > > its own.
> > > > >
> > > > > This removes the need for a 16MB allocation per context, and allows us
> > > > > to place transient blend shaders in BOs marked as executable (before
> > > > > they were allocated in the transient pool, which shouldn't be
> > > > > executable).
> > > > >
> > > > > v2: - Store compiled blend shaders in a malloc'ed buffer, to avoid
> > > > >       reading from GPU-accessible memory when patching (Alyssa).
> > > > >     - Free struct panfrost_blend_shader (Alyssa).
> > > > >     - Give the job a reference to regular shaders when emitting
> > > > >       (Alyssa).
> > > > >
> > > > > Signed-off-by: Tomeu Vizoso <tomeu.vizoso at collabora.com>
> > > >
> > > >
> > > > > diff --git a/src/gallium/drivers/panfrost/pan_bo_cache.c b/src/gallium/drivers/panfrost/pan_bo_cache.c
> > > > > index fba495c1dd69..7378d0a8abea 100644
> > > > > --- a/src/gallium/drivers/panfrost/pan_bo_cache.c
> > > > > +++ b/src/gallium/drivers/panfrost/pan_bo_cache.c
> > > > > @@ -84,11 +84,10 @@ panfrost_bo_cache_fetch(
> > > > >  {
> > > > >          struct list_head *bucket = pan_bucket(screen, size);
> > > > >
> > > > > -        /* TODO: Honour flags? */
> > > > > -
> > > > >          /* Iterate the bucket looking for something suitable */
> > > > >          list_for_each_entry_safe(struct panfrost_bo, entry, bucket, link) {
> > > > > -                if (entry->size >= size) {
> > > > > +                if (entry->size >= size &&
> > > > > +                    entry->flags == flags) {
> > > >
> > > > This change probably warrants its own patch IMO.
> > >
> > > Agreed.
> > >
> > > > This is using the
> > > > untranslated flags, but I think it should be the 'translated_flags' as
> > > > those are the ones changing the allocation. For example, I don't think
> > > > there's any reason for DELAYED_MMAP to be used as a match criteria
> > > > (BTW, I'm also not sure if we can reclaim BOs if they remain mmap'ed).
> > > >
> > > > Another problem I see, if we have a 100MB buffer in the cache, would
> > > > we really want to hit on a 4KB allocation? Perhaps a 'entry->size * 2
> > > > < size' check.
> > >
> > > Yeah, as mentioned in the v1 discussion, we have plenty of room for
> > > improvements here, but the goal now is just to stop doing memory
> > > allocation so greedily that we reach OOM after launching a few GL
> > > clients.
> >
> > Sure. IMO, committing the BO cache without madvise was a mistake.
> > Without madvise, 2 instances of glmark will OOM.
>
> How can I test that? I just checked here and I'm running 10 instances
> of it within gnome-shell with 1GB still free (from a total of 2GB).
> This is with HEAP support, without it we'd be still allocating one
> 16MB buffer per context, but it's still not that bad.
>
> It used to be pretty bad when we were allocating gigantic buffers on
> context creation, just to be safe. But Mesa master now is much more
> careful with that and I think .
>
> > I should be able to
> > send out the patch for it today. I think it's going to need to disable
> > caching when madvise is not supported.
>
> Can you check if that would be still needed, please?

It definitely seems better now.

I think before I also had lots of debug dmsg going to disk and filling
the page cache. I tried reproducing that, but still seems okay now.

Rob