[Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

Sun Sep 22 12:49:07 UTC 2019

+your collabora address

On Sun, 22 Sep 2019 08:31:40 -0400
Alyssa Rosenzweig <alyssa at rosenzweig.io> wrote:

> > > Although actually I am not at all sure what this batch_add_bo is doing
> > > at all?
> > > 
> > > I think this batch_add_bo should probably dropped altogether? This loop
> > > is dealing with constant buffers; the shaders themselves were added  
> > 
> > I'll double check. I couldn't find where BOs containing shader programs
> > were added last time I looked.  
> 
> Masking a real bug :o
> 
> It should probably happen in panfrost_patch_shader_state....?

Ok, I'll add it there, I wasn't sure this function was called for all
shaders, but looking at the code a second time it seems to be the case.

> 
> > > >  void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch)
> > > >  {
> > > > +        uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE |
> > > > +                         PAN_BO_ACCESS_VERTEX_TILER |
> > > > +                         PAN_BO_ACCESS_FRAGMENT;    
> > > 
> > > I think we can drop VERTEX_TILER here...? The buffers are written right
> > > at the end of the FRAGMENT job, not touched before that.  
> > 
> > What about the read done when drawing the wallpaper? I guess it's also
> > only read by the fragment job, but I wasn't sure.  
> 
> As I stated before, I thought we should be adding the BO for
> wallpapering when we actually wallpaper, since that's a slow path. Not
> wallpapering is the default and ideally what most apps should do.

Wallpapering happens too late (when we are flushing the batch) to have
an impact on the dep graph, but we can probably know that wallpapering
will be needed before that. My question remains though, are
vertex/tiler supposed to touch the texture BO they're reading from, or
should we only flag the BO for FRAGMENT use.

> 
> > > If nothing else is broken, this should allow a nice perf boost with
> > > pipelining, so the vertex/tiler from frame n+1 can run in parallel with
> > > the fragment of frame n (rather than blocking on frame n finishing with
> > > the FBOs).  
> > 
> > Would require the kernel patches I posted earlier for that to
> > happen ;-). Right now all jobs touching the same BO are serialized
> > because of the implicit BO fences added by the kernel driver.  
> 
> Sure~ Maybe this sort of bug was the reason you weren't seeing
> improvement from those kernel patches?

Maybe.