[igt-dev] [PATCH v3 1/3] lib/rendercopy: Add AUX page table support

Chris Wilson chris at chris-wilson.co.uk
Wed Nov 6 21:25:03 UTC 2019


Quoting Imre Deak (2019-11-06 19:04:10)
> On Wed, Nov 06, 2019 at 05:02:30PM +0000, Chris Wilson wrote:
> > Quoting Imre Deak (2019-11-06 16:36:49)
> > > On Wed, Nov 06, 2019 at 04:14:36PM +0000, Chris Wilson wrote:
> > > > Quoting Imre Deak (2019-11-06 16:00:16)
> > > > > On Wed, Nov 06, 2019 at 11:11:40AM +0000, Chris Wilson wrote:
> > > > > > What's the coherency model for the pgt->bo? There's a lot of writes here
> > > > > > from the CPU how are you making sure they are visible to the GPU?
> > > > > 
> > > > > My understanding: execbuf will do a clflush for the buf on non-LLC
> > > > > platforms, while on LLC platforms that's not needed; not sure if there
> > > > > will be any non-LLC platforms where the AUX pagetable will be needed.
> > > > 
> > > > No, we don't unless you tell us you modified it between batches. That
> > > > would be the case if you were using drm_intel_bo_map(true) everytime,
> > > > but again that will then stall the GPU between every batch -- which we
> > > > definitely do not want when testing [as it will hide every bug] :)
> > > 
> > > Do you see a problem with the way I program it here though? I call
> > > drm_intel_bo_map(true) for the purpose of programming a new table. If
> > > you see a problem with that what do you suggest instead?
> > 
> > You create a new pagetable for every batch. Good for solitary render
> > copies, not so good for validation :( I expect we are invoking too many
> > stalls to be able to detect missing flushes and invalidations.
> > 
> > I would not suggest this interface is suitable long term. There is a
> > large leap between this and userspace, that I feel like we should be
> > covering more of the stepping stones so that userspace can trust the
> > layers it is built on [for some value of trust].
> 
> Yes, but the stall you mention should only happen on non-LLC platofrms.
> At least on LLC we don't need to flush CPU caches, so it's not clear
> what would cause the stall.

Even LLC would be stalled due to the sync point inside the
drm_intel_bo_map(). If we were reusing page tables between batches...

> An API to create the pagetable at a higher level would be more efficient
> in any case. Are you ok implementing that as a follow-up and starting
> with the current approach?

I'm ok with accepting this as proof we can do end-to-end compression
onto scanout. But there's plenty of work if we want to try and break the
kernel :-p

> > > > > On top of that the AUX GAM has its own caches - found that now in an HSD
> > > > > doc - which will be invalidated whenever GAM's top level table base
> > > > > pointer register is set from the context image, which happens whenver
> > > > > switching to the context.
> > > > 
> > > > But not for a lite-restore?
> > > 
> > > We still write the top level table regsiter from the batch, so that
> > > could in itself trigger the invalidation.
> > 
> > I would not expect that to be true for real clients. Or is it the only
> > recommended programming model?
> 
> The spec doesn't detail programming models, it only describes the page
> table format. I suppose programming the pgt pointer only once for a
> context and then using CCS_AUX_INV later after entries have been
> modified would be another way.
> 
> > > > So we have a problem when updating existing entries between batches
> > > > inside the same context? We put a big hammer at the front of the
> > > > request for TLB invalidation. If there's a flush we can add, we need
> > > > it there.
> > > 
> > > Yes, there is, see the CCS_AUX_INV reg in bspec. But I suspect this
> > > would have the same effect as writing the table ptr register, so not
> > > sure when exactly this would be needed. 
> > 
> > But the top level base pointer will not be updated on a lite restore;
> > are you sure the invalidation will occur if sublevels of the tt are
> > updated?
> 
> No, that doesn't. So if the batch wouldn't write the top level table
> register, we would need to invalidate the caches using CCS_AUX_INV.

I think we're good leaving it to userspace poking around in its own page
tables to invalidate as it needs, except for the caveat below... (Which
is outside the flow of batches)
 
> > The good thing is that this is all virtual addresses, though there is a
> > slight danger with using system pages if the TLB is not flushed before
> > release (i.e. it can still access the page contents after someone else
> > uses them).
> 
> You mean we'd need to invalidate the AUX pgt caches when releasing the
> pages for a surface? Agreed that makes sense for robustness, even though
> I can't see how writes could go astray if we invalidate the graphics
> address mapping for the surface anyway.

* mutters

(The issue is that we don't invalidate before the surface goes away, so
there is^W may be a window of opportunity for peeking and poking.)
-Chris


More information about the igt-dev mailing list