[igt-dev] [PATCH v3 1/3] lib/rendercopy: Add AUX page table support

Imre Deak imre.deak at intel.com
Wed Nov 6 19:04:10 UTC 2019


On Wed, Nov 06, 2019 at 05:02:30PM +0000, Chris Wilson wrote:
> Quoting Imre Deak (2019-11-06 16:36:49)
> > On Wed, Nov 06, 2019 at 04:14:36PM +0000, Chris Wilson wrote:
> > > Quoting Imre Deak (2019-11-06 16:00:16)
> > > > On Wed, Nov 06, 2019 at 11:11:40AM +0000, Chris Wilson wrote:
> > > > > What's the coherency model for the pgt->bo? There's a lot of writes here
> > > > > from the CPU how are you making sure they are visible to the GPU?
> > > > 
> > > > My understanding: execbuf will do a clflush for the buf on non-LLC
> > > > platforms, while on LLC platforms that's not needed; not sure if there
> > > > will be any non-LLC platforms where the AUX pagetable will be needed.
> > > 
> > > No, we don't unless you tell us you modified it between batches. That
> > > would be the case if you were using drm_intel_bo_map(true) everytime,
> > > but again that will then stall the GPU between every batch -- which we
> > > definitely do not want when testing [as it will hide every bug] :)
> > 
> > Do you see a problem with the way I program it here though? I call
> > drm_intel_bo_map(true) for the purpose of programming a new table. If
> > you see a problem with that what do you suggest instead?
> 
> You create a new pagetable for every batch. Good for solitary render
> copies, not so good for validation :( I expect we are invoking too many
> stalls to be able to detect missing flushes and invalidations.
> 
> I would not suggest this interface is suitable long term. There is a
> large leap between this and userspace, that I feel like we should be
> covering more of the stepping stones so that userspace can trust the
> layers it is built on [for some value of trust].

Yes, but the stall you mention should only happen on non-LLC platofrms.
At least on LLC we don't need to flush CPU caches, so it's not clear
what would cause the stall.

An API to create the pagetable at a higher level would be more efficient
in any case. Are you ok implementing that as a follow-up and starting
with the current approach?

> > > > On top of that the AUX GAM has its own caches - found that now in an HSD
> > > > doc - which will be invalidated whenever GAM's top level table base
> > > > pointer register is set from the context image, which happens whenver
> > > > switching to the context.
> > > 
> > > But not for a lite-restore?
> > 
> > We still write the top level table regsiter from the batch, so that
> > could in itself trigger the invalidation.
> 
> I would not expect that to be true for real clients. Or is it the only
> recommended programming model?

The spec doesn't detail programming models, it only describes the page
table format. I suppose programming the pgt pointer only once for a
context and then using CCS_AUX_INV later after entries have been
modified would be another way.

> > > So we have a problem when updating existing entries between batches
> > > inside the same context? We put a big hammer at the front of the
> > > request for TLB invalidation. If there's a flush we can add, we need
> > > it there.
> > 
> > Yes, there is, see the CCS_AUX_INV reg in bspec. But I suspect this
> > would have the same effect as writing the table ptr register, so not
> > sure when exactly this would be needed. 
> 
> But the top level base pointer will not be updated on a lite restore;
> are you sure the invalidation will occur if sublevels of the tt are
> updated?

No, that doesn't. So if the batch wouldn't write the top level table
register, we would need to invalidate the caches using CCS_AUX_INV.

> The good thing is that this is all virtual addresses, though there is a
> slight danger with using system pages if the TLB is not flushed before
> release (i.e. it can still access the page contents after someone else
> uses them).

You mean we'd need to invalidate the AUX pgt caches when releasing the
pages for a surface? Agreed that makes sense for robustness, even though
I can't see how writes could go astray if we invalidate the graphics
address mapping for the surface anyway.

--Imre


More information about the igt-dev mailing list