[PATCH 4/4] drm/nouveau: introduce CPU cache flushing macro

Thierry Reding thierry.reding at gmail.com
Mon May 19 03:03:17 PDT 2014


On Mon, May 19, 2014 at 11:22:11AM +0200, Lucas Stach wrote:
> On Monday, 19.05.2014 at 11:02 +0200, Thierry Reding wrote:
> > On Mon, May 19, 2014 at 04:10:58PM +0900, Alexandre Courbot wrote:
> > > Some architectures (e.g. ARM) need the CPU buffers to be explicitly
> > > flushed for a memory write to take effect. Not doing so results in
> > > synchronization issues, especially after writing to BOs.
> > 
> > It seems to me that the above is generally true for all architectures,
> > not just ARM.
> > 
> No, on PCI coherent arches, like x86 and some PowerPCs, the GPU will
> snoop the CPU caches and therefore an explicit cache flush is not
> required.

I was criticizing the wording in the commit message. Perhaps it could be
enhanced with what you just said.

> > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> > [...]
> > > index 0886f47e5244..b9c9729c5733 100644
> > > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> > > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> > > @@ -437,8 +437,10 @@ nouveau_bo_wr16(struct nouveau_bo *nvbo, unsigned index, u16 val)
> > >  	mem = &mem[index];
> > >  	if (is_iomem)
> > >  		iowrite16_native(val, (void __force __iomem *)mem);
> > > -	else
> > > +	else {
> > >  		*mem = val;
> > > +		nv_cpu_cache_flush_area(mem, 2);
> > > +	}
> > >  }
> > >  
> > >  u32
> > > @@ -461,8 +463,10 @@ nouveau_bo_wr32(struct nouveau_bo *nvbo, unsigned index, u32 val)
> > >  	mem = &mem[index];
> > >  	if (is_iomem)
> > >  		iowrite32_native(val, (void __force __iomem *)mem);
> > > -	else
> > > +	else {
> > >  		*mem = val;
> > > +		nv_cpu_cache_flush_area(mem, 4);
> > > +	}
> > 
> > This looks rather like a sledgehammer to me. Effectively this turns nvbo
> > into an uncached buffer, with the additional overhead of constantly
> > flushing caches. Wouldn't it make more sense to locate the places where
> > these are called and flush the cache once all the writes have completed?
> > 
> I don't think the explicit flushing for those things makes sense. I
> think it is a lot more effective to just map the BOs write-combined on
> PCI non-coherent arches. This way any writes will be buffered. Reads
> will be slow, but I don't think nouveau is reading back a lot from those
> buffers.
> Using the write-combining buffer doesn't need any additional
> synchronization as it will get flushed on pushbuf kickoff anyway.

Sounds good to me.
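
For comparison, the alternative raised earlier in the thread (flush once
after all writes have completed, rather than after every single write)
could be sketched roughly as below. This is a userspace model only, not
nouveau code: struct bo_sim, bo_sim_init(), bo_sim_wr32() and
bo_sim_flush() are hypothetical names made up for illustration, and the
"flush" merely counts calls and bytes instead of doing any real cache
maintenance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU-cached buffer object; not a nouveau type. */
struct bo_sim {
	uint32_t *mem;           /* backing storage */
	size_t    count;         /* number of 32-bit words in mem */
	size_t    dirty_lo;      /* first dirty word index */
	size_t    dirty_hi;      /* one past the last dirty word; lo == hi means clean */
	unsigned  flushes;       /* how many simulated flushes were issued */
	size_t    flushed_bytes; /* total bytes covered by those flushes */
};

static void bo_sim_init(struct bo_sim *bo, uint32_t *mem, size_t count)
{
	bo->mem = mem;
	bo->count = count;
	bo->dirty_lo = bo->dirty_hi = 0;
	bo->flushes = 0;
	bo->flushed_bytes = 0;
}

/* Write one word and only widen the recorded dirty range; no flush here. */
static void bo_sim_wr32(struct bo_sim *bo, size_t index, uint32_t val)
{
	assert(index < bo->count);
	bo->mem[index] = val;

	if (bo->dirty_lo == bo->dirty_hi) {
		/* Buffer was clean: the dirty range is just this word. */
		bo->dirty_lo = index;
		bo->dirty_hi = index + 1;
	} else {
		if (index < bo->dirty_lo)
			bo->dirty_lo = index;
		if (index + 1 > bo->dirty_hi)
			bo->dirty_hi = index + 1;
	}
}

/* Issue a single simulated flush covering everything written so far. */
static void bo_sim_flush(struct bo_sim *bo)
{
	if (bo->dirty_lo == bo->dirty_hi)
		return; /* nothing written since the last flush */

	/*
	 * A real driver would perform the architecture's cache maintenance
	 * on the dirty range here. This sketch only does the accounting.
	 */
	bo->flushes++;
	bo->flushed_bytes += (bo->dirty_hi - bo->dirty_lo) * sizeof(uint32_t);
	bo->dirty_lo = bo->dirty_hi = 0;
}
```

A caller would issue any number of bo_sim_wr32() calls and then a single
bo_sim_flush() at the point where the device is about to look at the
buffer. The trade-off is that the flush covers the whole span between the
lowest and highest written index, including untouched words in between.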

Thierry