[Intel-gfx] [PATCH] drm: Explicitly compute the last cacheline for clflush on range
Chris Wilson
chris at chris-wilson.co.uk
Sun Oct 18 05:28:11 PDT 2015
On Sat, Oct 17, 2015 at 11:03:19PM +0300, Imre Deak wrote:
> On Fri, 2015-10-16 at 20:55 +0100, Chris Wilson wrote:
> > Fixes regression from
> >
> > commit afcd950cafea6e27b739fe7772cbbeed37d05b8b
> > Author: Chris Wilson <chris at chris-wilson.co.uk>
> > Date: Wed Jun 10 15:58:01 2015 +0100
> >
> > drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range()
> >
> > I'm stumped. Looking at the loop we should be iterating over every cache
> > line until we reach the start of the cacheline after the end of the
> > virtual range. Evidence says otherwise.
> >
> > More bizarrely, I stored the last address to be clflushed and found it to
> > be equal to the start of the cacheline containing the last byte. Doubly
> > perplexed.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92501
> > Testcase: gem_tiled_partial_pwrite_pread/reads
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Imre Deak <imre.deak at intel.com>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > ---
> > drivers/gpu/drm/drm_cache.c | 9 ++++++---
> > 1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> > index 6743ff7dccfa..7c909bc8b68a 100644
> > --- a/drivers/gpu/drm/drm_cache.c
> > +++ b/drivers/gpu/drm/drm_cache.c
> > @@ -131,10 +131,13 @@ drm_clflush_virt_range(void *addr, unsigned long length)
> >  #if defined(CONFIG_X86)
> >  	if (cpu_has_clflush) {
> >  		const int size = boot_cpu_data.x86_clflush_size;
> > -		void *end = addr + length;
> > -		addr = (void *)(((unsigned long)addr) & -size);
> > +		void *end;
> > +
> > +		end = (void *)(((unsigned long)addr + length - 1) & -size);
> > +		addr = (void *)((unsigned long)addr & -size);
> > +
> >  		mb();
> > -		for (; addr < end; addr += size)
> > +		for (; addr <= end; addr += size)
>
> Hm, I can't see how this could make any difference. The old way still
> looks ok to me and the new version would flush the exact same cache
> lines as the old one using the same addresses (beginning of each cache
> line).
I couldn't spot the difference either. I am beginning to suspect it is
gcc, as the following
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 6743ff7..c9097b5 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -130,11 +130,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
 {
 #if defined(CONFIG_X86)
 	if (cpu_has_clflush) {
 		const int size = boot_cpu_data.x86_clflush_size;
-		void *end = addr + length;
+		void *end = addr + length - 1;
 		addr = (void *)(((unsigned long)addr) & -size);
 		mb();
-		for (; addr < end; addr += size)
+		for (; addr <= end; addr += size)
 			clflushopt(addr);
 		mb();
 		return;
Also fixes gem_tiled_partial_pwrite (on byt and bsw).
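
For anyone who wants to check the arithmetic outside the kernel, here is a
minimal userspace sketch of the two loop forms. It only counts the cachelines
each loop would visit and records the last one; nothing is actually flushed.
The names lines_exclusive(), lines_inclusive() and CACHELINE are made up for
illustration, the 64-byte line size is an assumption standing in for
boot_cpu_data.x86_clflush_size, and addresses are plain integers rather than
void pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cacheline size; stands in for boot_cpu_data.x86_clflush_size. */
#define CACHELINE 64u

/*
 * Original form: 'end' is one past the last byte and the loop runs while
 * addr < end. Returns the number of cachelines visited and stores the
 * last visited line address in *last (0 if no line was visited).
 */
static size_t lines_exclusive(uintptr_t addr, size_t length, uintptr_t *last)
{
	uintptr_t end = addr + length;
	size_t n = 0;

	*last = 0;
	for (addr &= -(uintptr_t)CACHELINE; addr < end; addr += CACHELINE) {
		*last = addr;
		n++;
	}
	return n;
}

/*
 * Patched form: 'end' is the start of the cacheline holding the last byte
 * and the loop runs while addr <= end. Same return convention as above.
 */
static size_t lines_inclusive(uintptr_t addr, size_t length, uintptr_t *last)
{
	uintptr_t end = (addr + length - 1) & -(uintptr_t)CACHELINE;
	size_t n = 0;

	*last = 0;
	for (addr &= -(uintptr_t)CACHELINE; addr <= end; addr += CACHELINE) {
		*last = addr;
		n++;
	}
	return n;
}
```

For a range starting at 0x1010 with length 0x100, both helpers visit five
lines and stop at 0x1100, the line containing the last byte. On paper the two
forms are equivalent, which is what makes the observed behaviour so odd.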
-Chris
--
Chris Wilson, Intel Open Source Technology Centre