[Intel-gfx] [PATCH 21/42] drm/i915: Implement pwrite without struct-mutex
Chris Wilson
chris at chris-wilson.co.uk
Thu Oct 13 11:54:07 UTC 2016
On Thu, Oct 13, 2016 at 02:17:52PM +0300, Joonas Lahtinen wrote:
> On pe, 2016-10-07 at 10:46 +0100, Chris Wilson wrote:
> > +/* Per-page copy function for the shmem pwrite fastpath.
> > + * Flushes invalid cachelines before writing to the target if
> > + * needs_clflush_before is set and flushes out any written cachelines after
> > + * writing if needs_clflush is set.
> > + */
> > static int
> > -i915_gem_shmem_pwrite(struct drm_device *dev,
> > - struct drm_i915_gem_object *obj,
> > - struct drm_i915_gem_pwrite *args,
> > - struct drm_file *file)
> > +shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
> > + bool page_do_bit17_swizzling,
> > + bool needs_clflush_before,
> > + bool needs_clflush_after)
>
> I remember having complaints about two bool arguments in the same func.
> Could these three be fixed while mangling the code otherwise too? Or as
> a follow-up.
They are three different evaluations on each loop. Looks to be a bit of a
hassle to amalgamate them into one in the caller. Though

	(page_to_phys(page) & BIT(17)) |
	((offset | length) & 63) |
	(needs_clflush & 2)

almost works (just need to choose another bit for needs_clflush). I'm
not convinced you'd like the unpacking in shmem_pwrite() ;)
Something like:
static int
shmem_pwrite_slow(struct page *page, int offset, int length,
		  char __user *user_data,
		  unsigned flags)
{
	char *vaddr;
	int ret;

	vaddr = kmap(page);
	if (unlikely(flags & (BIT(17) | 63)))
		shmem_clflush_swizzled_range(vaddr + offset, length, flags);
	if (flags & BIT(17))
		ret = __copy_from_user_swizzled(vaddr, offset, user_data,
						length);
	else
		ret = __copy_from_user(vaddr + offset, user_data, length);
	if (flags & BIT(16))
		shmem_clflush_swizzled_range(vaddr + offset, length, flags);
	kunmap(page);

	return ret ? -EFAULT : 0;
}
> > +static int
> > +i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
> > + const struct drm_i915_gem_pwrite *args)
> > +{
> > + void __user *user_data;
> > + u64 remain;
> > + unsigned int obj_do_bit17_swizzling;
> > + unsigned int partial_cacheline_write;
>
> partial_cacheline_mask might be more descriptive?
To me that says we want to mask off the partial cachelines, or use them
constructively. Imho,

	len & partial_cacheline_write

reads more clearly as a boolean question than

	len & partial_cacheline_mask
-Chris
--
Chris Wilson, Intel Open Source Technology Centre