[Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
Mika Kuoppala
mika.kuoppala at linux.intel.com
Fri Mar 10 10:42:34 UTC 2017
Chris Wilson <chris at chris-wilson.co.uk> writes:
> On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>>
>> > On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
>> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
>> >> > Chris Wilson <chris at chris-wilson.co.uk> writes:
>> >> >
>> >> > > We only need to clflush those cachelines that we have validated to be
>> >> > > read by the GPU. Userspace typically fills the batch length in
>> >> > > correctly, the exceptions tend to be explicit tests within igt.
>> >> > >
>> >> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> >> > > ---
>> >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
>> >> > > 1 file changed, 2 insertions(+), 1 deletion(-)
>> >> > >
>> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
>> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
>> >> > > }
>> >> > >
>> >> > > if (ret == 0 && needs_clflush_after)
>> >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
>> >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
>> >> > > + (void *)cmd - shadow_batch_obj->mm.mapping);
>> >> >
>> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
>> >> >
>> >> > We get away as the wb mapping being zero but for correctness.
>> >>
>> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
>> >> affect the clflush. Same as before.
>> >
>> > Although, not quite the same as before as before we didn't use a fixed
>> > end-point and so it could overshoot by a cacheline, even across a page
>> > boundary.
>>
>> And there could be PAGE_MASK worth of lowbits.
>
> There can? Wasn't anticipating adding a few thousand memory types, and
> then sharing a single cache. :-p
#define ptr_mask_bits(ptr) ({
\
unsigned long __v = (unsigned long)(ptr);
\
(typeof(ptr))(__v & PAGE_MASK);
\
})
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list