[Intel-gfx] [PATCH 17/19] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

Chris Wilson chris at chris-wilson.co.uk
Sat Aug 20 08:40:54 UTC 2016


On Sat, Aug 20, 2016 at 12:14:46PM +0530, Goel, Akash wrote:
> 
> 
> On 8/19/2016 11:49 PM, Chris Wilson wrote:
> >On Fri, Aug 19, 2016 at 02:13:16PM +0530, akash.goel at intel.com wrote:
> >>From: Akash Goel <akash.goel at intel.com>
> >>
> >>In order to have fast reads from the GuC log buffer, use the SSE4.1 movntdqa
> >>based memcpy function i915_memcpy_from_wc.
> >>GuC log buffer has a WC type vmalloc mapping and copying using movntdqa
> >>from WC type memory is almost as fast as reading from WB memory.
> >>This will further reduce the log buffer sampling time, so is needed dearly
> >>to deal with the flush interrupt storm when GuC is generating logs at a
> >>very high rate.
> >>Ideally SSE 4.1 should be present on all chipsets supporting GuC based
> >>submissions, but if not then logging will not be enabled.
> >>
> >>v2: Rebase.
> >>
> >>Suggested-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>Signed-off-by: Akash Goel <akash.goel at intel.com>
> >>Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >
> >Should be squashed with patch 16 (use MAP_WC).
> Fine, will squash, but could you please explain what issue there would
> be with keeping the 2 patches separate? Either both will be merged or
> neither of them will be.

On further reflection, the initial patch that maps the log should already
use MAP_WC. Correctness first, then the performance patch.

As it stands, the story being told is:

1. Enable readback.
2. Oops, that isn't correct, better map it WC
3. Oops, that is too slow, better use movntdqa.

A better story is

1. Enable readback, enforcing coherency.
2. Accelerate readback for !llc.
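
For illustration, the accelerated readback in step 2 might look roughly
like the userspace sketch below. It is not the kernel's
i915_memcpy_from_wc: the name memcpy_from_wc_sketch and its
return-false-on-unsupported contract are assumptions made for this
example, and the streaming load only pays off when the source really is
WC-mapped memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define SKETCH_HAVE_X86 1
#endif

#ifdef SKETCH_HAVE_X86
/* Copy len bytes in 16-byte chunks using the SSE4.1 streaming load. */
__attribute__((target("sse4.1")))
static void copy_stream_sse41(void *dst, const void *src, size_t len)
{
	__m128i *d = dst;
	const __m128i *s = src;
	size_t i;

	for (i = 0; i < len / 16; i++) {
		/* movntdqa: non-temporal load, needs a 16-byte aligned source */
		__m128i v = _mm_stream_load_si128((__m128i *)&s[i]);

		_mm_store_si128(&d[i], v);
	}
}
#endif

/*
 * Hypothetical helper: returns true if the copy was done with movntdqa,
 * false if the caller has to fall back to a plain copy (no SSE4.1, or
 * pointers/length not 16-byte aligned).
 */
bool memcpy_from_wc_sketch(void *dst, const void *src, size_t len)
{
	if (((uintptr_t)dst | (uintptr_t)src | len) & 15)
		return false;

#ifdef SKETCH_HAVE_X86
	if (__builtin_cpu_supports("sse4.1")) {
		copy_stream_sse41(dst, src, len);
		return true;
	}
#endif
	return false;
}
```

A caller would try the fast path first and use an ordinary memcpy() when
the helper returns false, so correctness never depends on SSE4.1 being
present.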

-- 
Chris Wilson, Intel Open Source Technology Centre

