[Intel-gfx] [PATCH 02/11] drm/i915: Avoid forcing relocations through the mappable GTT or CPU

Daniel Vetter daniel at ffwll.ch
Mon Jan 14 21:53:14 CET 2013


On Mon, Jan 14, 2013 at 9:50 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> On Mon, 14 Jan 2013 21:04:06 +0100, Daniel Vetter <daniel at ffwll.ch> wrote:
>> On Mon, Jan 14, 2013 at 8:21 PM, Imre Deak <imre.deak at intel.com> wrote:
>> >> +                     drm_clflush_virt_range(vaddr + page_offset, 4);
>> >> +                     *(uint32_t *)(vaddr + page_offset) = reloc->delta;
>> >> +                     drm_clflush_virt_range(vaddr + page_offset, 4);
>> >
>> > Discussed this already to some degree with Chris, but I still think the
>> > first cache flush is redundant.
>>
>> Nope, since it's a partial cacheline write, we need to first flush out
>> any stale data, then write the dword (which loads the latest data from
>> memory into a cacheline). Then we need to flush the updated cacheline
>> out into main memory again. Iirc I've mentioned this somewhere in the
>> part 4 of my gem intro.
>
> The question was whether there could be any stale data in that aliased
> cacheline, and whether or not the code was a continuation of a cargo
> cult. My answer was that it was our working knowledge that that flush
> is crucial to unconfuse the CPU. In the absence of a definitive
> reference for the mysteries of clflush, we should write one.

Iirc for pwrite, the clflush _before_ a partial cacheline write is
crucial for correctness. At least I remember writing tests for it,
which successfully blow up if you drop that flush ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list