[Mesa-dev] [PATCH 7/7] i965: Drop non-LLC lunacy in the program cache code.

Emil Velikov emil.l.velikov at gmail.com
Mon Jul 24 16:43:27 UTC 2017


On 24 July 2017 at 15:51, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Monday, July 24, 2017 3:54:11 AM PDT Emil Velikov wrote:
>> Hi Ken,
>>
>> Admittedly I'm not an expert in the area, so perhaps a rather silly question.
>>
>> On 22 July 2017 at 00:17, Kenneth Graunke <kenneth at whitecape.org> wrote:
>>
>> > +#ifdef USE_SSE41
>> > +      if (!cache->bo->cache_coherent && cpu_has_sse4_1)
>> > +         _mesa_streaming_load_memcpy(map, cache->map, cache->next_offset);
>> > +      else
>> > +#endif
>> > +         memcpy(map, cache->map, cache->next_offset);
>> The other user of _mesa_streaming_load_memcpy -
>> intel_miptree_map/intel_miptree_map_movntdqa does not seem to check
>> for the coherency flag.
>>
>> Which makes me wonder:
>> Did you intentionally combine the SSE4.1 check with the
>> !cache_coherent one? Should there be a similar check in the miptree
>> code, or are the two cases orthogonal?
>>
>> Thanks
>> Emil
>>
>
> The other code uses brw->has_llc.  Basically, on LLC platforms, all
> buffers other than scanout are coherent.  On non-LLC, almost all buffers
> are non-coherent.  We originally didn't have a bo->cache_coherent flag,
> and used brw->has_llc as the distinguishing factor.
>
> On non-LLC, you can make buffers coherent by enabling snooping (but it's
> expensive).  We haven't ever done that yet, though Chris has patches to
> do so for query object buffers, where we want the CPU and GPU to be able
> to read an "I'm done!" flag for CheckQuery.
>
> So, it would probably be reasonable to change intel_miptree_map to use
> bo->cache_coherent instead of brw->has_llc, though this is unlikely to
> matter in practice since snooping for textures doesn't make much sense,
> and on LLC systems, we're probably not going to map the scanout buffer.
>
> MOVNTDQA gives faster streaming read performance when sourcing from
> uncached memory, apparently.  Non-coherent BOs bypass the CPU caches,
> so we want to use it there.
>
>
Thank you very much for the comprehensive answer, Ken.

Seems like I've missed the brw->has_llc piece in use_intel_mipree_map_blit().
Reading the "processor implementation may ..." part of the following
page [1] did not help much either ;-)
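
For reference, the selection logic described above could be sketched
roughly as follows. This is only an illustration, not the actual i965
code: struct sketch_bo and every sketch_* function are hypothetical
stand-ins (the last one loosely mimics what _mesa_streaming_load_memcpy
is used for), and the build is assumed to define __SSE4_1__ when SSE4.1
is enabled, otherwise everything falls back to plain memcpy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE4_1__)
#include <smmintrin.h>
#endif

/* Hypothetical stand-in for the driver's buffer object; only the
 * coherency flag discussed in this thread is modelled. */
struct sketch_bo {
   bool cache_coherent;
};

/* Streaming loads only pay off when the source bypasses the CPU caches,
 * i.e. the BO is non-coherent, and the CPU supports SSE4.1. */
static bool
sketch_use_streaming(const struct sketch_bo *bo, bool cpu_has_sse4_1)
{
   return !bo->cache_coherent && cpu_has_sse4_1;
}

/* Hypothetical streaming copy: uses MOVNTDQA loads when the build
 * enables SSE4.1 and the source is 16-byte aligned, memcpy otherwise. */
static void
sketch_streaming_copy(void *dst, const void *src, size_t len)
{
   uint8_t *d = dst;
   const uint8_t *s = src;

#if defined(__SSE4_1__)
   if (((uintptr_t)s & 15) == 0) {
      while (len >= 16) {
         __m128i v = _mm_stream_load_si128((__m128i *)(uintptr_t)s);
         _mm_storeu_si128((__m128i *)d, v);
         s += 16;
         d += 16;
         len -= 16;
      }
   }
#endif
   /* Copy the tail, or everything if no streaming load was possible. */
   memcpy(d, s, len);
}

/* Mirrors the shape of the hunk quoted above: pick the copy routine
 * from the BO's coherency flag and the CPU feature flag. */
static void
sketch_copy_from_bo(void *dst, const struct sketch_bo *bo,
                    const void *map, size_t len, bool cpu_has_sse4_1)
{
   assert(dst != NULL && bo != NULL && map != NULL);

   if (sketch_use_streaming(bo, cpu_has_sse4_1))
      sketch_streaming_copy(dst, map, len);
   else
      memcpy(dst, map, len);
}
```

If the miptree code were switched over as Ken suggests, the idea would
be that it consults the per-BO flag rather than a device-wide one; the
copy itself would stay the same.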

-Emil

[1] http://www.felixcloutier.com/x86/MOVNTDQA.html

