[Intel-gfx] [PATCH] drm/i915/guc/slpc: remove unneeded clflush calls

Lucas De Marchi lucas.demarchi at intel.com
Thu Sep 23 05:37:03 UTC 2021


On Tue, Sep 21, 2021 at 04:06:00PM +0300, Ville Syrjälä wrote:
>On Mon, Sep 20, 2021 at 10:47:08PM -0700, Lucas De Marchi wrote:
>> On Wed, Sep 15, 2021 at 12:29:12PM -0700, John Harrison wrote:
>> >On 9/15/2021 12:24, Belgaumkar, Vinay wrote:
>> >>On 9/14/2021 12:51 PM, Lucas De Marchi wrote:
>> >>>The clflush calls here aren't doing anything since we are not writing
>> >>>something and flushing the cache lines to be visible to GuC. Here the
>> >>>intention seems to be to make sure whatever GuC has written is visible
>> >>>to the CPU before we read them. However a clflush from the CPU side is
>> >>>the wrong instruction to use.
>> >Is there a right instruction to use? Either we need to verify that no
>>
>> how can there be a right instruction? If the GuC needs to flush, then
>> the GuC needs to do it, nothing to be done by the CPU.
>>
>> Flushing the CPU cache line here is doing nothing to guarantee that what
>> was written by GuC hit the memory and we are reading it. Not sure why it
>> was actually added, but since it was added by Vinay and he reviewed this
>> patch, I'm assuming he also agrees
>
>clflush == writeback + invalidate. The invalidate is the important part
>when the CPU has to read something written by something else that's not
>cache coherent.

Although the invalidate would be the important part, how would that work
given that clflush also does a writeback? Wouldn't we be overwriting
whatever was written by the other side? Or are we relying on the fact
that we shouldn't be writing to this cacheline, so we know it's not dirty?
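
To make the question concrete, here is a toy model in C of a single
write-back cache line. It is only a sketch of the reasoning, not a
description of real hardware or of i915 code: every name in it (toy_line,
toy_clflush, device_write, ...) is made up for illustration, and it
assumes a device that writes straight to memory without snooping the CPU
cache.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical): one memory word and one CPU cache line over it. */
struct toy_line {
	uint32_t mem;     /* value in backing memory */
	uint32_t cached;  /* value held in the CPU cache line */
	bool valid;       /* CPU cache holds a copy */
	bool dirty;       /* CPU copy is newer than memory */
};

/* CPU load: a hit returns the cached copy, a miss fills from memory. */
uint32_t cpu_read(struct toy_line *l)
{
	if (!l->valid) {
		l->cached = l->mem;
		l->valid = true;
		l->dirty = false;
	}
	return l->cached;
}

/* CPU store with write-back caching: memory is not updated yet. */
void cpu_write(struct toy_line *l, uint32_t v)
{
	l->cached = v;
	l->valid = true;
	l->dirty = true;
}

/* Assumed non-coherent device write: goes to memory, no snoop. */
void device_write(struct toy_line *l, uint32_t v)
{
	l->mem = v;
}

/* Modelled clflush: write back if dirty, then invalidate. */
void toy_clflush(struct toy_line *l)
{
	if (l->valid && l->dirty)
		l->mem = l->cached;
	l->valid = false;
	l->dirty = false;
}
```

In this model a clean but stale line is fixed by toy_clflush(): the next
cpu_read() misses and picks up the device's value. A dirty line is the
problem case: the writeback lands on top of what the device wrote, which
is why the line must never be written by the CPU for this to be safe.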

>
>Now, I have no idea if the guc has its own (CPU invisible) caches or not.
>If it does then it will need to trigger a writeback. But regardless, if
>the guc bypasses the CPU caches the CPU will need to invalidate before
>it reads anything in case it has stale data sitting in its cache.

Indeed, thanks... but another case would be if the caches are kept
coherent through snooping. Do you know what the cache architecture
between GuC and CPU is?

Another question comes to mind, but first some context: I'm looking
at this in order to support other archs besides x86... the only
platforms where this would be relevant are the discrete ones
(I'm currently running an arm64 guest on qemu and using PCI
passthrough). I see that for dgfx intel_guc_allocate_vma() uses
i915_gem_object_create_lmem() instead of i915_gem_object_create_shmem().
Would that make a difference?

thanks
Lucas De Marchi

