[Intel-gfx] [PATCH 0/4] Drop wbinvd_on_all_cpus usage

Mon Mar 21 14:43:14 UTC 2022

On 21/03/2022 13:40, Thomas Hellström wrote:
> Hi,
> 
> On Mon, 2022-03-21 at 13:12 +0000, Tvrtko Ursulin wrote:
>>
>> On 21/03/2022 12:33, Thomas Hellström wrote:
>>> On Mon, 2022-03-21 at 12:22 +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 21/03/2022 11:03, Thomas Hellström wrote:
>>>>> Hi, Tvrtko.
>>>>>
>>>>> On 3/21/22 11:27, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 19/03/2022 19:42, Michael Cheng wrote:
>>>>>>> To align with the discussion in [1][2], this patch series
>>>>>>> drops
>>>>>>> all
>>>>>>> usage of
>>>>>>> wbvind_on_all_cpus within i915 by either replacing the call
>>>>>>> with certain
>>>>>>> drm clflush helpers, or reverting to a previous logic.
>>>>>>
>>>>>> AFAIU, complaint from [1] was that it is wrong to provide non
>>>>>> x86
>>>>>> implementations under the wbinvd_on_all_cpus name. Instead an
>>>>>> arch
>>>>>> agnostic helper which achieves the same effect could be
>>>>>> created.
>>>>>> Does
>>>>>> Arm have such concept?
>>>>>
>>>>> I also understand Linus' email like we shouldn't leak incoherent
>>>>> IO
>>>>> to
>>>>> other architectures, meaning any remaining wbinvd()s should be
>>>>> X86
>>>>> only.
>>>>
>>>> The last part is completely obvious since it is a x86 instruction
>>>> name.
>>>
>>> Yeah, I meant the function implementing wbinvd() semantics.
>>>
>>>>
>>>> But I think we can't pick a solution until we know how the concept
>>>> maps
>>>> to Arm and that will also include seeing how the drm_clflush_sg for
>>>> Arm
>>>> would look. Is there a range based solution, or just a big hammer
>>>> there.
>>>> If the latter, then it is no good to churn all these reverts but
>>>> instead
>>>> an arch agnostic wrapper, with a generic name, would be the way to
>>>> go.
>>>
>>> But my impression was that ARM would not need the range-based
>>> interface
>>> either, because ARM is only for discrete and with discrete we're
>>> always
>>> coherent.
>>
>> Not sure what you mean here - what about flushing system memory objects
>> on discrete? Those still need flushing on paths like suspend which this
>> series touches. Am I missing something?
> 
> System bos on discrete should always have
> 
> I915_BO_CACHE_COHERENT_FOR_READ | I915_BO_CACHE_COHERENT_FOR_WRITE
> 
> either by the gpu being fully cache coherent (or us mapping system
> write-combined). Hence no need for cache clflushes or wbinvd() for
> incoherent IO.

Hmm so you are talking about the shmem ttm backend. It ends up depending on the result of i915_ttm_cache_level, yes? It cannot end up with I915_CACHE_NONE from that function?

I also found in i915_drm.h:

	 * As caching mode when specifying `I915_MMAP_OFFSET_FIXED`, WC or WB will
	 * be used, depending on the object placement on creation. WB will be used
	 * when the object can only exist in system memory, WC otherwise.

If what you say is true, that on discrete it is _always_ WC, then that needs updating as well.

> 
> That's adhering to Linus'
> 
> "And I sincerely hope to the gods that no cache-incoherent i915 mess
> ever makes it out of the x86 world. Incoherent IO was always a
> historical mistake and should never ever happen again, so we should
> not spread that horrific pattern around."

Sure, but I was not talking about IO - just the CPU side access to CPU side objects.

Regards,

Tvrtko