[Intel-gfx] [PATCH 1/4] i915/gem: drop wbinvd_on_all_cpus usage

Michael Cheng michael.cheng at intel.com
Mon Mar 21 17:42:03 UTC 2022


On 2022-03-21 10:28 a.m., Tvrtko Ursulin wrote:
>
> On 21/03/2022 16:31, Michael Cheng wrote:
>> On 2022-03-21 3:30 a.m., Tvrtko Ursulin wrote:
>>
>>>
>>> On 19/03/2022 19:42, Michael Cheng wrote:
>>>> The previous concern with using drm_clflush_sg was that we don't know
>>>> what the sg_table is pointing to, hence the use of wbinvd_on_all_cpus
>>>> to flush everything at once out of paranoia.
>>>
>>> And now we know, or we know it is not a concern?
>>>
>>>> To make i915 more architecture-neutral and less paranoid, let's
>>>> attempt to
>>>
>>> "Lets attempt" as we don't know if this will work and/or what 
>>> can/will break?
>>
>> Yes, but it seems there are no regressions with IGT.
>>
>> If there's a big hit in performance, or if this solution gets accepted
>> and the bug reports come flying in, we can explore other solutions. But
>> from speaking to Dan Vetter, the ideal solution would be to avoid any
>> direct calls to wbinvd and use the drm helpers in their place.
>>
>> +Daniel for any extra input.
>>
>>>> use drm_clflush_sg to flush the pages when the GPU wants to read
>>>> from main memory.
>>>>
>>>> Signed-off-by: Michael Cheng <michael.cheng at intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 ++-------
>>>>   1 file changed, 2 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>>> index f5062d0c6333..b0a5baaebc43 100644
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>>> @@ -8,6 +8,7 @@
>>>>   #include <linux/highmem.h>
>>>>   #include <linux/dma-resv.h>
>>>>   #include <linux/module.h>
>>>> +#include <drm/drm_cache.h>
>>>>
>>>>   #include <asm/smp.h>
>>>>
>>>> @@ -250,16 +251,10 @@ static int i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj)
>>>>        * DG1 is special here since it still snoops transactions even with
>>>>        * CACHE_NONE. This is not the case with other HAS_SNOOP platforms. We
>>>>        * might need to revisit this as we add new discrete platforms.
>>>> -     *
>>>> -     * XXX: Consider doing a vmap flush or something, where possible.
>>>> -     * Currently we just do a heavy handed wbinvd_on_all_cpus() here since
>>>> -     * the underlying sg_table might not even point to struct pages, so we
>>>> -     * can't just call drm_clflush_sg or similar, like we do elsewhere in
>>>> -     * the driver.
>>>>        */
>>>>       if (i915_gem_object_can_bypass_llc(obj) ||
>>>>           (!HAS_LLC(i915) && !IS_DG1(i915)))
>>>> -        wbinvd_on_all_cpus();
>>>> +        drm_clflush_sg(pages);
>>>
>>> And as noticed before, drm_clflush_sg can still call
>>> wbinvd_on_all_cpus, so are you just punting the issue somewhere
>>> else? How will it be solved there?
>>>
>> Instead of calling x86 asm directly, we are using what is available to
>> make the driver more architecture-neutral. Agreeing with Thomas, this
>> solution falls within "prefer range-aware clflush APIs", and since some
>> older generation platforms don't support clflushopt, it will fall back
>> to using wbinvd.
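
To expand on that: a simplified sketch of the x86 path in
drm_clflush_sg(), from my reading of drivers/gpu/drm/drm_cache.c
(illustrative, not the verbatim implementation):

void drm_clflush_sg(struct sg_table *st)
{
#if defined(CONFIG_X86)
	if (static_cpu_has(X86_FEATURE_CLFLUSH)) {
		struct sg_page_iter sg_iter;

		mb(); /* clflush is only ordered via memory barriers */
		for_each_sgtable_page(st, &sg_iter, 0)
			drm_clflush_page(sg_page_iter_page(&sg_iter));
		mb(); /* make sure every cache line is flushed */
		return;
	}

	/* no CLFLUSH available: fall back to the global heavy hammer */
	wbinvd_on_all_cpus();
#endif
}

So the range-aware path is taken whenever the CPU advertises the
feature, and wbinvd only remains as the last resort.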
>
> Right, I was trying to get information on what drm_clflush_sg will do
> on Arm. Is it range-based or global there, or does the latter even
> exist?
>
I am not too sure about the ARM side. We are currently working that out 
with the ARM folks in a different thread.
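
For what it's worth, my current reading of drm_cache.c is that the
non-x86 side of drm_clflush_sg() is only a stub today, neither
range-based nor global, which is what that thread needs to resolve
(again a sketch based on my reading, not verbatim):

void drm_clflush_sg(struct sg_table *st)
{
#if defined(CONFIG_X86)
	/* x86 branch as sketched earlier: clflush loop or wbinvd fallback */
#else
	/* no Arm (or other non-x86) implementation yet, only a warning */
	WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
#endif
}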
> Regards,
>
> Tvrtko

