[Intel-gfx] [PATCH v2] drm/i915/dpt: Use shmem for dpt objects

Fri Jul 21 08:17:13 UTC 2023

On 20/07/2023 18:02, Sripada, Radhakrishna wrote:
> Hi Tvrtko,
> 
>> -----Original Message-----
>> From: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>> Sent: Thursday, July 20, 2023 2:17 AM
>> To: Yang, Fei <fei.yang at intel.com>; Sripada, Radhakrishna
>> <radhakrishna.sripada at intel.com>; intel-gfx at lists.freedesktop.org
>> Cc: stable at vger.kernel.org; Ville Syrjälä <ville.syrjala at linux.intel.com>; Wilson,
>> Chris P <chris.p.wilson at intel.com>
>> Subject: Re: [PATCH v2] drm/i915/dpt: Use shmem for dpt objects
>>
>>
>> On 19/07/2023 21:53, Yang, Fei wrote:
>>>> On 18/07/2023 23:51, Radhakrishna Sripada wrote:
>>>>> Dpt objects that are created from internal get evicted when there is
>>>>> memory pressure and do not get restored when pinned during scanout.
>>>>> The pinned page table entries look corrupted and programming the
>>>>> display engine with the incorrect pte's result in DE throwing pipe faults.
>>>>>
>>>>> Create DPT objects from shmem and mark the object as dirty when
>>>>> pinning so that the object is restored when shrinker evicts an unpinned
>> buffer object.
>>>>>
>>>>> v2: Unconditionally mark the dpt objects dirty during pinning(Chris).
>>>>>
>>>>> Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation
>>>>> for dpt")
>>>>> Cc: <stable at vger.kernel.org> # v6.0+
>>>>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>>>>> Suggested-by: Chris Wilson <chris.p.wilson at intel.com>
>>>>> Signed-off-by: Fei Yang <fei.yang at intel.com>
>>>>> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada at intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/display/intel_dpt.c | 4 +++-
>>>>>     1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c
>>>>> b/drivers/gpu/drm/i915/display/intel_dpt.c
>>>>> index 7c5fddb203ba..fbfd8f959f17 100644
>>>>> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
>>>>> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
>>>>> @@ -166,6 +166,8 @@ struct i915_vma *intel_dpt_pin(struct
>> i915_address_space *vm)
>>>>>                i915_vma_get(vma);
>>>>>        }
>>>>>
>>>>> +    dpt->obj->mm.dirty = true;
>>>>> +
>>>>>        atomic_dec(&i915->gpu_error.pending_fb_pin);
>>>>>        intel_runtime_pm_put(&i915->runtime_pm, wakeref);
>>>>>
>>>>> @@ -261,7 +263,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
>>>>>                dpt_obj = i915_gem_object_create_stolen(i915, size);
>>>>>        if (IS_ERR(dpt_obj) && !HAS_LMEM(i915)) {
>>>>>                drm_dbg_kms(&i915->drm, "Allocating dpt from smem\n");
>>>>> -            dpt_obj = i915_gem_object_create_internal(i915, size);
>>>>> +            dpt_obj = i915_gem_object_create_shmem(i915, size);
>>>>>        }
>>>>>        if (IS_ERR(dpt_obj))
>>>>>                return ERR_CAST(dpt_obj);
>>>>
>>>> Okay I think I get it after some more looking at the DPT code paths.
>>>> Problem seems pretty clear - page tables are stored in dpt_obj and so
>>>> are lost when backing store is discarded.
>>>>
>>>> Changing to shmem object indeed looks the easiest option.
>>>>
>>>> Some related thoughts:
>>>>
>>>> 1)
>>>> I wonder if intel_dpt_suspend/resume remain needed after this patch.
>>>> Could you investigate please? On a glance their job was to restore the
>>>> PTEs which would be lost from internal objects backing storage. With
>>>> shmem objects that content should be preserved.
>>>
>>> intel_dpt_suspend is "suspending" the whole VM where, not only the dpt
>>> objects are mapped into, but also the framebuffer objects. I don't have
>>> much knowledge on how the framebuffer objects are managed, but the
>> suspend
>>> resume path still look necessary to me, unless the content of these
>>> framebuffer objects are also preserved.
>>
>> I don't think it has anything to do with fb content, but you are correct
>> it is still needed. Because 9755f055f512 ("drm/i915: Restore memory
>> mapping for DPT FBs across system suspend/resume") reminds me backing
>> store for DPT PTEs can be either lmem, stolen or internal (now shmem).
>> Even though with this patch internal is out of the picture, stolen
>> remains and so the issue of losing the page table content remains.
>> Perhaps resume could be optimised to only restore PTEs when VM page
>> tables are backed by stolen which may win some suspend/resume speed on
>> some platforms.
> 
> I will have to look into how suspend resume will change with the current flow
> as you said it can be looked in a later patch.

Thanks!

>>>> 2)
>>>> I wonder if i915_vma_flush_writes should be used (as a companion of
>>>> i915_vma_pin_iomap) from DPT dpt_bind_vma, dpt_insert_entries, etc. But
>>>> then I am also not sure if it does the right thing for the
>>>> i915_gem_object_pin_map path of i915_vma_pin_iomap. Perhaps it should
>>>> call __i915_gem_object_flush_map itself for that mapping flavour and
>>>> not do the ggtt flushing in that case.
> I am not sure if dpt_bind_vma will be called each time during pinning. IMO it gets called
> Only when the fb object needs to be bind after and unbind(triggered during obj destroy).
> Do you think if i915_vma_flush_writes should not be used if dpt objects are created from internal?

No, I *think* correct API usage is supposed to be: If one uses 
i915_vma_pin_iomap() for CPU access, then one should call 
i915_vma_flush_writes() after CPU writes are done - presumably as a 
barrier to make sure writes are visible before proceeding.

If that is correct, the I noticed problem (or I am missing something), 
that i915_vma_flush_writes only does something for the one of the three 
ways i915_vma_pin_iomap() can set up the CPU visible mapping (the 
ggtt->iomap path).

The i915_gem_object_lmem_io_map() path, relevant on dgfx, has no 
flushing. Maybe it does need it due WC, or maybe flushing the 
write-combine buffers would still be needed.

And the i915_gem_object_pin_map() path is also WC and in theory there is 
the corresponding __i915_gem_object_flush_map().

Don't know, maybe I am indeed missing something.

But for instance if __i915_gem_object_flush_map() *was* called from 
i915_vma_flush_writes(), and the latter *was* called after dpt_bind_vma 
does its thing, then notice how obj->mm.dirty *would* be set by that 
helper. Removing the need for this patch.

So perhaps i915_vma_flush_writes() should just dirty the object, *if* 
the idea is to be called after CPU writes. And perhaps it should call 
i915_gem_object_flush_map *if* the method of mmaping was other than 
ggtt. But the information would have to be recorded first, probably same 
as the i915_gem_object_pin_map() path records it in the bit 0 of the 
vma->iomap pointer.

> Or should we have a different flavor of i915_vm_pin_iomap that skips i915_vma_set_ggtt_write
> so that we can drop i915_vma_flush_writes during unpinning and move i915_vma_set_ggtt_write
> to dpt_insert_entires and do i915_vma_flush during clear range? Then I guess __i915_gem_object_flush_map
> called during vma bind and not object pinning. In either case I believe it is a larger cleanup
> which requires more extensive validation and analysis.

Yes definitely wasn't suggesting to do it in this patch.

>>>> In summary I think the fix is safe and correct but at least point 1) I
>>>> think needs looking into. It can be a follow up work too.
> 
> If you think this fix can work then I will look into the suspend/resume as a follow up and will appreciate
> an r-b for this change.  I believe 2) is a larger cleanup that may not be immediately required. I will have
> to dig more into the ramifications of the changes proposed above.
> 
> Thoughts ?

Yeah it is fine. I assumed someone else would r-b but I can too.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko