[Intel-gfx] [PATCH v2] drm/i915/dpt: Use shmem for dpt objects

Fri Jul 21 23:54:48 UTC 2023

Hi Tvrtko,

> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> Sent: Friday, July 21, 2023 1:17 AM
> To: Sripada, Radhakrishna <radhakrishna.sripada at intel.com>; Yang, Fei
> <fei.yang at intel.com>; intel-gfx at lists.freedesktop.org
> Cc: stable at vger.kernel.org; Ville Syrjälä <ville.syrjala at linux.intel.com>; Wilson,
> Chris P <chris.p.wilson at intel.com>
> Subject: Re: [PATCH v2] drm/i915/dpt: Use shmem for dpt objects
> 
> 
> On 20/07/2023 18:02, Sripada, Radhakrishna wrote:
> > Hi Tvrtko,
> >
> >> -----Original Message-----
> >> From: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> >> Sent: Thursday, July 20, 2023 2:17 AM
> >> To: Yang, Fei <fei.yang at intel.com>; Sripada, Radhakrishna
> >> <radhakrishna.sripada at intel.com>; intel-gfx at lists.freedesktop.org
> >> Cc: stable at vger.kernel.org; Ville Syrjälä <ville.syrjala at linux.intel.com>;
> Wilson,
> >> Chris P <chris.p.wilson at intel.com>
> >> Subject: Re: [PATCH v2] drm/i915/dpt: Use shmem for dpt objects
> >>
> >>
> >> On 19/07/2023 21:53, Yang, Fei wrote:
> >>>> On 18/07/2023 23:51, Radhakrishna Sripada wrote:
> >>>>> Dpt objects that are created from internal get evicted when there is
> >>>>> memory pressure and do not get restored when pinned during scanout.
> >>>>> The pinned page table entries look corrupted and programming the
> >>>>> display engine with the incorrect pte's result in DE throwing pipe faults.
> >>>>>
> >>>>> Create DPT objects from shmem and mark the object as dirty when
> >>>>> pinning so that the object is restored when shrinker evicts an unpinned
> >> buffer object.
> >>>>>
> >>>>> v2: Unconditionally mark the dpt objects dirty during pinning(Chris).
> >>>>>
> >>>>> Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation
> >>>>> for dpt")
> >>>>> Cc: <stable at vger.kernel.org> # v6.0+
> >>>>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> >>>>> Suggested-by: Chris Wilson <chris.p.wilson at intel.com>
> >>>>> Signed-off-by: Fei Yang <fei.yang at intel.com>
> >>>>> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada at intel.com>
> >>>>> ---
> >>>>>     drivers/gpu/drm/i915/display/intel_dpt.c | 4 +++-
> >>>>>     1 file changed, 3 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c
> >>>>> b/drivers/gpu/drm/i915/display/intel_dpt.c
> >>>>> index 7c5fddb203ba..fbfd8f959f17 100644
> >>>>> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> >>>>> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> >>>>> @@ -166,6 +166,8 @@ struct i915_vma *intel_dpt_pin(struct
> >> i915_address_space *vm)
> >>>>>                i915_vma_get(vma);
> >>>>>        }
> >>>>>
> >>>>> +    dpt->obj->mm.dirty = true;
> >>>>> +
> >>>>>        atomic_dec(&i915->gpu_error.pending_fb_pin);
> >>>>>        intel_runtime_pm_put(&i915->runtime_pm, wakeref);
> >>>>>
> >>>>> @@ -261,7 +263,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
> >>>>>                dpt_obj = i915_gem_object_create_stolen(i915, size);
> >>>>>        if (IS_ERR(dpt_obj) && !HAS_LMEM(i915)) {
> >>>>>                drm_dbg_kms(&i915->drm, "Allocating dpt from smem\n");
> >>>>> -            dpt_obj = i915_gem_object_create_internal(i915, size);
> >>>>> +            dpt_obj = i915_gem_object_create_shmem(i915, size);
> >>>>>        }
> >>>>>        if (IS_ERR(dpt_obj))
> >>>>>                return ERR_CAST(dpt_obj);
> >>>>
> >>>> Okay I think I get it after some more looking at the DPT code paths.
> >>>> Problem seems pretty clear - page tables are stored in dpt_obj and so
> >>>> are lost when backing store is discarded.
> >>>>
> >>>> Changing to shmem object indeed looks the easiest option.
> >>>>
> >>>> Some related thoughts:
> >>>>
> >>>> 1)
> >>>> I wonder if intel_dpt_suspend/resume remain needed after this patch.
> >>>> Could you investigate please? On a glance their job was to restore the
> >>>> PTEs which would be lost from internal objects backing storage. With
> >>>> shmem objects that content should be preserved.
> >>>
> >>> intel_dpt_suspend is "suspending" the whole VM where, not only the dpt
> >>> objects are mapped into, but also the framebuffer objects. I don't have
> >>> much knowledge on how the framebuffer objects are managed, but the
> >> suspend
> >>> resume path still look necessary to me, unless the content of these
> >>> framebuffer objects are also preserved.
> >>
> >> I don't think it has anything to do with fb content, but you are correct
> >> it is still needed. Because 9755f055f512 ("drm/i915: Restore memory
> >> mapping for DPT FBs across system suspend/resume") reminds me backing
> >> store for DPT PTEs can be either lmem, stolen or internal (now shmem).
> >> Even though with this patch internal is out of the picture, stolen
> >> remains and so the issue of losing the page table content remains.
> >> Perhaps resume could be optimised to only restore PTEs when VM page
> >> tables are backed by stolen which may win some suspend/resume speed on
> >> some platforms.
> >
> > I will have to look into how suspend resume will change with the current flow
> > as you said it can be looked in a later patch.
> 
> Thanks!
> 
> >>>> 2)
> >>>> I wonder if i915_vma_flush_writes should be used (as a companion of
> >>>> i915_vma_pin_iomap) from DPT dpt_bind_vma, dpt_insert_entries, etc.
> But
> >>>> then I am also not sure if it does the right thing for the
> >>>> i915_gem_object_pin_map path of i915_vma_pin_iomap. Perhaps it
> should
> >>>> call __i915_gem_object_flush_map itself for that mapping flavour and
> >>>> not do the ggtt flushing in that case.
> > I am not sure if dpt_bind_vma will be called each time during pinning. IMO it
> gets called
> > Only when the fb object needs to be bind after and unbind(triggered during obj
> destroy).
> > Do you think if i915_vma_flush_writes should not be used if dpt objects are
> created from internal?
> 
> No, I *think* correct API usage is supposed to be: If one uses
> i915_vma_pin_iomap() for CPU access, then one should call
> i915_vma_flush_writes() after CPU writes are done - presumably as a
> barrier to make sure writes are visible before proceeding.
> 
> If that is correct, the I noticed problem (or I am missing something),
> that i915_vma_flush_writes only does something for the one of the three
> ways i915_vma_pin_iomap() can set up the CPU visible mapping (the
> ggtt->iomap path).
> 
> The i915_gem_object_lmem_io_map() path, relevant on dgfx, has no
> flushing. Maybe it does need it due WC, or maybe flushing the
> write-combine buffers would still be needed.
> 
> And the i915_gem_object_pin_map() path is also WC and in theory there is
> the corresponding __i915_gem_object_flush_map().
> 
> Don't know, maybe I am indeed missing something.
> 
> But for instance if __i915_gem_object_flush_map() *was* called from
> i915_vma_flush_writes(), and the latter *was* called after dpt_bind_vma
> does its thing, then notice how obj->mm.dirty *would* be set by that
> helper. Removing the need for this patch.
> 
> So perhaps i915_vma_flush_writes() should just dirty the object, *if*
> the idea is to be called after CPU writes. And perhaps it should call
> i915_gem_object_flush_map *if* the method of mmaping was other than
> ggtt. But the information would have to be recorded first, probably same
> as the i915_gem_object_pin_map() path records it in the bit 0 of the
> vma->iomap pointer.

So a question arises if marking the object as dirty/ doing i915_gem_object_flush_map
Needs to happen after vma bind or after pinning the dpt?

> 
> > Or should we have a different flavor of i915_vm_pin_iomap that skips
> i915_vma_set_ggtt_write
> > so that we can drop i915_vma_flush_writes during unpinning and move
> i915_vma_set_ggtt_write
> > to dpt_insert_entires and do i915_vma_flush during clear range? Then I guess
> __i915_gem_object_flush_map
> > called during vma bind and not object pinning. In either case I believe it is a
> larger cleanup
> > which requires more extensive validation and analysis.
> 
> Yes definitely wasn't suggesting to do it in this patch.
> 
> >>>> In summary I think the fix is safe and correct but at least point 1) I
> >>>> think needs looking into. It can be a follow up work too.
> >
> > If you think this fix can work then I will look into the suspend/resume as a
> follow up and will appreciate
> > an r-b for this change.  I believe 2) is a larger cleanup that may not be
> immediately required. I will have
> > to dig more into the ramifications of the changes proposed above.
> >
> > Thoughts ?
> 
> Yeah it is fine. I assumed someone else would r-b but I can too.
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Thank you for the review, merged the patch.

--Radhakrishna Sripada
> 
> Regards,
> 
> Tvrtko