[Intel-xe] [PATCH] drm/xe: Use Xe BO pin / unpin functions for FB pin

Christian König christian.koenig at amd.com
Thu Apr 6 09:32:46 UTC 2023


On 05.04.23 at 19:04, Matthew Auld wrote:
> On Wed, 5 Apr 2023 at 15:28, Matthew Brost <matthew.brost at intel.com> wrote:
>> On Wed, Apr 05, 2023 at 02:14:22PM +0000, Matthew Brost wrote:
>>> On Wed, Apr 05, 2023 at 02:19:35PM +0100, Matthew Auld wrote:
>>>> On Wed, 5 Apr 2023 at 05:19, Matthew Brost <matthew.brost at intel.com> wrote:
>>>>> We need to use the Xe BO pin / unpin functions for suspend /
>>>>> resume to work properly. Part of this fix is to allow xe_bo_pin()
>>>>> to be called more than once.
>>>>>
>>>>> Should fix:
>>>>> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/244
>>>>>
>>>>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>>> Change makes sense to me, but I think we also need to update xe_bo_unpin()?
>>>>
>>> Yep, I thought I had coded that correctly, but I was looking at xe_bo_unpin_external().
>>>
>>>> But I must be misunderstanding something here. AFAICT we can have
>>>> various pinned VRAM objects, and xe_bo_pin() still adds them to
>>>> pinned.kernel_bo_present, where during suspend we evict everything on
>>>> that list to system memory. But as these are all still pinned, suspend
>>>> is still going to fail with this patch, given the recent TTM changes.
>>> Which recent TTM changes are you referring to? This all worked at one point in time.
>>>
>> Oh, I see:
>> git format-patch -1 f87c1f0b7b79b
>>
>> I would've NAK'd that one, as it broke us. What do you think is the
>> best way to move forward here? I like the idea of using TTM to move
>> the objects...
>>
>> I know we could just hack this by setting pin_count = 0 before
>> calling validate and restoring it after, but that would be a horrid
>> hack.
>>
>> Revert the above patch + post upstream?
> Christian, can you share your thoughts here?
>
> This is regarding the TTM change: f87c1f0b7b79 ("drm/ttm: prevent
> moving of pinned BOs"). To quickly summarize, it looks like Xe was
> relying on being able to move pinned BOs for suspend-resume, where the
> driver has some special kernel-internal VRAM buffers that are
> perma-pinned, but need to be moved to system memory during suspend and
> then copied back when resuming. This was all handled by just doing
> something like ttm_bo_validate(PL_SYSTEM), without needing any special
> handling. But this of course no longer works.

Yeah, that's exactly what this check should prevent :)

> So wondering if it makes any sense to move the decision back into the
> driver, for whether to treat moving pinned objects as an error or not,
> given the above use case with suspend-resume in Xe?

Well, pinning a BO means that it's fixed and can't move any more.

The background is that the physical location (domain, address, 
whatever) of this BO is used somewhere in the hardware, and you need 
to notify the hardware if that ever changes. Typical applications are 
scanout, firmware buffers, etc.

Now if you still want to evacuate those buffers during suspend/resume, 
you basically have three options:

1. Manually make sure inside your driver that the location doesn't 
change during evacuation/moving it back in again.
     You can easily allocate additional backing store for your BOs and 
then copy the content back and forth manually, similar to what 
ttm_bo_validate() would do.
     This wouldn't trigger the warning above and should work as well; 
see ttm_bo_mem_space() and ttm_bo_move_null().

2. Unpin them on suspend, re-pin them on resume.
     This is what we do for scanout, for example. Re-pinning then 
usually means that you also re-program the hw with the new location 
information.

3. Stop caring about evacuation during suspend.
     We actually started to do this for a bunch of the read-only 
firmware buffers and just re-load the firmware after resume.
     This has some pitfalls as well (for example, if the fw changes), 
but generally seems to work quite fine for a lot of use cases.

If you want to keep moving the BOs, approach #1 should work for you, 
but approaches #2 and #3 are probably cleaner from a high-level 
perspective.

Regards,
Christian.

>
>> BTW, this patch is needed regardless, together with the fix in xe_bo_unpin().
>>
>> Matt
>>
>>> Matt
>>>
>>>> What am I missing?
>>>>
>>>>> ---
>>>>>   drivers/gpu/drm/xe/display/xe_fb_pin.c | 4 ++--
>>>>>   drivers/gpu/drm/xe/xe_bo.c             | 5 +++--
>>>>>   2 files changed, 5 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>>> index 65c0bc28a3d1..c7c849be288f 100644
>>>>> --- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>>> +++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>>> @@ -205,7 +205,7 @@ static struct i915_vma *__xe_pin_fb_vma(struct intel_framebuffer *fb,
>>>>>
>>>>>          ret = xe_bo_validate(bo, NULL, true);
>>>>>          if (!ret)
>>>>> -               ttm_bo_pin(&bo->ttm);
>>>>> +               xe_bo_pin(bo);
>>>>>          ttm_bo_unreserve(&bo->ttm);
>>>>>          if (ret)
>>>>>                  goto err;
>>>>> @@ -222,7 +222,7 @@ static struct i915_vma *__xe_pin_fb_vma(struct intel_framebuffer *fb,
>>>>>
>>>>>   err_unpin:
>>>>>          ttm_bo_reserve(&bo->ttm, false, false, NULL);
>>>>> -       ttm_bo_unpin(&bo->ttm);
>>>>> +       xe_bo_unpin(bo);
>>>>>          ttm_bo_unreserve(&bo->ttm);
>>>>>   err:
>>>>>          kfree(vma);
>>>>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>>>>> index 3a482c61c3ec..25c696bd380c 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_bo.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>>>>> @@ -1278,8 +1278,8 @@ int xe_bo_pin(struct xe_bo *bo)
>>>>>           */
>>>>>          XE_BUG_ON(bo->ttm.base.import_attach);
>>>>>
>>>>> -       /* We only expect at most 1 pin */
>>>>> -       XE_BUG_ON(xe_bo_is_pinned(bo));
>>>>> +       if (xe_bo_is_pinned(bo))
>>>>> +               goto pin;
>>>>>
>>>>>          err = xe_bo_validate(bo, NULL, false);
>>>>>          if (err)
>>>>> @@ -1308,6 +1308,7 @@ int xe_bo_pin(struct xe_bo *bo)
>>>>>                  }
>>>>>          }
>>>>>
>>>>> +pin:
>>>>>          ttm_bo_pin(&bo->ttm);
>>>>>
>>>>>          /*
>>>>> --
>>>>> 2.34.1
>>>>>


