[PATCH 4/7] drm/ttm: move LRU walk defines into new internal header

Christian König christian.koenig at amd.com
Wed Oct 2 11:32:08 UTC 2024


Ah, yes sorry totally forgotten about that.

Give me till Friday to swap everything back into my head again.

Christian.

Am 02.10.24 um 13:30 schrieb Thomas Hellström:
> Hi, Christian,
>
> Ping? Can I get an ack to proceed with this?
>
> Thanks,
> Thomas
>
>
>
> On Wed, 2024-09-18 at 14:57 +0200, Thomas Hellström wrote:
>> Sima, Christian
>>
>> I've updated the shrinker series now with a guarded for_each macro
>> instead:
>>
>> https://patchwork.freedesktop.org/patch/614514/?series=131815&rev=9
>>
>> (Note I forgot to remove the export of the previous LRU walker).
>>
>> So the midlayer argument is no longer an issue. The cleanup.h guard
>> provides some additional protection against drivers exiting the LRU
>> loop early.
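>>
>> To illustrate the idea (the names below are hypothetical, not the
>> actual macros from the series): cleanup.h runs a destructor via
>> __cleanup() when the walk variable goes out of scope, so the walk
>> state is torn down even if a driver breaks out of the loop early:
>>
>> #include <linux/cleanup.h>
>>
>> /* Constructor takes the LRU lock and sets up a cursor, destructor
>>  * drops it again, no matter how the loop body is exited. */
>> DEFINE_CLASS(lru_walk, struct lru_walk,
>> 	     lru_walk_end(_T), lru_walk_begin(man),
>> 	     struct ttm_resource_manager *man);
>>
>> #define for_each_lru_bo_guarded(walk, man, bo)			\
>> 	for (CLASS(lru_walk, walk)(man);			\
>> 	     ((bo) = lru_walk_next(&(walk))) != NULL;)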
>>
>> So what remains is the question whether the driver is allowed to
>> decline a bo that TTM suggests for shrinking.
>>
>> Arguments for:
>>
>> 1) Not allowing that would require teaching TTM about purgeable
>> objects.
>> 2) Devices that need the blitter during shrinking would want to punt
>> runtime_pm_get() to kswapd to avoid sleeping in direct reclaim.
>> 3) If those devices end up blitting (LNL) to be able to shrink, they
>> would want to punt waiting for the fence to signal to kswapd, to
>> avoid waiting in direct reclaim.
>> 4) It looks like we need to resort to folio_trylock in the shmem
>> backup backend when shrinking is called with GFP_NOFS. A failing
>> trylock will require a new bo (see the sketch after this list).
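>>
>> Very roughly, and with made-up driver names throughout (only
>> pm_runtime_get_if_in_use() and the gfp flag test are real kernel
>> interfaces), declining could look like this:
>>
>> static long my_shrink_one_bo(struct my_device *mdev,
>> 			     struct ttm_buffer_object *bo, gfp_t gfp)
>> {
>> 	long ret = -EBUSY;
>>
>> 	/* 2) Don't wake the device from direct reclaim; bos that need
>> 	 * the blitter are left for kswapd to pick up after resume. */
>> 	if (my_bo_needs_blit(bo) &&
>> 	    pm_runtime_get_if_in_use(mdev->dev) <= 0)
>> 		return -EBUSY;	/* decline, TTM suggests another bo */
>>
>> 	/* 4) GFP_NOFS reclaim must not sleep on folio locks in the
>> 	 * shmem backup backend, so trylock and decline on failure. */
>> 	if ((gfp & __GFP_FS) || my_backup_trylock_folios(bo))
>> 		ret = my_do_shrink(bo, gfp);
>>
>> 	if (my_bo_needs_blit(bo))
>> 		pm_runtime_put(mdev->dev);
>>
>> 	return ret;
>> }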
>>
>> Arguments against:
>> None really. I thought the idea of demidlayering would be to allow
>> the driver more freedom.
>>
>> So any feedback appreciated. If that is found acceptable, we can
>> proceed with reviewing this patch and also with the shrinker series.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Mon, 2024-09-02 at 13:07 +0200, Daniel Vetter wrote:
>>> On Wed, Aug 28, 2024 at 02:20:34PM +0200, Christian König wrote:
>>>> Am 27.08.24 um 19:53 schrieb Daniel Vetter:
>>>>> On Tue, Aug 27, 2024 at 06:52:13PM +0200, Daniel Vetter wrote:
>>>>>> On Thu, Aug 22, 2024 at 03:19:29PM +0200, Christian König
>>>>>> wrote:
>>>>>>> Completely agree that this is complicated, but I still don't
>>>>>>> see the need for it.
>>>>>>>
>>>>>>> Drivers just need to use pm_runtime_get_if_in_use() inside the
>>>>>>> shrinker and postpone all hw activity until resume.
>>>>>> Not good enough, at least long term I think. Also postponing hw
>>>>>> activity to resume doesn't solve the deadlock issue if you still
>>>>>> need to grab ttm locks on resume.
>>>>> Pondered this specific aspect some more, and I think you still
>>>>> have a race here (even if you avoid the deadlock): if the
>>>>> conditional rpm_get call fails, there's no guarantee that the
>>>>> device will suspend/resume and clean up the GART mapping.
>>>> Well, I think we have a major disconnect here. When the device is
>>>> powered down there is no GART mapping to clean up any more.
>>>>
>>>> In other words, the GART is a table in local memory (VRAM); when
>>>> the device is powered down, this table is completely destroyed. Any
>>>> BO which was mapped inside this table is no longer mapped.
>>>>
>>>> So when the shrinker wants to evict a BO which is marked as mapped
>>>> to GART and the device is powered down, we just skip the GART
>>>> unmapping part, because that has already implicitly happened during
>>>> power down.
>>>>
>>>> Before mapping any BO into the GART again, we power the GPU up
>>>> through the runtime PM calls, and while powering it up again the
>>>> GART is restored.
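>>>>
>>>> As a rough sketch (all names invented except the runtime PM calls,
>>>> and not how amdgpu actually spells this), the two sides would look
>>>> something like:
>>>>
>>>> static void my_gart_unmap(struct my_device *mdev, struct ttm_tt *tt)
>>>> {
>>>> 	/* Device off: the GART table in VRAM is already gone, so
>>>> 	 * the mapping was implicitly torn down at power down. */
>>>> 	if (pm_runtime_get_if_active(mdev->dev) <= 0)
>>>> 		return;
>>>>
>>>> 	my_gart_clear_entries(mdev, tt);
>>>> 	pm_runtime_put(mdev->dev);
>>>> }
>>>>
>>>> static int my_gart_map(struct my_device *mdev, struct ttm_tt *tt)
>>>> {
>>>> 	/* Resuming powers the device up and rebuilds the GART table
>>>> 	 * before we write the new entries. */
>>>> 	int ret = pm_runtime_resume_and_get(mdev->dev);
>>>>
>>>> 	if (ret < 0)
>>>> 		return ret;
>>>>
>>>> 	my_gart_write_entries(mdev, tt);
>>>> 	pm_runtime_put(mdev->dev);
>>>> 	return 0;
>>>> }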
>>> My point is that you can't tell whether the device will power down
>>> or not, you can only tell whether there's a chance it might be
>>> powering down, and so you can't get at the rpm reference without
>>> deadlock issues.
>>>
>>>>> The race gets a bit smaller if you use
>>>>> pm_runtime_get_if_active(), but even then you might catch it right
>>>>> when resume has almost finished.
>>>> What race are you talking about?
>>>>
>>>> The worst thing which could happen is that we restore a GART entry
>>>> which isn't needed any more, but that is pretty much irrelevant
>>>> since we only clear them to avoid some hw bugs.
>>> The race I'm seeing is where you thought the GART entry was not an
>>> issue and tossed an object, but the device didn't suspend, so it
>>> might still use it.
>>>
>>> I guess if we're clearly separating the sw allocation of the TTM_TT
>>> from the physical entries in the GART, that should all work, but it
>>> feels a bit tricky. The race I've seen is essentially these two
>>> getting out of sync.
>>>
>>> So maybe it was me who's stuck.
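>>>
>>> (To make the sw/hw split concrete, with invented names: the sw-side
>>> state would survive a power cycle while the hw entries come and go:
>>>
>>> struct my_tt {
>>> 	struct ttm_tt ttm;
>>> 	bool gart_bound;	/* sw state, survives power down */
>>> 	u64 gart_offset;	/* where resume rewrites the entries */
>>> };
>>>
>>> with resume rewriting hw entries for every my_tt that has gart_bound
>>> set, and the shrinker clearing gart_bound.)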
>>>
>>> What I wonder is whether it works in practice, since on the restore
>>> side you need to take some locks to figure out which GART mappings
>>> exist and need restoring. And those are the same locks the shrinker
>>> needs to figure out whether it might need to reap a GART mapping.
>>>
>>> Or do you just copy the GART entries over and restore them exactly
>>> as-is, so that there are no shared locks?
>>>
>>>>> That means we'll have ttm bos hanging around with GART
>>>>> allocations/mappings which aren't actually valid anymore (since
>>>>> they might escape the cleanup upon resume due to the race). That
>>>>> doesn't feel like a solid design either.
>>>> I'm most likely missing something, but I'm really scratching my
>>>> head over where you see a problem here.
>>> I guess one issue is that, at least traditionally, igfx drivers
>>> have nested runtime pm within the dma_resv lock, and dgpu drivers
>>> the other way round, which is a bit awkward if you're trying for
>>> common code.
>>>
>>> Cheers, Sima


