[PATCH v6 2/8] drm/ttm: Add ttm_bo_access

Wed Nov 27 13:21:23 UTC 2024

On 26.11.24 at 18:49, Matthew Brost wrote:
> On Tue, Nov 26, 2024 at 09:19:47AM +0100, Christian König wrote:
>> On 25.11.24 at 18:27, Matthew Brost wrote:
>>> On Mon, Nov 25, 2024 at 05:19:54PM +0100, Christian König wrote:
>>>> On 25.11.24 at 16:29, Matthew Brost wrote:
>>>>> On Fri, Nov 15, 2024 at 10:27:59AM -0800, Matthew Brost wrote:
>>>>>> [SNIP]
>>>>>> We use this interface to read a BO marked with a dumpable flag during a
>>>>>> GPU hang in our error capture code. This is an internal KMD feature, not
>>>>>> directly exposed to user space. Would adding this helper be acceptable
>>>>>> for this use case? I can add kernel-doc indicating the current restrictions
>>>>>> of the helper (do not directly expose to user space) too if that would
>>>>>> help.
>>>>>>
>>>>> Christian - ping on above.
>>>> Sorry, I will try to give those mailing list tasks a bit more time before
>>>> the xmas holidays.
>>>>
>>>> That is an acceptable use case, but the problem is that this helper won't
>>>> work for that.
>>>>
>>>> See, during a GPU hang you can't lock BOs, so how do you want to look into
>>>> their content with the peek helper?
>>>>
>>> Agreed, we cannot lock the BO directly in the GPU hang path (TDR). Our error
>>> capture code takes a snapshot of some of the GPU state, which is small and
>>> safe to capture in TDR, and kicks a worker which opportunistically
>>> captures the VM state which has been marked to be captured. This is
>>> where the helper is called and it is safe to lock the BO.
>> Yeah that sounds like it should work.
>>
>> No objections from my side for that use case, but I would rather like to
>> keep the code inside ttm_bo_vm.c.
>>
> Thanks, reposted with code inside ttm_bo_vm.c. Any objection to merging
> the entire series through drm-xe-next and then backporting the single TTM
> patch to drm-misc-next?

No need for a backport as long as nobody in drm-misc-next depends on that.

As far as I can see the change is small enough to not cause any
conflicts, so merging through drm-xe-next is fine with me.

Christian.

>
> Matt
>
>> Crash dumping is usually something associated with the VMA even if it's a
>> bit special here for the VM state.
>>
>> Regards,
>> Christian.
>>
>>> Matt
>>>
>>>> The only thing you could potentially do is to trylock the BO and then dump,
>>>> but that would most likely be a bit unreliable.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>> Matt