[RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs
Andrey Grodzovsky
andrey.grodzovsky at amd.com
Mon Jan 17 19:21:08 UTC 2022
On 2022-01-17 2:17 p.m., Christian König wrote:
> On 17.01.22 at 20:14, Andrey Grodzovsky wrote:
>>
>> Ping on the question
>>
>
> Oh, my! That was already more than a week ago and is completely
> swapped out of my head again.
>
>> Andrey
>>
>> On 2022-01-05 1:11 p.m., Andrey Grodzovsky wrote:
>>>>> Also, what about having the reset_active or in_reset flag in the
>>>>> reset_domain itself?
>>>>
>>>> Offhand that sounds like a good idea.
>>>
>>>
>>> What then about the adev->reset_sem semaphore? Should we also move
>>> this to reset_domain? Both moves have functional implications only
>>> for the XGMI case, because there will be contention from multiple
>>> devices over accessing those single-instance variables, while now
>>> each device has its own copy.
>
> Since this is a rw semaphore that should be unproblematic I think. It
> could just be that the cache line of the lock then plays ping/pong
> between the CPU cores.
>
>>>
>>> What benefit does centralizing into reset_domain give? Is it, for
>>> example, to prevent one device in a hive from accessing another
>>> device's VRAM (shared FB memory) through MMIO while that other
>>> device goes through reset?
>
> I think that this is the killer argument for a centralized lock, yes.
No problem, I will add a patch that centralizes both the flag and the
semaphore into reset_domain and resend.
Andrey
>
> Christian.
>
>>>
>>> Andrey
>