[PATCH] drm/amdgpu: annotate a false positive recursive locking
Christian König
christian.koenig at amd.com
Fri Aug 7 09:34:13 UTC 2020
On 07.08.20 at 11:23, Daniel Vetter wrote:
> On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel at ffwll.ch> wrote:
>> On Fri, Aug 7, 2020 at 11:08 AM Christian König
>> <christian.koenig at amd.com> wrote:
>>> [SNIP]
>>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
>>>>>>> [Dennis Li] If we use a single lock, users must wait for all devices to resume successfully, even though their tasks are only running on device A. That is not friendly to users.
>>>>>> Well that is intentional I would say. We can only restart submissions after all devices have resumed successfully, because otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
>>>>>> [Dennis Li] Yes, you are right. The driver has to make sure that the shared resources (such as the shared memory) are ready before unlocking the lock of each adev one by one. But each device still has private hardware resources such as video and display.
>>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
>>>>>
>>>>> I still think that a single rwsem for the whole hive is still the best option here.
>>>>>
>>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we decide to use a single lock after that discussion, we can create another patch for it.
>>>> That's a really good argument, but I still hesitate to merge this patch.
>>>> How severe is the lockdep splat?
>>>>
>>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
>>>> [Dennis Li] It is not a workaround. According to the design of lockdep, each rw_semaphore should have a separate lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
>>> Yeah, but this explanation is not really sufficient. Again, this is not
>>> a limitation of init_rwsem, but intentional behavior.
>>>
>>> In other words in most cases lockdep should splat if you use rw
>>> semaphores like this.
>>>
>>> That we avoid this by always locking multiple amdgpu devices in the
>>> same order is rather questionable driver design.
>> Yeah don't give multiple locking classes like that, that's fairly
>> terrible.
Ok, that is exactly the answer I feared we would get.
>> The right fix is mutex_lock_nest_lock, which we're using in
>> ww_mutex and which is also used in mm_take_all_locks.
Ah! Enlightenment! So in this special case we should use
down_write_nested() and all is well?
>>
>> If you solve the locking ordering by sorting all the locks you need
>> (ugh), then you can also just use a fake lockdep_map for your special
>> function only.
>>
>> Also in general the idea of an rwsem in conjunction with xgmi cross
>> device memory handling just plain freaks me out :-) Please make sure
>> you prime this lock against dma_resv and dma_fence (and maybe also
>> drm_modeset_lock if this lock is outside of dma_resv), and ideally
>> this should only ever be used for setup stuff when loading drivers. I
>> guess that's why you went with a per-device rwsem. If it's really only
>> for setup then just do the simple thing of having an
>> xgmi_hive_reconfigure lock, which all the rwsems nest within, and no
>> fancy lock ordering or other tricks.
Well, this is for the group reset of all devices in a hive, and you need
to reboot to change the order the devices are in the list.
Regards,
Christian.
>>
>>>> The core network driver (net/core/dev.c) has a similar use case to ours.
>>> Just took a look at that, but this is a completely different use case. The
>>> lockdep classes are grouped by driver type to note the difference in the
>>> network stack.
>>>
>>> A quick search showed that no other device driver is doing this in the
>>> kernel, so I'm not sure if such a behavior wouldn't be rejected. Adding
>>> Daniel to comment on this as well.
>>>
>>> Felix had some really good arguments to make that an urgent fix, so
>>> either we come up with some real rework of this or at least add a big
>>> "/* TODO....*/".
>> Aside from "I have questions" I don't think there's any reason not to
>> fix this correctly with mutex_lock_nest_lock. It really shouldn't be a
>> semantic change from what I understand here.
> Also one more with my drm maintainer hat on, as a heads-up: Dave and I
> have looked at the i915 gem code a bit too much, and we're appalled at
> the massive over-use of lockdep class hacks and dubious nesting
> annotations. Expect a crackdown on anyone who tries to play clever
> tricks here; we have a few too many already :-)
>
> So no mutex_lock_nested (totally different beast from
> mutex_lock_nest_lock, and really unsafe since it's a runtime change of
> the lockdep class) and also not any lockdep_set_class trickery or
> anything like that. We need locking hierarchies which mere humans can
> comprehend and review.
>
> Cheers, Daniel
>
>> Cheers, Daniel
>>
>>> Regards,
>>> Christian.
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch
>
>
More information about the amd-gfx mailing list