[PATCH 2/2] drm/scheduler: Remove obsolete spinlock.
Christian König
ckoenig.leichtzumerken at gmail.com
Wed May 16 12:32:58 UTC 2018
On 16.05.2018 14:28, Lucas Stach wrote:
> On Wednesday, 16.05.2018, 14:08 +0200, Christian König wrote:
>> Yes, exactly.
>>
>> For normal user space command submission we should have tons of locks
>> guaranteeing that (e.g. just the VM lock should do).
>>
>> For kernel moves we have the mutex for the GTT windows which protects it.
>>
>> There could be problems with the UVD/VCE queues cleaning up the handles
>> when an application crashes.
> FWIW, etnaviv is currently completely unlocked in this path, but I
> believe that this isn't an issue as the sched fence seq numbers are per
> entity. So to actually end up with reversed seqnos one context has to
> preempt itself to do another submit, while the current one hasn't
> returned from kernel space, which I believe is a fairly theoretical
> issue. Is my understanding correct?
Yes. The problem is that with the right timing this can be used to access
freed up memory. If you then manage to place a page table in that freed up
memory, taking over the system is just a typing exercise.
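
To spell out the memory management angle: completion tracking boils down
to seqno comparisons on the scheduler fences (in the spirit of
dma_fence_is_later()), i.e. "everything up to the highest signalled seqno
is finished". A minimal stand-alone illustration of how a reversed pair
breaks that assumption; the types here are made up and this is not kernel
code:

#include <stdbool.h>
#include <stdio.h>

/* Stand-in for per-entity completion tracking: the highest sequence
 * number that has signalled so far. */
struct entity_state {
        unsigned int last_completed;
};

/* "seq <= last_completed" is taken to mean the job is finished and its
 * memory may be freed or reused. */
static bool job_finished(const struct entity_state *e, unsigned int seq)
{
        return seq <= e->last_completed;
}

int main(void)
{
        struct entity_state e = { 0 };

        /* The two jobs were pushed in reversed order, so the hardware ran
         * and signalled seqno 2 while seqno 1 is still executing. */
        e.last_completed = 2;

        printf("job with seqno 1 finished? %s\n",
               job_finished(&e, 1) ? "yes (wrong, it is still running)"
                                   : "no");
        return 0;
}

Job 1 is still on the hardware, but by the seqno logic it already looks
finished, so its backing memory can be freed and reused while it still
executes - which is exactly the window described above.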
Regards,
Christian.
>
> Regards,
> Lucas
>
>> On 16.05.2018 13:47, Andrey Grodzovsky wrote:
>>> So are you saying that you expect this to already be in the code for
>>> any usage of drm_sched_fence_create() and drm_sched_entity_push_job()?
>>>
>>> lock()
>>>
>>> drm_sched_fence_create()
>>>
>>> ... (some code)
>>>
>>> drm_sched_entity_push_job()
>>>
>>> unlock()
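
For illustration, that pattern spelled out as a driver-side sketch. Note
that exa_ctx, its submit_lock and the exact drm_sched_job_init() argument
list are illustrative only, so treat this as a sketch rather than a
drop-in:

#include <linux/mutex.h>
#include <drm/gpu_scheduler.h>

/* Hypothetical driver context; only the lock matters for this sketch. */
struct exa_ctx {
        struct mutex submit_lock;
};

static int exa_submit_job(struct exa_ctx *ctx, struct drm_sched_job *job,
                          struct drm_gpu_scheduler *sched,
                          struct drm_sched_entity *entity)
{
        int ret;

        /* One lock spans both the seqno assignment and the queue push. */
        mutex_lock(&ctx->submit_lock);

        /* drm_sched_fence_create() runs in here and assigns the seqno. */
        ret = drm_sched_job_init(job, sched, entity, ctx);
        if (ret)
                goto out_unlock;

        /* ... driver specific setup of the job ... */

        /* Still under the lock, so push order matches seqno order. */
        drm_sched_entity_push_job(job, entity);

out_unlock:
        mutex_unlock(&ctx->submit_lock);
        return ret;
}

The only real requirement is that nothing else on the same entity can run
drm_sched_job_init() and push between those two calls.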
>>>
>>>
>>> Andrey
>>>
>>> On 05/16/2018 07:23 AM, Christian König wrote:
>>>> drm_sched_fence_create() assigns a sequence number to the fence it
>>>> creates.
>>>>
>>>> Now drm_sched_fence_create() is called by drm_sched_job_init() to
>>>> initialize the jobs we want to push on the scheduler queue.
>>>>
>>>> When you now call drm_sched_entity_push_job() without a protecting
>>>> lock it can happen that you push two (or even more) jobs with
>>>> reversed sequence numbers.
>>>>
>>>> Since the sequence numbers are used to determine job completion
>>>> order, reversing them can seriously mess things up.
>>>>
>>>> So the spinlock should be superfluous; if it isn't, we have a much
>>>> larger bug we need to fix.
>>>>
>>>> Christian.
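
To make the unprotected window visible outside the kernel, here is a small
self-contained userspace simulation (pthreads; fence_seq and the job queue
are plain stand-ins, this is not scheduler code). Whether an inversion is
actually observed depends on timing, but nothing in the code prevents it:

#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_uint fence_seq;           /* entity->fence_seq stand-in */
static atomic_uint queue_tail;          /* job_queue tail stand-in    */
static unsigned int job_queue[2];

static pthread_barrier_t start;

static void *submitter(void *arg)
{
        (void)arg;
        pthread_barrier_wait(&start);

        /* drm_sched_job_init(): the sequence number is assigned here... */
        unsigned int seq = atomic_fetch_add(&fence_seq, 1) + 1;

        sched_yield();  /* widen the unprotected window a little */

        /* ...drm_sched_entity_push_job(): the unlocked push happens later. */
        job_queue[atomic_fetch_add(&queue_tail, 1)] = seq;
        return NULL;
}

int main(void)
{
        for (int i = 0; i < 100000; i++) {
                pthread_t a, b;

                atomic_store(&fence_seq, 0);
                atomic_store(&queue_tail, 0);
                pthread_barrier_init(&start, NULL, 2);
                pthread_create(&a, NULL, submitter, NULL);
                pthread_create(&b, NULL, submitter, NULL);
                pthread_join(a, NULL);
                pthread_join(b, NULL);
                pthread_barrier_destroy(&start);

                if (job_queue[0] > job_queue[1]) {
                        printf("iteration %d: seqno %u pushed before seqno %u\n",
                               i, job_queue[0], job_queue[1]);
                        return 0;
                }
        }
        printf("no inversion observed in this run\n");
        return 0;
}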
>>>>
>>>> On 16.05.2018 13:15, Andrey Grodzovsky wrote:
>>>>> Can you please elaborate more, maybe give an example - I don't yet
>>>>> understand the problematic scenario.
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>> On 05/16/2018 02:50 AM, Christian König wrote:
>>>>>> No, that spinlock is indeed incorrect.
>>>>>>
>>>>>> See, even when we protect the spsc queue with a spinlock, that
>>>>>> doesn't make it correct. It can happen that the jobs pushed to the
>>>>>> queue are reversed in their sequence order, and that can cause
>>>>>> severe problems in the memory management.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>> On 16.05.2018 05:33, Grodzovsky, Andrey wrote:
>>>>>>> Yeah, that's what I am not sure about... It's lockless in the
>>>>>>> sense of single producer single consumer, but not for multiple
>>>>>>> concurrent producers... So now I think this spinlock should stay
>>>>>>> there... It just looked useless to me at first sight...
>>>>>>>
>>>>>>> Andrey
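
For what it's worth, the in-tree push is essentially an atomic exchange of
the tail pointer, so concurrent pushers don't corrupt the queue, they
merely land in an unspecified order. Below is a simplified userspace model
of that idea with my own types (memory barriers omitted, not the actual
spsc_queue code):

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
        struct node *next;
};

struct queue {
        struct node *head;                /* consumer reads from this end */
        _Atomic(struct node **) tail;     /* &last_node->next, or &head   */
};

static void queue_init(struct queue *q)
{
        q->head = NULL;
        atomic_store(&q->tail, &q->head);
}

/* Returns true when the queue was empty, mirroring the "first job wakes
 * up scheduler" check in drm_sched_entity_push_job(). */
static bool queue_push(struct queue *q, struct node *n)
{
        struct node **prev_next;

        n->next = NULL;

        /* The one step concurrent pushers contend on: each of them
         * atomically claims the current tail slot and installs its own
         * next pointer as the new tail, so the chain stays intact. */
        prev_next = atomic_exchange(&q->tail, &n->next);

        /* Publish the new node behind whatever was pushed before us. */
        *prev_next = n;

        return prev_next == &q->head;
}

int main(void)
{
        struct queue q;
        struct node a, b;

        queue_init(&q);
        /* The first push reports "queue was empty", the second does not. */
        return queue_push(&q, &a) && !queue_push(&q, &b) ? 0 : 1;
}

So the removed spinlock doesn't buy anything for the queue itself; the
real requirement is that seqno assignment and push happen in one critical
section, as described above.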
>>>>>>>
>>>>>>> ________________________________________
>>>>>>> From: Zhou, David(ChunMing)
>>>>>>> Sent: 15 May 2018 23:04:44
>>>>>>> To: Grodzovsky, Andrey; amd-gfx at lists.freedesktop.org;
>>>>>>> dri-devel at lists.freedesktop.org
>>>>>>> Cc: Koenig, Christian
>>>>>>> Subject: Re: [PATCH 2/2] drm/scheduler: Remove obsolete
>>>>>>> spinlock.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2018-05-16 03:31, Andrey Grodzovsky wrote:
>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>>>>>>> ---
>>>>>>>> drivers/gpu/drm/scheduler/gpu_scheduler.c | 4 ----
>>>>>>>> include/drm/gpu_scheduler.h | 1 -
>>>>>>>> 2 files changed, 5 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>>> index 1f1dd70..2569a63 100644
>>>>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>>> @@ -140,7 +140,6 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
>>>>>>>> entity->last_scheduled = NULL;
>>>>>>>>
>>>>>>>> spin_lock_init(&entity->rq_lock);
>>>>>>>> - spin_lock_init(&entity->queue_lock);
>>>>>>>> spsc_queue_init(&entity->job_queue);
>>>>>>>>
>>>>>>>> atomic_set(&entity->fence_seq, 0);
>>>>>>>> @@ -424,11 +423,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
>>>>>>>>
>>>>>>>> trace_drm_sched_job(sched_job, entity);
>>>>>>>>
>>>>>>>> - spin_lock(&entity->queue_lock);
>>>>>>>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>>>>>>>>
>>>>>>>> - spin_unlock(&entity->queue_lock);
>>>>>>> Is your spsc queue safe to be pushed to simultaneously?
>>>>>>>
>>>>>>> Regards,
>>>>>>> David Zhou
>>>>>>>> -
>>>>>>>> /* first job wakes up scheduler */
>>>>>>>> if (first) {
>>>>>>>> /* Add the entity to the run queue */
>>>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>>>>> index 350a62c..683eb65 100644
>>>>>>>> --- a/include/drm/gpu_scheduler.h
>>>>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>>>>> @@ -56,7 +56,6 @@ struct drm_sched_entity {
>>>>>>>> spinlock_t rq_lock;
>>>>>>>> struct drm_gpu_scheduler *sched;
>>>>>>>>
>>>>>>>> - spinlock_t queue_lock;
>>>>>>>> struct spsc_queue job_queue;
>>>>>>>>
>>>>>>>> atomic_t fence_seq;