[PATCH 03/14] drm/amdgpu: Add helper to turn an existing VM into a compute VM

Christian König christian.koenig at amd.com
Tue Mar 6 08:20:47 UTC 2018


Am 05.03.2018 um 23:33 schrieb Felix Kuehling:
> On 2018-03-05 12:50 PM, Christian König wrote:
>>>> The easiest way to work around this to to add a separate IOCTL which
>>>> waits for VM updates to complete. Should be trivial to use
>>>> vm->last_update for this.
>>> For the cost of an extra system call. That means every GPU mapping
>>> operation now requires two system calls. With Spectre workarounds, that
>>> can be a lot more expensive.
>> There is still very little overhead associated with that.
>>
>> And additional to this as long you actually need to wait it doesn't
>> matter if you wait directly in one system call or if you split it into
>> two. The time from the initial call to the end of the operation stays
>> the same.
> That's if you have waiting. For CPU page table updates, there is no
> waiting, so the extra system call is wasted.
>
> Another alternative would be to make the ioctl fail with -EINTR instead.
> That way the Thunk would be in charge of handling the restart, and we
> wouldn't have to pretend that no state was changed by the first call.
>
>> You can then also pipeline multiple mappings and wait for all of them
>> to finish in the end.
> Our Thunk mapping function already bundles all the mappings it can do at
> once into a single ioctl.

Please don't tell me you make multiple mappings in a single IOCTL, cause 
that would most likely be a show stopper as well.

See operations like this need to be transactional, e.g. either they 
complete or they return an error and don't do anything.

When you make multiple changes at the same time you either need to make 
sure that you fulfill all prerequisites before making the first change 
or you need to be able to rollback changes when a problem happens.

For an example how to do this see amdgpu_vm_bo_clear_mappings():
>         /* Allocate all the needed memory */
>         before = kzalloc(sizeof(*before), GFP_KERNEL);
>         if (!before)
>                 return -ENOMEM;
>         INIT_LIST_HEAD(&before->list);
>
>         after = kzalloc(sizeof(*after), GFP_KERNEL);
>         if (!after) {
>                 kfree(before);
>                 return -ENOMEM;
>         }
>         INIT_LIST_HEAD(&after->list);

Here we allocate memory for two nodes before actually making any 
changes, just to free it up at the end of the function when it isn't used:
>         /* Insert partial mapping before the range */
>         if (!list_empty(&before->list)) {
>                 amdgpu_vm_it_insert(before, &vm->va);
>                 if (before->flags & AMDGPU_PTE_PRT)
>                         amdgpu_vm_prt_get(adev);
>         } else {
>                 kfree(before);
>         }
>
>         /* Insert partial mapping after the range */
>         if (!list_empty(&after->list)) {
>                 amdgpu_vm_it_insert(after, &vm->va);
>                 if (after->flags & AMDGPU_PTE_PRT)
>                         amdgpu_vm_prt_get(adev);
>         } else {
>                 kfree(after);
>         }

Same is true for locking up handles etc..

> The Thunk API must guarantee that memory can
> be accessed when it returns. So we have no further room for improving
> the pipelining without changing the Thunk API semantics.

Well then change the Thunk API.

Regards,
Christian.

>
> Regards,
>    Felix
>
>> Regards,
>> Christian.



More information about the amd-gfx mailing list