[PATCH 18/22] drm/amdgpu: add flag for high priority contexts v4
Emil Velikov
emil.l.velikov at gmail.com
Fri Mar 3 14:48:07 UTC 2017
On 2 March 2017 at 03:52, Andres Rodriguez <andresx7 at gmail.com> wrote:
>
>
> On 2017-02-28 08:13 PM, Emil Velikov wrote:
>>
>> Hi Andres,
>>
>> There are a couple of nitpicks below, but feel free to address those as a
>> follow-up. Assuming they're correct, of course ;-)
>
>
> As much as I'd like to let future me deal with those issues, the UAPI
> behavior is something I'd like to nail down early and avoid changing.
>
> So any nitpicks here are more than welcome now (better than later :) )
>
>>
>> On 28 February 2017 at 22:14, Andres Rodriguez <andresx7 at gmail.com> wrote:
>>>
>>> Add a new context creation parameter to express a global context
>>> priority.
>>>
>>> Contexts allocated with AMDGPU_CTX_PRIORITY_HIGH will receive higher
>>> priority to schedule their work than AMDGPU_CTX_PRIORITY_NORMAL
>>> (default) contexts.
>>>
>>> v2: Instead of using flags, repurpose __pad
>>> v3: Swap enum values of _NORMAL _HIGH for backwards compatibility
>>> v4: Validate usermode priority and store it
>>>
>>> Signed-off-by: Andres Rodriguez <andresx7 at gmail.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 41 +++++++++++++++++++++++----
>>> drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 1 +
>>> include/uapi/drm/amdgpu_drm.h | 7 ++++-
>>> 4 files changed, 44 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index e30c47e..366f6d3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -664,20 +664,21 @@ struct amdgpu_ctx_ring {
>>> struct amd_sched_entity entity;
>>> };
>>>
>>> struct amdgpu_ctx {
>>> struct kref refcount;
>>> struct amdgpu_device *adev;
>>> unsigned reset_counter;
>>> spinlock_t ring_lock;
>>> struct dma_fence **fences;
>>> struct amdgpu_ctx_ring rings[AMDGPU_MAX_RINGS];
>>> + int priority;
>>> bool preamble_presented;
>>> };
>>>
>>> struct amdgpu_ctx_mgr {
>>> struct amdgpu_device *adev;
>>> struct mutex lock;
>>> /* protected by lock */
>>> struct idr ctx_handles;
>>> };
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> index 400c66b..22a15d6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> @@ -18,47 +18,75 @@
>>> * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>>> * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>>> * OTHER DEALINGS IN THE SOFTWARE.
>>> *
>>> * Authors: monk liu <monk.liu at amd.com>
>>> */
>>>
>>> #include <drm/drmP.h>
>>> #include "amdgpu.h"
>>>
>>> -static int amdgpu_ctx_init(struct amdgpu_device *adev, struct amdgpu_ctx *ctx)
>>> +static enum amd_sched_priority amdgpu_to_sched_priority(int amdgpu_priority)
>>> +{
>>> + switch (amdgpu_priority) {
>>> + case AMDGPU_CTX_PRIORITY_HIGH:
>>> + return AMD_SCHED_PRIORITY_HIGH;
>>> + case AMDGPU_CTX_PRIORITY_NORMAL:
>>> + return AMD_SCHED_PRIORITY_NORMAL;
>>> + default:
>>> +               WARN(1, "Invalid context priority %d\n", amdgpu_priority);
>>> + return AMD_SCHED_PRIORITY_NORMAL;
>>> + }
>>> +}
>>> +
>>> +static int amdgpu_ctx_init(struct amdgpu_device *adev,
>>> + uint32_t priority,
>>> + struct amdgpu_ctx *ctx)
>>> {
>>> unsigned i, j;
>>> int r;
>>> + enum amd_sched_priority sched_priority;
>>> +
>>> + sched_priority = amdgpu_to_sched_priority(priority);
>>> +
>>
>> This will trigger dmesg spam on normal user input. I'd keep the WARN
>> in amdgpu_to_sched_priority, but move the function call after the
>> validation of priority.
>> Thinking about it, the input validation really belongs in the ioctl -
>> amdgpu_ctx_ioctl().
>>
>
> Agreed.
>
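To spell out what I have in mind - a rough and completely untested sketch,
only reusing the names already present in the patch:

    /* in amdgpu_ctx_ioctl(), before the context is allocated */
    priority = args->in.priority;

    if (priority >= AMDGPU_CTX_PRIORITY_NUM)
            return -EINVAL;

    if (priority == AMDGPU_CTX_PRIORITY_HIGH && !capable(CAP_SYS_ADMIN))
            return -EACCES;

    /* amdgpu_to_sched_priority() is called only after the above, so its
     * WARN() can no longer be triggered by plain userspace input */

That way amdgpu_ctx_init() can keep assuming its callers pass a sane value.
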
>>> + if (priority >= AMDGPU_CTX_PRIORITY_NUM)
>>> + return -EINVAL;
>>> +
>>> +       if (sched_priority < 0 || sched_priority >= AMD_SCHED_MAX_PRIORITY)
>>> + return -EINVAL;
>>> +
>>> +       if (sched_priority == AMD_SCHED_PRIORITY_HIGH && !capable(CAP_SYS_ADMIN))
>>
>> This is not obvious in either the commit message or the UAPI. I'd
>> suggest adding a comment in the latter.
>
>
> Will do.
>
>> If memory serves - high prio will _not_ work with render nodes,
>> so you really want to cover and/or explain why.
>
>
> High priority will work fine with render nodes. I'm testing with radv, which
> does use render nodes.
>
> But I've had my fair share of two bugs canceling each other out. So if you
> do have some insight into why that would be the case, let me know.
>
Right - I got confused by the CAP_SYS_ADMIN cap. What I meant above is:
why do we need it, and does it impose a specific workflow or set of
permissions for userspace to use high prio?
This is the type of thing that, I think, must be documented in the UAPI header.
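E.g. a short comment in amdgpu_drm.h, something along these lines (with the
actual reason filled in, which is exactly what I'm asking about):

    /*
     * AMDGPU_CTX_PRIORITY_HIGH requires CAP_SYS_ADMIN, since <reason - e.g.
     * a high priority context can starve normal ones>. Unprivileged callers
     * get -EACCES and should fall back to AMDGPU_CTX_PRIORITY_NORMAL.
     */

would save people a trip through the driver sources.
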
>>
>>> + return -EACCES;
>>>
>>> memset(ctx, 0, sizeof(*ctx));
>>> ctx->adev = adev;
>>> + ctx->priority = priority;
>>> kref_init(&ctx->refcount);
>>> spin_lock_init(&ctx->ring_lock);
>>> ctx->fences = kcalloc(amdgpu_sched_jobs * AMDGPU_MAX_RINGS,
>>> sizeof(struct dma_fence*), GFP_KERNEL);
>>> if (!ctx->fences)
>>> return -ENOMEM;
>>>
>>> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>> ctx->rings[i].sequence = 1;
>>> ctx->rings[i].fences = &ctx->fences[amdgpu_sched_jobs * i];
>>> }
>>>
>>> ctx->reset_counter = atomic_read(&adev->gpu_reset_counter);
>>>
>>> /* create context entity for each ring */
>>> for (i = 0; i < adev->num_rings; i++) {
>>> struct amdgpu_ring *ring = adev->rings[i];
>>> struct amd_sched_rq *rq;
>>>
>>> - rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL];
>>> + rq = &ring->sched.sched_rq[sched_priority];
>>> r = amd_sched_entity_init(&ring->sched,
>>> &ctx->rings[i].entity,
>>> rq, amdgpu_sched_jobs);
>>> if (r)
>>> goto failed;
>>> }
>>>
>>> return 0;
>>>
>>> failed:
>>> for (j = 0; j < i; j++)
>>> @@ -83,39 +111,41 @@ static void amdgpu_ctx_fini(struct amdgpu_ctx *ctx)
>>> kfree(ctx->fences);
>>> ctx->fences = NULL;
>>>
>>> for (i = 0; i < adev->num_rings; i++)
>>> amd_sched_entity_fini(&adev->rings[i]->sched,
>>> &ctx->rings[i].entity);
>>> }
>>>
>>> static int amdgpu_ctx_alloc(struct amdgpu_device *adev,
>>> struct amdgpu_fpriv *fpriv,
>>> + uint32_t priority,
>>> uint32_t *id)
>>> {
>>> struct amdgpu_ctx_mgr *mgr = &fpriv->ctx_mgr;
>>> struct amdgpu_ctx *ctx;
>>> int r;
>>>
>>> ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
>>> if (!ctx)
>>> return -ENOMEM;
>>>
>>> mutex_lock(&mgr->lock);
>>> r = idr_alloc(&mgr->ctx_handles, ctx, 1, 0, GFP_KERNEL);
>>> if (r < 0) {
>>> mutex_unlock(&mgr->lock);
>>> kfree(ctx);
>>> return r;
>>> }
>>> +
>>> *id = (uint32_t)r;
>>> - r = amdgpu_ctx_init(adev, ctx);
>>> + r = amdgpu_ctx_init(adev, priority, ctx);
>>> if (r) {
>>> idr_remove(&mgr->ctx_handles, *id);
>>> *id = 0;
>>> kfree(ctx);
>>> }
>>> mutex_unlock(&mgr->lock);
>>> return r;
>>> }
>>>
>>> static void amdgpu_ctx_do_release(struct kref *ref)
>>> @@ -179,32 +209,33 @@ static int amdgpu_ctx_query(struct amdgpu_device *adev,
>>> ctx->reset_counter = reset_counter;
>>>
>>> mutex_unlock(&mgr->lock);
>>> return 0;
>>> }
>>>
>>> int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
>>> struct drm_file *filp)
>>> {
>>> int r;
>>> - uint32_t id;
>>> + uint32_t id, priority;
>>>
>>> union drm_amdgpu_ctx *args = data;
>>> struct amdgpu_device *adev = dev->dev_private;
>>> struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>
>>> r = 0;
>>> id = args->in.ctx_id;
>>> + priority = args->in.priority;
>>>
>> Hmm, we don't seem to be doing any in.flags validation - not cool.
>> Someone seriously needs to add that and check the remaining ioctls.
>> At the same time - I think you want to add a flag bit "HAS_PRIORITY"
>> [or similar] and honour in.priority only when that is set.
>>
>> Even if the USM drivers are safe, this will break for some poor soul who
>> is learning how to program their GPU. "My program was running before -
>> I updated the kernel and it no longer does :-("
>
>
> Improving validation of already existing parameters is probably something
> better handled in a separate series (so in this case I will take you up on
> the earlier offer).
>
Agreed.
> As far as a HAS_PRIORITY flag goes, I'm not sure it would really be
> necessary. libdrm_amdgpu has zeroed its data structures since its first
> commit (0936139), so the change should be backwards compatible with all
> libdrm_amdgpu versions.
>
> If someone is bypassing libdrm_amdgpu and hitting the IOCTLs directly, then
> I hope they know what they are doing. The only apps I've really seen do this
> are fuzzers, and they probably wouldn't care.
>
> I'm assuming we do have the flexibility of relying on the usermode library
> as the backwards-compatibility point.
>
Since there is no validation in the first place, anyone can feed
garbage into the ioctl without knowing that they're doing it wrong.
Afaict the rule tends to be: kernel updates should never break
userspace.
And by the time you realise there is a breakage here, it'll be too late.
And even if we ignore all that for a moment, flags is there exactly for
things like HAS_PRIORITY, and adding/managing it is trivial.
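For the record, roughly the ~5 lines I have in mind - untested, and the flag
name is only a placeholder:

    /* amdgpu_drm.h */
    #define AMDGPU_CTX_ALLOC_HAS_PRIORITY  (1 << 0)

    /* amdgpu_ctx_ioctl(), alongside the priority validation */
    if (args->in.flags & ~AMDGPU_CTX_ALLOC_HAS_PRIORITY)
            return -EINVAL;

    priority = (args->in.flags & AMDGPU_CTX_ALLOC_HAS_PRIORITY) ?
               args->in.priority : AMDGPU_CTX_PRIORITY_NORMAL;

Old userspace (and anything that currently passes garbage in the unchecked
fields) keeps today's behaviour, while new userspace opts in explicitly.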
Then again, if you feel so strongly against ~5 lines of code, so be it :-(
Thanks
Emil