[PATCH v3 1/2] drm/amdgpu: Add a max ibs per submission limit.
Christian König
ckoenig.leichtzumerken at gmail.com
Wed Apr 12 14:11:57 UTC 2023
Am 12.04.23 um 14:50 schrieb Bas Nieuwenhuizen:
> On Wed, Apr 12, 2023 at 2:46 PM Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
>> Am 12.04.23 um 14:18 schrieb Bas Nieuwenhuizen:
>>> And ensure each ring supports that many submissions. This makes
>>> sure that we don't get surprises after the submission has been
>>> scheduled where the ring allocation actually gets rejected.
>>>
>>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2498
>>> Signed-off-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 +++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 25 ++++++++++++++++++++++++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 +
>>> 3 files changed, 29 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 7af3041ccd0e..8362738974c8 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -110,6 +110,9 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
>>> if (r < 0)
>>> return r;
>>>
>>> + if (num_ibs[r] >= amdgpu_ring_max_ibs(chunk_ib->ip_type))
>>> + return -EINVAL;
>>> +
>>> ++(num_ibs[r]);
>>> p->gang_leader_idx = r;
>>> return 0;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index dc474b809604..abd70d2f26f6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -49,6 +49,25 @@
>>> * them until the pointers are equal again.
>>> */
>>>
>>> +/**
>>> + * amdgpu_ring_max_ibs - Return max IBs that fit in a single submission.
>>> + *
>>> + * @type: ring type for which to return the limit.
>>> + */
>>> +unsigned int amdgpu_ring_max_ibs(enum amdgpu_ring_type type)
>>> +{
>>> + switch (type) {
>>> + case AMDGPU_RING_TYPE_GFX:
>>> + case AMDGPU_RING_TYPE_COMPUTE:
>>> + /* gfx/compute are often used more extensively and radv
>>> + * has historically assumed the limit was 192.
>>> + */
>>> + return 192;
>>> + default:
>>> + return 50;
>>> + }
>>> +}
>>> +
>>> /**
>>> * amdgpu_ring_alloc - allocate space on the ring buffer
>>> *
>>> @@ -182,6 +201,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>> int sched_hw_submission = amdgpu_sched_hw_submission;
>>> u32 *num_sched;
>>> u32 hw_ip;
>>> + unsigned int max_ibs_dw;
>>>
>>> /* Set the hw submission limit higher for KIQ because
>>> * it's used for a number of gfx/compute tasks by both
>>> @@ -290,6 +310,11 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>> return r;
>>> }
>>>
>>> + max_ibs_dw = ring->funcs->emit_frame_size +
>>> + amdgpu_ring_max_ibs(ring->funcs->type) * ring->funcs->emit_ib_size;
>>> + max_ibs_dw = (max_ibs_dw + ring->funcs->align_mask) & ~ring->funcs->align_mask;
>>> + max_dw = max(max_dw, max_ibs_dw);
>> I think something like "if (WARN_ON(max_ibs_dw > max_dw) max_dw =
>> max_ibs_dw;" would be more appropriate.
> I really like it this way because it automatically does the right
> thing. In comparison it is difficult getting the test matrix together
> for a WARN_ON on something that differs every generation, and I don't
> know what we'd gain from doing it that way?
Especially the older hw generations are quite limited in the size of the
ring buffer.
For example we had to split page table updates into smaller submissions
(and later moved it into IBs) because the SDMA couldn't handle a large
ring buffer (the limit was just 1MiB or something like that IIRC).
I want to avoid that we unintentionally increase the submission limit to
more than some hw generation can do.
Regards,
Christian.
>
>
>> Apart from that this looks really good to me,
>> Christian.
>>
>>> +
>>> ring->ring_size = roundup_pow_of_two(max_dw * 4 * sched_hw_submission);
>>>
>>> ring->buf_mask = (ring->ring_size / 4) - 1;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index 3989e755a5b4..e6e672727529 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -317,6 +317,7 @@ struct amdgpu_ring {
>>> #define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o))
>>> #define amdgpu_ring_preempt_ib(r) (r)->funcs->preempt_ib(r)
>>>
>>> +unsigned int amdgpu_ring_max_ibs(enum amdgpu_ring_type type);
>>> int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
>>> void amdgpu_ring_ib_begin(struct amdgpu_ring *ring);
>>> void amdgpu_ring_ib_end(struct amdgpu_ring *ring);
More information about the amd-gfx
mailing list