[Intel-gfx] [PATCH] drm/i915/guc: Dynamically alloc GuC descriptor

Chris Wilson chris at chris-wilson.co.uk
Tue Feb 7 20:49:56 UTC 2017


On Tue, Feb 07, 2017 at 01:37:52AM -0800, Oscar Mateo wrote:
> 
> 
> On 02/07/2017 09:25 AM, Chris Wilson wrote:
> >On Tue, Feb 07, 2017 at 12:55:21AM -0800, Oscar Mateo wrote:
> >>
> >>On 02/02/2017 11:33 PM, Chris Wilson wrote:
> >>>On Thu, Feb 02, 2017 at 07:27:45AM -0800, Oscar Mateo wrote:
> >>>>From: Michal Wajdeczko <michal.wajdeczko at intel.com>
> >>>>
> >>>>The GuC descriptor is big in size. If we use local definition of
> >>>>guc_desc we have a chance to overflow stack. Use allocated one.
> >>>>
> >>>>v2: Rebased
> >>>>v3: Split
> >>>>v4: Handle ENOMEM, cover all uses of guc_context_desc, use kzalloc (Oscar)
> >>>>
> >>>>Signed-off-by: Deepak S <deepak.s at intel.com>
> >>>>Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> >>>>Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
> >>>>---
> >>>>  drivers/gpu/drm/i915/i915_guc_submission.c | 94 ++++++++++++++++++------------
> >>>>  1 file changed, 57 insertions(+), 37 deletions(-)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>>index 8ced9e2..b4f14f3 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>>@@ -102,9 +102,13 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
> >>>>  	struct sg_table *sg = guc->ctx_pool_vma->pages;
> >>>>  	void *doorbell_bitmap = guc->doorbell_bitmap;
> >>>>  	struct guc_doorbell_info *doorbell;
> >>>>-	struct guc_context_desc desc;
> >>>>+	struct guc_context_desc *desc;
> >>>>  	size_t len;
> >>>>+	desc = kzalloc(sizeof(*desc), GFP_KERNEL);
> >>>>+	if (!desc)
> >>>>+		return -ENOMEM;
> >>>>+
> >>>>  	doorbell = client->vaddr + client->doorbell_offset;
> >>>>  	if (client->doorbell_id != GUC_INVALID_DOORBELL_ID &&
> >>>>@@ -116,15 +120,22 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
> >>>>  	}
> >>>>  	/* Update the GuC's idea of the doorbell ID */
> >>>>-	len = sg_pcopy_to_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
> >>>>-			     sizeof(desc) * client->ctx_index);
> >>>>-	if (len != sizeof(desc))
> >>>>+	len = sg_pcopy_to_buffer(sg->sgl, sg->nents, desc, sizeof(*desc),
> >>>>+				 sizeof(*desc) * client->ctx_index);
> >>>This is silly. You are creating a pointer using kmalloc to copy into a
> >>>pointer created using alloc_page. Just write directly into the backing
> >>>store.
> >>I guess I deserve this for not digging any deeper. From what I can
> >>see, the backing store is an array of 1024 context descriptors. If
> >>the whole context descriptor fell in one page, I could kmap_atomic
> >>only that. As it is, I would need to vmap a couple of pages to make
> >>sure I always get a complete pointer to guc_context_desc. Would you
> >>be happy with that?
> >One of the suggested usecases for i915_gem_object_pin_map() was this code.
> >-Chris
> 
> I considered it, but with the current interface that would mean
> vmapping the whole thing (something like 70 pages). Isn't that a bit
> overkill?

The whole object is pinned into memory and occupies aperture space, and
all will be used at some point. Keeping a small vmap is not a huge cost
for a reasonably frequently used object.
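
To make the trade-off concrete, here is a minimal userspace sketch of the
arithmetic involved. It is not driver code. The 4096-byte page and the
280-byte descriptor are assumptions (the latter is only back-computed from
"something like 70 pages" for 1024 descriptors), and every sketch_* name
is hypothetical. It shows that a per-page mapping has to cope with
descriptors that cross a page boundary, while a single contiguous mapping
reduces access to plain pointer arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values, for illustration only; not taken from the driver. */
#define SKETCH_PAGE_SIZE  4096u
#define SKETCH_DESC_SIZE  280u   /* hypothetical: 70 pages * 4096 / 1024 */
#define SKETCH_DESC_COUNT 1024u

struct sketch_desc_location {
	size_t first_page; /* index of the page holding the first byte */
	size_t offset;     /* byte offset of the descriptor in that page */
	bool straddles;    /* true if it continues into the next page */
};

/* Number of pages needed to back the whole descriptor pool. */
static size_t sketch_pool_pages(void)
{
	size_t bytes = (size_t)SKETCH_DESC_COUNT * SKETCH_DESC_SIZE;

	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Where does descriptor 'index' live if we only map page by page? */
static struct sketch_desc_location sketch_locate_desc(size_t index)
{
	struct sketch_desc_location loc;
	size_t start, end;

	assert(index < SKETCH_DESC_COUNT);

	start = index * SKETCH_DESC_SIZE;
	end = start + SKETCH_DESC_SIZE - 1; /* last byte of the descriptor */

	loc.first_page = start / SKETCH_PAGE_SIZE;
	loc.offset = start % SKETCH_PAGE_SIZE;
	loc.straddles = (end / SKETCH_PAGE_SIZE) != loc.first_page;
	return loc;
}

/* With the whole pool mapped contiguously, no straddle handling is needed. */
static void *sketch_desc_in_mapping(void *base, size_t index)
{
	assert(index < SKETCH_DESC_COUNT);
	return (char *)base + index * SKETCH_DESC_SIZE;
}
```

Under these assumed sizes, descriptor 14 starts at byte 3920 of page 0 and
spills into page 1, so a single-page kmap would not yield a usable pointer
to it, whereas sketch_desc_in_mapping() works for every index.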

> I know you are going to say it wastes memory, but (KISS) what if
> I make guc_context_desc part of i915_guc_client, to be used for
> sg_pcopy operations?
> Although I am getting the vibe that you have discussed the sg_pcopy
> thing in the past, and this is not only about avoiding potential
> stack overflows. Am I right?

More that I have an abhorrence for scatterlist (since it appears so high
on profiles). At the very least use i915_gem_object_get_page(), as that
will use the radixtree for a fast lookup.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre