[Intel-gfx] [PATCH] drm/i915/gt: Consider multi-gt at all places

Upadhyay, Tejas tejas.upadhyay at intel.com
Wed Apr 5 06:56:11 UTC 2023


Sorry for the late response. Inline responses below.

> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> Sent: Friday, March 17, 2023 2:46 PM
> To: Upadhyay, Tejas <tejas.upadhyay at intel.com>; Intel-GFX at Lists.FreeDesktop.Org
> Subject: Re: [Intel-gfx] [PATCH] drm/i915/gt: Consider multi-gt at all places
> 
> 
> On 17/03/2023 05:52, Tejas Upadhyay wrote:
> > In order to make igt_live_test work in proper way, we need to consider
> > multi-gt in all tests where igt_live_test is used as well as at other
> > random places where multi-gt should be considered.
> 
> Description is a bit vague - is this for Meteorlake in general? What is the
> "proper way" ie what is broken?
> 
> > Cc: Andi Shyti <andi.shyti at linux.intel.com>
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
> > ---
> >   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 13 ++--
> >   .../drm/i915/gem/selftests/i915_gem_context.c | 28 ++++----
> >   drivers/gpu/drm/i915/gt/intel_engine_user.c   |  2 +-
> >   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 68 +++++++++----------
> >   drivers/gpu/drm/i915/selftests/i915_request.c | 36 +++++-----
> >   .../gpu/drm/i915/selftests/igt_live_test.c    | 10 +--
> >   .../gpu/drm/i915/selftests/igt_live_test.h    |  4 +-
> >   7 files changed, 81 insertions(+), 80 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index 9dce2957b4e5..289b75ac39e1 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -2449,9 +2449,9 @@ static int eb_submit(struct i915_execbuffer *eb)
> >   	return err;
> >   }
> >
> > -static int num_vcs_engines(struct drm_i915_private *i915)
> > +static int num_vcs_engines(struct intel_gt *gt)
> >   {
> > -	return hweight_long(VDBOX_MASK(to_gt(i915)));
> > +	return hweight_long(VDBOX_MASK(gt));
> >   }
> >
> >   /*
> > @@ -2459,7 +2459,7 @@ static int num_vcs_engines(struct drm_i915_private *i915)
> >    * The engine index is returned.
> >    */
> >   static unsigned int
> > -gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
> > +gen8_dispatch_bsd_engine(struct intel_gt *gt,
> >   			 struct drm_file *file)
> >   {
> >   	struct drm_i915_file_private *file_priv = file->driver_priv;
> > @@ -2467,7 +2467,7 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
> >   	/* Check whether the file_priv has already selected one ring. */
> >   	if ((int)file_priv->bsd_engine < 0)
> >   		file_priv->bsd_engine =
> > -			get_random_u32_below(num_vcs_engines(dev_priv));
> > +			get_random_u32_below(num_vcs_engines(gt));
> >
> >   	return file_priv->bsd_engine;
> >   }
> > @@ -2644,6 +2644,7 @@ static unsigned int
> >   eb_select_legacy_ring(struct i915_execbuffer *eb)
> >   {
> >   	struct drm_i915_private *i915 = eb->i915;
> > +	struct intel_gt *gt = eb->gt;
> >   	struct drm_i915_gem_execbuffer2 *args = eb->args;
> >   	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
> >
> > @@ -2655,11 +2656,11 @@ eb_select_legacy_ring(struct i915_execbuffer *eb)
> >   		return -1;
> >   	}
> >
> > -	if (user_ring_id == I915_EXEC_BSD && num_vcs_engines(i915) > 1) {
> > +	if (user_ring_id == I915_EXEC_BSD && num_vcs_engines(gt) > 1) {
> >   		unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
> >
> >   		if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
> > -			bsd_idx = gen8_dispatch_bsd_engine(i915, eb->file);
> > +			bsd_idx = gen8_dispatch_bsd_engine(gt, eb->file);
> >   		} else if (bsd_idx >= I915_EXEC_BSD_RING1 &&
> >   			   bsd_idx <= I915_EXEC_BSD_RING2) {
> >   			bsd_idx >>= I915_EXEC_BSD_SHIFT;
> 
> The hunks above I don't think are correct. Execbuf is in principle based on
> uabi engines, and that is not a per GT concept.
> 
> There is also no functional change above so I can only guess it is a prep work
> for something?

I think you removed this with the patch below, so no further discussion is required:
commit 927fb9c5ef6ae385c65ae04b181cc2ee94663e28
Author: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Date:   Thu Mar 16 14:27:28 2023 +0000

    drm/i915: Simplify vcs/bsd engine selection

> 
> [snip]
> 
> > -int igt_live_test_end(struct igt_live_test *t)
> > +int igt_live_test_end(struct igt_live_test *t, struct intel_gt *gt)
> >   {
> > -	struct drm_i915_private *i915 = t->i915;
> > +	struct drm_i915_private *i915 = gt->i915;
> >   	struct intel_engine_cs *engine;
> >   	enum intel_engine_id id;
> >
> > @@ -57,7 +57,7 @@ int igt_live_test_end(struct igt_live_test *t)
> >   		return -EIO;
> >   	}
> >
> > -	for_each_engine(engine, to_gt(i915), id) {
> > +	for_each_engine(engine, gt, id) {
> >   		if (t->reset_engine[id] ==
> >   		    i915_reset_engine_count(&i915->gpu_error, engine))
> >   			continue;
> > diff --git a/drivers/gpu/drm/i915/selftests/igt_live_test.h
> > b/drivers/gpu/drm/i915/selftests/igt_live_test.h
> > index 36ed42736c52..209b0548c603 100644
> > --- a/drivers/gpu/drm/i915/selftests/igt_live_test.h
> > +++ b/drivers/gpu/drm/i915/selftests/igt_live_test.h
> > @@ -27,9 +27,9 @@ struct igt_live_test {
> >    * e.g. if the GPU was reset.
> >    */
> >   int igt_live_test_begin(struct igt_live_test *t,
> > -			struct drm_i915_private *i915,
> > +			struct intel_gt *gt,
> >   			const char *func,
> >   			const char *name);
> > -int igt_live_test_end(struct igt_live_test *t);
> > +int igt_live_test_end(struct igt_live_test *t, struct intel_gt *gt);
> 
> Back in the day the plan was that live selftests are device focused and then
> we also have intel_gt_live_subtests, which are obviously GT focused. So in
> that sense adding a single GT parameter to igt_live_test_begin isn't
> something I immediately understand.
> 
> Could you explain in one or two practical examples what is not working
> properly and how is this patch fixing it?

For example, suppose you are running the test "live_all_engines(void *arg)".

-- The test begin below records the reset counters for the primary GT (Tile0) only --
err = igt_live_test_begin(&t, to_gt(i915), __func__, "");
        if (err)
                goto out_free;

--- Now we loop over all engines. Note that on MTL the vcs and vecs engines are not on the primary GT (tile 0),
     so the counters recorded at test begin do not cover them. ---

      In test_begin, the loop below will not record the reset count for the vcs and vecs engines on MTL:
	for_each_engine(engine, gt, id)
                t->reset_engine[id] =
                        i915_reset_engine_count(&i915->gpu_error, engine);

--- Then the call below ends the test, again only for the primary GT, where the above-mentioned engines are not present ---
err = igt_live_test_end(&t, to_gt(i915));

In short, it looks to me like igt_live_test for the device needs attention when different engines sit on different GTs, as on MTL.
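
To make the gap concrete, here is a minimal standalone C sketch of the scenario. It is a toy model, not i915 code: every toy_* name, the two-GT layout and the engine ids are assumptions made up for illustration. It only shows that a begin/end pair that walks a single GT cannot notice a reset on an engine that lives on another GT, while a pair that walks every GT can.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ENGINES 4
#define TOY_MAX_GTS 2

/* Hypothetical stand-ins for the driver structures; not the i915 types. */
struct toy_engine {
	int id;                   /* device-wide engine id */
	unsigned int reset_count; /* bumped on every engine reset */
};

struct toy_gt {
	struct toy_engine *engines[TOY_MAX_ENGINES];
	size_t num_engines;
};

struct toy_device {
	struct toy_gt gts[TOY_MAX_GTS];
	size_t num_gts;
};

struct toy_live_test {
	unsigned int reset_engine[TOY_MAX_ENGINES];
};

/* Record the reset counts of the engines on one GT only. */
static void toy_snapshot_gt(struct toy_live_test *t, const struct toy_gt *gt)
{
	for (size_t i = 0; i < gt->num_engines; i++) {
		const struct toy_engine *e = gt->engines[i];

		assert(e->id >= 0 && e->id < TOY_MAX_ENGINES);
		t->reset_engine[e->id] = e->reset_count;
	}
}

/* Record the reset counts of the engines on every GT of the device. */
static void toy_snapshot_all(struct toy_live_test *t,
			     const struct toy_device *dev)
{
	for (size_t g = 0; g < dev->num_gts; g++)
		toy_snapshot_gt(t, &dev->gts[g]);
}

/* True if any engine on this one GT was reset since the snapshot. */
static bool toy_gt_was_reset(const struct toy_live_test *t,
			     const struct toy_gt *gt)
{
	for (size_t i = 0; i < gt->num_engines; i++) {
		const struct toy_engine *e = gt->engines[i];

		if (t->reset_engine[e->id] != e->reset_count)
			return true;
	}
	return false;
}

/* True if any engine on any GT was reset since the snapshot. */
static bool toy_any_gt_was_reset(const struct toy_live_test *t,
				 const struct toy_device *dev)
{
	for (size_t g = 0; g < dev->num_gts; g++)
		if (toy_gt_was_reset(t, &dev->gts[g]))
			return true;
	return false;
}
```

With a render engine on GT0 and a video engine on GT1, bumping the video engine's reset_count after the snapshot leaves toy_gt_was_reset() on GT0 returning false, whereas toy_any_gt_was_reset() returns true. Whether the real fix should take a GT parameter or iterate over all GTs internally is a separate design question that this sketch does not settle.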

Regards,
Tejas
> 
> Regards,
> 
> Tvrtko

