[Intel-gfx] [PATCH] drm/i915: Change semantics of context isolation reporting to UM
Adrian Larumbe
adrian.larumbe at collabora.com
Wed May 4 11:50:05 UTC 2022
Hi, Tvrtko
On 03.05.2022 15:44, Tvrtko Ursulin wrote:
>
> Hi,
>
> On 29/04/2022 16:11, Adrian Larumbe wrote:
> > I915_PARAM_HAS_CONTEXT_ISOLATION was already being used as a boolean by
> > both Iris and Vulkan , and stood for the guarantee that, when creating a
> > new context, all state set by it will not leak to any other context.
> >
> > However the actual return value was a bitmask where every bit stood for an
> > initialised engine, and IGT test gem_ctx_isolation makes use of this mask
> > for deciding on the actual context engine isolation status.
> >
> > However, we do not provide UAPI for IGT tests, so the value returned by the
> > PARAM ioctl has to reflect Mesa usage as a boolean.
>
> a)
> I suggest splitting into two patches. One changes the semantics to boolean, second one changes it for RCS+CCS.
>
> b)
> What about test coverage - both for platforms with RCS+CSS (media and blitter stop being tested - all coverage is gone basically) and also for pre Gen8 platforms, are there failures expected there? (Test will try to run on some engines which do not support isolation now, no?)
Do you mean IGT tests? I think gem_ctx_isolation.c has to be rewritten so that
the engine isolation status of the different devices is somehow hard-coded
into the test, perhaps something like the intel_device_info struct in the
kernel.
> This change only made sense after compute engine support was added to the
> driver in commit 944823c9463916dd53f3 ("drm/i915/xehp: Define compute class
> and engine") because no context isolation can be assumed on any device with
> both RCS annd CCS engines.
>Isn't it an arbitrary decision when thinking about other engine classes (media, blitter) on those platforms? Commit message should be clear in that respect.
I think the change was required because the addition of a CCS engine broke
pre-existing assumptions about engine context isolation, but only when
coexisting with an RCS engine in the same device. I wouldn't know about other
engines, but I'll ask around.
> Also, looking at iris:
>
> if (iris_getparam_integer(fd, I915_PARAM_HAS_CONTEXT_ISOLATION) <= 0) {
> debug_error("Kernel is too old for Iris. Consider upgrading to kernel v4.16.\n");
> return NULL;
> }
>
> Won't this make Iris fail on RCS+CCS platforms - or I need to look at a newer branch/pull request? What is the plan there?
Yes, I think I misread this code and didn't realise, when there isn't context
isolation, this snippet will fail. However, given the semantics of it, which I
glean from the error message between the brackets, I'd say context isolation not
being present shouldn't cause it to fail. So I guess it could be rewritten as
if (iris_getparam_integer(fd, I915_PARAM_HAS_CONTEXT_ISOLATION) < 0) {
> Signed-off-by: Adrian Larumbe <adrian.larumbe at collabora.com>
> ---
> drivers/gpu/drm/i915/gt/intel_engine_user.c | 13 ++++++++++++-
> drivers/gpu/drm/i915/gt/intel_engine_user.h | 1 +
> drivers/gpu/drm/i915/i915_drm_client.h | 2 +-
> drivers/gpu/drm/i915/i915_getparam.c | 2 +-
> include/uapi/drm/i915_drm.h | 14 +++-----------
> 5 files changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> index 0f6cd96b459f..2d6bd36d6150 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> @@ -47,7 +47,7 @@ static const u8 uabi_classes[] = {
> [COPY_ENGINE_CLASS] = I915_ENGINE_CLASS_COPY,
> [VIDEO_DECODE_CLASS] = I915_ENGINE_CLASS_VIDEO,
> [VIDEO_ENHANCEMENT_CLASS] = I915_ENGINE_CLASS_VIDEO_ENHANCE,
> - /* TODO: Add COMPUTE_CLASS mapping once ABI is available */
> + [COMPUTE_CLASS] = I915_ENGINE_CLASS_COMPUTE,
> > I think this landed so you will need to rebase.
>
> > };
> > static int engine_cmp(void *priv, const struct list_head *A,
> > @@ -306,3 +306,14 @@ unsigned int intel_engines_has_context_isolation(struct drm_i915_private *i915)
> > return which;
> > }
> > +
> > +bool intel_cross_engine_isolated(struct drm_i915_private *i915)
>
> Naming feels lackluster (Intel what?). Do you expect other callers or could just implement the check in i915_getparam.c, inside the switch statement?
No other callers, so I guess I should do like you suggested below, handle the logic straightaway from i915_getparam_ioctl.
> +{
> + unsigned int which = intel_engines_has_context_isolation(i915);
> +
> + if ((which & BIT(I915_ENGINE_CLASS_RENDER)) &&
> + (which & BIT(I915_ENGINE_CLASS_COMPUTE)))
> + return false;
> +
> + return !!which;
> You could first just check if there are any RCS and CCS engines (for instance i915->engine_uabi_class_count[], or RCS/CCS_MASK()).
>
> /* Comment here to explain the decision. */
> if (RCS_MASK(&i915->gt) | CCS_MASK(&i915->gt))
> value = 0;
> else
> value = !!intel_engines_has_context_isolation(i915);
>
> ?
>
> There also may be little point in even calling intel_engines_has_context_isolation, when the desired output is a boolean and could just make it feature or Gen based. Decision for later though, after some clarifications.
> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.h b/drivers/gpu/drm/i915/gt/intel_engine_user.h
> index 3dc7e8ab9fbc..ff21349db4d4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_user.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.h
> @@ -15,6 +15,7 @@ struct intel_engine_cs *
> intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
> unsigned int intel_engines_has_context_isolation(struct drm_i915_private *i915);
> +bool intel_cross_engine_isolated(struct drm_i915_private *i915);
> void intel_engine_add_user(struct intel_engine_cs *engine);
> void intel_engines_driver_register(struct drm_i915_private *i915);
> diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
> index 5f5b02b01ba0..f796c5e8e060 100644
> --- a/drivers/gpu/drm/i915/i915_drm_client.h
> +++ b/drivers/gpu/drm/i915/i915_drm_client.h
> @@ -13,7 +13,7 @@
> #include "gt/intel_engine_types.h"
> -#define I915_LAST_UABI_ENGINE_CLASS I915_ENGINE_CLASS_VIDEO_ENHANCE
> +#define I915_LAST_UABI_ENGINE_CLASS I915_ENGINE_CLASS_COMPUTE
> struct drm_i915_private;
> diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
> index c12a0adefda5..3d5120d2d78a 100644
> --- a/drivers/gpu/drm/i915/i915_getparam.c
> +++ b/drivers/gpu/drm/i915/i915_getparam.c
> @@ -145,7 +145,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
> value = 1;
> break;
> case I915_PARAM_HAS_CONTEXT_ISOLATION:
> - value = intel_engines_has_context_isolation(i915);
> + value = intel_cross_engine_isolated(i915);
> break;
> case I915_PARAM_SLICE_MASK:
> value = sseu->slice_mask;
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 35ca528803fd..84c0af77cc1f 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -166,6 +166,7 @@ enum drm_i915_gem_engine_class {
> I915_ENGINE_CLASS_COPY = 1,
> I915_ENGINE_CLASS_VIDEO = 2,
> I915_ENGINE_CLASS_VIDEO_ENHANCE = 3,
> + I915_ENGINE_CLASS_COMPUTE = 4,
> /* should be kept compact */
> @@ -635,17 +636,8 @@ typedef struct drm_i915_irq_wait {
> #define I915_PARAM_HAS_EXEC_FENCE_ARRAY 49
> /*
> - * Query whether every context (both per-file default and user created) is
> - * isolated (insofar as HW supports). If this parameter is not true, then
> - * freshly created contexts may inherit values from an existing context,
> - * rather than default HW values. If true, it also ensures (insofar as HW
> - * supports) that all state set by this context will not leak to any other
> - * context.
> - *
> - * As not every engine across every gen support contexts, the returned
> - * value reports the support of context isolation for individual engines by
> - * returning a bitmask of each engine class set to true if that class supports
> - * isolation.
> + * Query whether the device can make cross-engine isolation guarantees for
> + * all the engines whose default state has been initialised.
> "Engine whose default state has been initialised" does not sound very helpful for userspace developers. Like what determines if that has happened or not, and the fact userspace is more about context state rather than engine state.
>
> Existing text which talked about engines supporting contexts and not leaking state sounded better TBF. So overall I think you deleted too much of the text. Can't you just change the ending, from the point where it goes into individual engines and bitmap, replacing with the new explanation?
I made a mistake here. I assumed mentioning this would be relevant because
engine isolation is only checked for engines whose default state had been
assigned in __engines_record_defaults, but like you said this is a KM detail
userspace developers shouldn't concern themselves with, so I'll restore that
part of the previous text.
> Regards,
>
> Tvrtko
> */
> #define I915_PARAM_HAS_CONTEXT_ISOLATION 50
Kind Regards,
Adrian Larumbe
More information about the Intel-gfx
mailing list