[igt-dev] [RFC PATCH v10 2/6] lib/i915: add gem_engine_topology library
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Mar 7 14:16:13 UTC 2019
On 07/03/2019 13:42, Andi Shyti wrote:
> Hi Tvrtko,
>
>>> +#include "drmtest.h"
>>> +#include "igt_gt.h"
>>> +#include "ioctl_wrappers.h"
>>> +
>>> +#include "i915/gem_engine_topology.h"
>>> +
>>> +#define SIZEOF_CTX_PARAM offsetof(struct i915_context_param_engines, \
>>> + class_instance[I915_EXEC_RING_MASK + 1])
>>> +#define SIZEOF_QUERY offsetof(struct drm_i915_query_engine_info, \
>>> + engines[I915_EXEC_RING_MASK + 1])
>>
>> There is no relationship between I915_EXEC_RING_MASK and the number of
>> engines we can query. I'll suggest how I think it should work a bit later.
>
> I915_EXEC_RING_MASK is the upper limit checked in the driver,
> that's why I used that one. I could have hardcoded it to 64, but
> I wanted to keep it consistent to the driver.
I'll argue with Chris to remove this limit from the engine map (come eb3 
it might not be relevant any more). Nevertheless, the query does not have 
this limit, so I think it would be best to decouple the code from the start.
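To illustrate the decoupling, something like this could work (a sketch 
only, not part of the patch; it assumes the query uAPI convention that an 
item submitted with length == 0 gets the required length filled in by the 
kernel):

static int query_engines_size(int fd)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_ENGINE_INFO,
		/* length == 0 asks the kernel for the required size */
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = to_user_pointer(&item),
	};

	gem_query(fd, &query);

	/* item.length now holds the size the kernel needs */
	igt_assert(item.length > 0);

	return item.length;
}

That way the query path does not need to reference I915_EXEC_RING_MASK at 
all.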
>>> +
>>> +static struct intel_execution_engine2 *intel_active_engines2;
>>> +
>>> +int __gem_query(int fd, struct drm_i915_query *q)
>>> +{
>>> + int err = 0;
>>> +
>>> + if (igt_ioctl(fd, DRM_IOCTL_I915_QUERY, q))
>>> + err = -errno;
>>> +
>>> + errno = 0;
>>> + return err;
>>> +}
>>> +
>>> +void gem_query(int fd, struct drm_i915_query *q)
>>> +{
>>> + igt_assert_eq(__gem_query(fd, q), 0);
>>> +}
>>
>> I would keep these two static if there are no external callers for now.
>
> Yes, I can make them static, I thought someone might need them in
> the future.
Could be, but it will be easy to export them at that point. I wanted to 
keep the patch as small as possible for now so I can focus on reviewing 
the important bits.
>
>>> +
>>> +static void query_engines(int fd,
>>> + struct drm_i915_query_engine_info *query_engines)
>>> +{
>>> + struct drm_i915_query_item item = { };
>>> + struct drm_i915_query query = { };
>>> +
>>> + item.query_id = DRM_I915_QUERY_ENGINE_INFO;
>>> + query.items_ptr = to_user_pointer(&item);
>>> + query.num_items = 1;
>>> + item.length = SIZEOF_QUERY;
>>
>> I think making this function take buffer size and return the queried size
>> would be simplest.
>
> I haven't really understood, do you mean doing something like:
>
> static int query_engines(...., int size)
> {
> [...]
> return size;
> }
>
> or calling the gem_query twice, first time for getting the size
> and second time for allocating the necessary resources.
>
> Because in the first case, we have the size in query_engines, is
> that right?
>
> While if you mean the second case, this is how I did it at the
> very beginning, but then with Chris we agreed that 64
> (I915_EXEC_RING_MASK) is "big enough". If you are strong with it,
> I can revert it back.
Hm.. what happened to the old Chris!? :)) I am not objecting that 
strongly. Especially since you already changed it once, just leave it. 
It can be changed later if someone gets the desire to tidy up.
>>> +
>>> + item.data_ptr = to_user_pointer(query_engines);
>>> +
>>> + gem_query(fd, &query);
>>> +}
>>> +
>>> +bool gem_has_engine_topology(void)
>>> +{
>>> + return intel_active_engines2 != intel_execution_engines2;
>>
>> A problem for the exported API if init_engine_list hasn't been called yet?
>> Maybe:
>>
>> if (!intel_active_engines2)
>> init_engine_list(fd);
>>
>> return intel_active_engines2 != intel_execution_engines2;
>>
>> ?
>
> Yes, it's indeed prone to being misused. I would need to find a more
> generic way, because the two lines
>
> if (!intel_active_engines2)
> init_engine_list(fd);
>
> are called so many times that I'm having an overdose of them :)
Yes, the call chains and responsibilities do feel a bit non-obvious.
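One way to cut the repetition would be a tiny internal helper that every 
exported entry point calls first. A rough sketch (the helper name is made 
up, and gem_has_engine_topology() would have to grow an fd parameter):

static void ensure_engine_list(int fd)
{
	if (!intel_active_engines2)
		init_engine_list(fd);
}

bool gem_has_engine_topology(int fd)
{
	ensure_engine_list(fd);

	return intel_active_engines2 != intel_execution_engines2;
}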
>
>>> +}
>>> +
>>> +static void set_ctx_param_engines(int fd, uint32_t ctx_id)
>>> +{
>>> + struct i915_context_param_engines *ctx_engine;
>>> + struct drm_i915_gem_context_param ctx_param;
>>> + const struct intel_execution_engine2 *e2;
>>> + uint8_t buff[SIZEOF_CTX_PARAM] = { };
>>> + int i;
>>> +
>>> + if (!gem_has_engine_topology())
>>> + return;
>>
>> AFAICS this can be assert here.
>
> I think this check is redundant as it's never going to happen.
> This function is 'static' and the calling function checks that
> already. I will remove it.
Better to have it as an assert, but as you wish. I guess even without an 
assert it would segfault when it tries to iterate intel_active_engines2.
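For completeness, the assert variant would be as simple as (sketch):

	igt_assert(gem_has_engine_topology());

so a misuse fails loudly instead of silently skipping the setup.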
>
>>> +
>>> + ctx_engine = (struct i915_context_param_engines *) buff;
>>> +
>>> + ctx_param.ctx_id = ctx_id;
>>> + ctx_param.param = I915_CONTEXT_PARAM_ENGINES;
>>> +
>>> + ctx_engine->extensions = 0;
>>> + for (i = 0, e2 = intel_active_engines2; e2->name; i++, e2++) {
>>
>> Here I think is the place where we check that the total number of engines
>> does not exceed the engine map execbuf index.
>>
>> igt_require(i <= I915_EXEC_RING_MASK);
>>
>> ?
>
> I can add that, but it would be redundant as the list would never have
> been created otherwise... anyway, better safe than sorry :)
Yes, I know. I wrote that because I was suggesting removing the 64-engine 
limit from the query and keeping it only in the engine map.
But as said, if you already discussed that point then leave it.
> BTW, given my natural confusion between 'require' and 'assert',
> you meant 'assert', right?
I meant require. But I don't know either, it's so hard to choose
sometimes. :)
>>> + ctx_engine->class_instance[i].engine_class = e2->class;
>>> + ctx_engine->class_instance[i].engine_instance = e2->instance;
>>> + }
>>> +
>>> + ctx_param.size = offsetof(typeof(*ctx_engine), class_instance[i + 1]);
>>> + ctx_param.value = to_user_pointer(ctx_engine);
>>> +
>>> + gem_context_set_param(fd, &ctx_param);
>>> +}
>>> +
>>> +/*
>>> + * Initializes the list of engines.
>>> + *
>>> + * Returns:
>>> + *
>>> + * - 0 in case of success and the get/set_param ioctls are implemented
>>> + * - -ENODEV in case of success but I915_CONTEXT_PARAM_ENGINES is not
>>> + * implemented in DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM ioctl.
>>> + */
>>> +static int init_engine_list(int fd)
>>> +{
>>> + const char *class_names[] = { "rcs", "bcs", "vcs", "vecs" };
>>> + struct drm_i915_query_engine_info *query_engine;
>>> + unsigned char query_buffer[SIZEOF_QUERY] = { };
>>> + struct drm_i915_gem_context_param ctx_param = {
>>> + .param = I915_CONTEXT_PARAM_ENGINES,
>>> + };
>>> + int i;
>>> +
>>> + /* the list is already initialized */
>>> + if (intel_active_engines2)
>>> + return gem_has_engine_topology() ? 0 : -ENODEV;
>>> +
>>> + /*
>>> + * We check first whether the I915_CONTEXT_PARAM_ENGINES parameter is
>>> + * supported by the running kernel. If not, __gem_context_get_param()
>>> + * will return -EINVAL which, at this point, is not necessarily a
>>> + * failure but it means that we need to fall back to pulling the engines
>>> + * directly from intel_execution_engines2[].
>>> + *
>>> + * We will return -ENODEV, meaning the interface is missing
>>> + */
>>> + if (__gem_context_get_param(fd, &ctx_param)) {
>>> + intel_active_engines2 = intel_execution_engines2;
>>> + return -ENODEV;
>>> + }
>>> +
>>> + query_engine = (struct drm_i915_query_engine_info *) query_buffer;
>>> + query_engines(fd, query_engine);
>>
>> Do a two stage query here - first ask for required buffer size, then
>> allocate it and query:
>>
>> size = query_engines(fd, NULL, 0);
>> query_engine = malloc(size);
>> query_engines(fd, query_engine, size);
>
> As above :) this is how I did it at the beginning and with Chris
> we agreed that 64 is big enough, and then we had some review
> rounds on agreeing on buffer size, name, calculation and so
> on.
>
> Shall I move back? Honestly I like the double call, as well.
Leave it at least for now then.
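For reference, building on the size probe sketched earlier, the two-stage 
wrapper I had in mind would look roughly like this (a sketch only, not 
what the patch currently does):

static int query_engines(int fd, struct drm_i915_query_engine_info *info,
			 int length)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_ENGINE_INFO,
		.length = length,
		.data_ptr = to_user_pointer(info),
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = to_user_pointer(&item),
	};

	gem_query(fd, &query);
	igt_assert(item.length > 0);

	return item.length;
}

Callers would then probe with a NULL buffer and zero length, calloc() the 
reported size and query again, so no fixed engine count ever enters the 
picture. But as said, no need to change it now.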
>>> +
>>> + igt_assert(query_engine->num_engines < I915_EXEC_RING_MASK + 1);
>>
>> Remove this since we don't need to limit the size of the array during the
>> query. It is only the engine map that has the limitation.
>
> This came after one of the 10 rounds of review as well :)
> It's indeed a bit redundant; it's mainly there for the malloc,
> to avoid a crazy allocation in case something goes wrong.
>
>>> +
>>> + intel_active_engines2 = malloc((query_engine->num_engines + 1) *
>>> + sizeof(*intel_active_engines2));
>>> + igt_assert(intel_active_engines2);
>>> +
>>> + for (i = 0; i < query_engine->num_engines; i++) {
>>> + char *name;
>>> + int class = query_engine->engines[i].engine_class;
>>> + int instance = query_engine->engines[i].engine_instance;
>>
>> Could use __u16 to match uapi.
>
> Sure!
>
>>> +
>>> + intel_active_engines2[i].class = class;
>>> + intel_active_engines2[i].instance = instance;
>>> +
>>> + /* if we don't recognise the class, then we mark it as "unk" */
>>> + if (class >= ARRAY_SIZE(class_names) || !class_names[class])
>>
>> Second condition seems to be not needed at the moment.
>
> Yes, of course, it must be a leftover.
>
>>> +struct intel_execution_engine2 *get_active_engines(int fd)
>>
>> Unexport this one?
>
> I thought someone might need it. I will make it static and rework
> it together with the bool gem_has_engine_topology(), as they would
> become very similar.
>
>>> +{
>>> + if (init_engine_list(fd)) {
>>> + igt_info("using pre-allocated engine list\n");
>>> + return NULL;
>>> + }
>>> +
>>> + return intel_active_engines2;
>>> +}
>>> +
>>> +struct intel_execution_engine2 *gem_set_context_get_engines(int fd,
>>> + uint32_t ctx_id)
>>
>> I find the name of this function really confusing (set get what?). Can I
>> suggest __gem_prepare_context_engines or something like that?
>
> The name means "set context and get engines", as it's a "get_"
> kind of function. But as it turns out, I have a very limited
> imagination for name choices. I will call it
> __gem_prepare_context_engines(), then.
>
>>> +{
>>> + struct intel_execution_engine2 *active_engines = get_active_engines(fd);
>>> +
>>> + if (!active_engines)
>>> + return intel_execution_engines2;
>>
>> So without engine discovery the function will not configure the engine map.
>> How will the code inside for_each_engine2 be able to work then?
>
> It's this part of the for_each_engine2 that would make it work:
>
> for_if (gem_has_engine_topology() || \
> gem_has_engine(fd, e2__->class, e2__->instance))
>
> We had quite a few discussions about backward compatibility, and
> the whole thing is designed to be backward compatible; otherwise
> I would have removed quite a bit of the code above.
Yes, I commented on that in the other reply.
>> I guess it's okay since we don't expect to see a kernel with ctx engine map
>> support and no engine discovery but it feels a bit unnatural.
>>
>> But I think it effectively means you should have igt_require_gem_engine_list
>> in here since it is otherwise not usable from the for_each_engine2 iterator.
>>
>>> +
>>> + set_ctx_param_engines(fd, ctx_id);
>>
>> You need to handle the use case where the context already has a map
>> configured instead of overriding it. Should be easy.
>
> Yes, this is relevant to your comment above, about the
> igt_require(<new_api>). This part is checked in
> 'get_active_engines(fd)', which returns NULL if we do not have the
> new uapi, in which case we return the old list of engines.
>
> Adding the igt_require() (assert?) would be redundant.
> Am I missing any use case?
AFAIR Chris wanted it to support the use case where he configures the 
engine map himself and then passes the context to for_each_engine2. Maybe 
he changed his mind?
It would be tricky to do, since then there is no intel_active_engines2 
array to walk. We could only define the iterator so that the pointer to 
the engine is sometimes not valid (must be checked) and add an index used 
for submission. Not sure; it depends on whether Chris still wants this 
feature.
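If that use case is still wanted, one possible approach would be to skip 
the override when the context already carries a map. A sketch, assuming 
that a get_param with a zero-sized value reports the size of any 
configured map in ctx_param.size (zero meaning no map is set):

static bool context_has_engine_map(int fd, uint32_t ctx_id)
{
	struct drm_i915_gem_context_param ctx_param = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_ENGINES,
	};

	gem_context_get_param(fd, &ctx_param);

	return ctx_param.size != 0;
}

set_ctx_param_engines() could then return early when that is true and 
leave the user-configured map alone. But again, it depends on whether the 
feature is still wanted.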
>>> +#ifndef GEM_QUERY_H
>>> +#define GEM_QUERY_H
>>
>> GEM_ENGINE_TOPOLOGY_H
>
> Yes :) This library went through extensive renaming, due to my lack
> of imagination :)
>
>>> -const struct intel_execution_engine2 intel_execution_engines2[] = {
>>> +struct intel_execution_engine2 intel_execution_engines2[] = {
>>
>> This one can't stay const?
>
> Yes, I hate this bit as well! I removed the const here because I
> wanted active_engines2 to point to execution_engine2.
>
> active_engines2 cannot be const, which means I cannot make it point
> to this one unless I remove the const from here.
>
> But, indeed, because I am now using the "get_engine...()" and
> because I am going to rework the 'bool has_engine_topology()'
> approach, I might not need to refer to this one any more.
Hm yes, the const hell.. see what's easiest then.
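One way out of it, if you want to keep the table const, is to make the 
active pointer a pointer-to-const and only write the dynamically built 
list through its own non-const pointer before publishing it. A rough 
sketch (the helper name is made up):

static const struct intel_execution_engine2 *intel_active_engines2;

static void publish_engine_list(unsigned int num_engines)
{
	struct intel_execution_engine2 *list;

	list = calloc(num_engines + 1, sizeof(*list));
	igt_assert(list);

	/* ... fill list[] from the query results ... */

	intel_active_engines2 = list; /* non-const converts to const fine */
}

The ripple effect is that anything returning the list would have to hand 
out a const pointer as well, which may or may not be worth it here.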
> What will happen, though, is that we would then need, at some
> point, to patch all the tests that refer to
> intel_execution_engines2.
I think we should keep the big changes to a minimum for this series at 
this stage and just try to make it work well enough for the media 
scalability requirement.
Regards,
Tvrtko