[PATCH i-g-t v7] tests/intel/gem_exec_capture: Fix many-* subtests

Kamil Konieczny kamil.konieczny at linux.intel.com
Wed Apr 10 16:38:20 UTC 2024


Hi Peter,
On 2024-04-10 at 07:40:30 +0200, Peter Senna Tschudin wrote:
> Currently trying to run `gem_exec_capture --run-subtest
> many-4K-incremental` or `gem_exec_capture --run-subtest many-4K-zero`
> will fail with:
> 
>  (gem_exec_capture:81999) i915/gem_engine_topology-CRITICAL: Test
>  assertion failure function gem_engine_properties_configure, file
>  ../lib/i915/gem_engine_topology.c:577:
>  (gem_exec_capture:81999) i915/gem_engine_topology-CRITICAL: Failed assertion: ret == 1
>  (gem_exec_capture:81999) i915/gem_engine_topology-CRITICAL: Last errno: 9, Bad file descriptor
>  (gem_exec_capture:81999) i915/gem_engine_topology-CRITICAL: error: -1 != 1
> 
> This problem happens inside the macro find_first_available_engine()
> when:
>  1. for_each_ctx_engine() allocates an struct intel_engine_data 'ed'
>     inside a for loop. The core of the issue is that ed only exists
>     inside the for loop. As soon as the for loop ends, ed is out of scope
>     and after it's lifetime.
>  2. intel_get_current_engine() sets '*e' to an address of ed. This is ok
>     while inside the for loop, and is undefined behavior after the for
>     loop ends.
>  3. configure_hangs() uses '*e' after the lifetime of 'ed' has ended
>     leading to undefined behavior
>  4. After the call to find_first_available_engine() __captureN() will
>     fail as it expects '*e' to be valid. This is also undefined
>     behavior.

That is good explanation.

> 
> This patch fixes the issue by:
>  1. Creating a 'tmpe' variable for code clarity by hinting that this is a
>     temporal variable.
>  1. Moving the call to configure_hangs() to inside the for loop, where
>     there are no issues with scope and lifetime of 'ed'.
>  2. Assigning 'e' to a value of 'saved' with 'e = &saved.engine;'. The
>     reason why this works is twofold: First 'saved' has the same content
>     as 'e' had when there were no variable scope and lifetime issues.
>     Second both 'e' and 'saved' are defined in many() meaning that they
>     share the same scope and lifetime.

Skip this second explanation points starting from 'Creating tmpe' as it can
be readed from patch, just write that using 'e' outside for_each_ctx_engine()
is incorrect and this patch fixes it.

> 
> Additionally, this patch:
>  - moves the 'igt_assert(tmpe);' to between the assignment and first use
>    of 'tmpe'.
>  - sets 'e = NULL;' and checks if this changes to decide to skip the
>    test.

And here just write that you changes assert into skip_on and why.

> 
>  v7: unifies the two patches from v6; creates 'tmpe' for code clarity;
>      sets 'e = NULL;' to decide to skip the the test if still NULL by
>      the end of 'find_first_available_engine()'; places
>      'igt_assert(tmpe);' between assignment and first use of 'tmpe';
> 
> Signed-off-by: Peter Senna Tschudin <me at petersenna.com>
> ---
>  tests/intel/gem_exec_capture.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/intel/gem_exec_capture.c b/tests/intel/gem_exec_capture.c
> index 57b178f3e..b79268443 100644
> --- a/tests/intel/gem_exec_capture.c
> +++ b/tests/intel/gem_exec_capture.c
> @@ -662,13 +662,19 @@ static bool needs_recoverable_ctx(int fd)
>  
>  #define find_first_available_engine(fd, ctx, e, saved) \
>  	do { \
> +		struct intel_execution_engine2 *tmpe = NULL; \
> +		e = NULL; \

Add here newline to improve readability:
        \

With that added feel free to add my r-b.
I do not see your patch on patchwork, maybe resend as a new one?

Regards,
Kamil

>  		ctx = intel_ctx_create_all_physical(fd); \
>  		igt_assert(ctx); \
> -		for_each_ctx_engine(fd, ctx, e) \
> -			for_each_if(gem_class_can_store_dword(fd, e->class)) \
> +		for_each_ctx_engine(fd, ctx, tmpe) { \
> +			igt_assert(tmpe); \
> +			if(gem_class_can_store_dword(fd, tmpe->class)) { \
> +				saved = configure_hangs(fd, tmpe, ctx->id); \
> +				e = &saved.engine; \
>  				break; \
> -		igt_assert(e); \
> -		saved = configure_hangs(fd, e, ctx->id); \
> +			} \
> +		} \
> +		igt_skip_on_f(e == NULL, "no capable engine found\n"); \
>  	} while(0)
>  
>  static void many(int fd, int dir, uint64_t size, unsigned int flags)
> -- 
> 2.44.0
> 


More information about the igt-dev mailing list