[PATCH 1/2] drm/i915/selftests: Fix engine reset count storage for multi-tile
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Dec 7 13:45:56 UTC 2023
On 07/12/2023 11:46, Andi Shyti wrote:
> On Thu, Dec 07, 2023 at 11:43:28AM +0000, Tvrtko Ursulin wrote:
>>
>> On 07/12/2023 11:26, Andi Shyti wrote:
>>> Hi Tvrtko,
>>>
>>>> Engine->id namespace is per-tile so struct igt_live_test->reset_engine[]
>>>> needs to be two-dimensional so engine reset counts from all tiles can be
>>>> stored with no aliasing. With aliasing, if we had a real multi-tile
>>>> platform, the reset counts would be incorrect for same engine instance on
>>>> different tiles.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>> Fixes: 0c29efa23f5c ("drm/i915/selftests: Consider multi-gt instead of to_gt()")
>>>> Reported-by: Alan Previn Teres Alexis <alan.previn.teres.alexis at intel.com>
>>>> Cc: Tejas Upadhyay <tejas.upadhyay at intel.com>
>>>> Cc: Andi Shyti <andi.shyti at linux.intel.com>
>>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>>
>>> sorry for being late here... the patch makes sense to me and the
>>> CI failures don't look related.
>>>
>>> Reviewed-by: Andi Shyti <andi.shyti at linux.intel.com>
>>
>> Thanks pushed!
>>
>> There is more work to be done with the fact i915_reset_engine_count has it's
>> own aliasing when used like this, but I opted to leave that for some other
>> time.
>
> feel free to share if you have some preparatory work done already
> and I can try to help out. Otherwise I can take a look at it, as
> well.
I don't have any patches I was just noticed when doing this that even
though i915_reset_engine_count takes the engine as parameter, the
i915->gpu_error is a single gt construct and as such I think using
i915_reset_engine_count from per gt selftests is a mismatch.
I thought options were to add engine reset counts in the engine itself
and use that from selftests. Leaving i915_reset_engine_count to be used
from error capture paths. And it probably needs to be renamed
accordingly so it is not misleading.
But then there may be issues around virtual engines though which this
helper conveniently and quietly side stepped.
At that point I stopped thinking about it, given how real multi-tile for
i915 is not happening, I didn't see it worth the effort. Still the sour
taste of a mess remains so if you can think of an elegant and relatively
cheap solution I think it would be good to tidy.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list