[i-g-t 1/1] tests/intel/xe_exec_fault_mode: wait for the completion
Yang, Fei
fei.yang at intel.com
Fri Oct 25 05:15:08 UTC 2024
> On Thu, Oct 24, 2024 at 05:23:22PM -0700, fei.yang at intel.com wrote:
>> From: Fei Yang <fei.yang at intel.com>
>>
>> The execution on GPU is out of order, the completion of the last
>> submission doesn't mean all the jobs are completed. We need to make
>> sure all the jobs are completed before moving on to unbinding the
>> buffer, otherwise the test would run into CAT errors.
>>
>> Signed-off-by: Fei Yang <fei.yang at intel.com>
>> ---
>> tests/intel/xe_exec_fault_mode.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/tests/intel/xe_exec_fault_mode.c
>> b/tests/intel/xe_exec_fault_mode.c
>> index 9cc51b7d3..d416c773b 100644
>> --- a/tests/intel/xe_exec_fault_mode.c
>> +++ b/tests/intel/xe_exec_fault_mode.c
>> @@ -305,7 +305,15 @@ test_exec(int fd, struct drm_xe_engine_class_instance *eci,
>> }
>> }
>> if (!(flags & INVALID_FAULT)) {
>> - j = flags & INVALIDATE ? n_execs - 1 : 0;
>> + /*
>> + * For !RACE cases xe_wait_ufence has already been called in above
>> + * for-loop, we should only wait for the completion of the last
>> + * submission here. For RACE cases we need to wait for all submissions
>> + * to complete because the GuC scheduling can be out of order, the
>> + * completion of the last submission doesn't mean all submission are
>> + * completed.
>> + */
>> + j = (flags & INVALIDATE && !(flags & RACE)) ? n_execs - 1 : 0;
>
> This makes sense and xe_exec_threads has similar code to this in function test_compute_mode() and given that is threaded test much more likely to hit this race.
>
> I checked xe_exec_compute_mode and I believe we need a similar patch for that test too. While you are here can you fixup that test too?
Sure, I will give it a shot. The code looks a bit different, and there is
a 1 second sleep after the submissions, I guess that's probably why we are
not seeing a problem yet, escpecially on real hardware where 1 second is
sufficient for all submissions to complete.
> This patch LGTM though, with that:
> Reviewed-by: Matthew Brost <matthew.brost at intel.com>
Could you help merge this patch too? I don't have merge right.
>>
>> for (i = j; i < n_execs; i++) {
>> int64_t timeout = NSEC_PER_SEC;
>> --
>> 2.25.1
>>
More information about the igt-dev
mailing list