[PATCH 1/1] drm/xe: Only use reserved BCS instances for usm migrate exec queue

Welty, Brian brian.welty at intel.com
Tue May 14 01:24:23 UTC 2024



On 5/8/2024 11:18 AM, Welty, Brian wrote:
> 
> 
> On 5/7/2024 7:42 PM, Matthew Brost wrote:
>> On Tue, May 07, 2024 at 04:59:25PM -0700, Welty, Brian wrote:
>>>
>>>
>>> On 4/15/2024 12:04 PM, Matthew Brost wrote:
>>>> The GuC context scheduling queue is 2 entires deep, thus it is possible

Minor nitpick.  Maybe this ping will help get it merged.
Typo above: 'entires' should be 'entries'.

I gave R-b earlier below.



>>>> for a migration job to be stuck behind a fault if migration exec queue
>>>> shares engines with user jobs. This can deadlock as the migrate exec
>>>> queue is required to service page faults. Avoid deadlock by only using
>>>> reserved BCS instances for usm migrate exec queue.
>>>
>>> So the underlying concept was always broken here?
>>>
>>
>> It seems to be broken.
>>
>>> With the mask of more than one engine, the virtual engine still won't
>>> always pick an idle engine?  HW may end up picking an engine and
>>
>> I thought the GuC would always pick the idle hardware engine, but that
>> doesn't appear to be the case (I can see a clear deadlock before this
>> patch, resolved after). Maybe the GuC considers an engine with 1 exec
>> queue on it to still be idle, so it doesn't pick the reserved engine?
>>
>> We probably should follow up with the GuC team on this, but for PVC as
>> an SDV, I think we should get this merged.
> 
> Makes sense.
> 
> Reviewed-by: Brian Welty <brian.welty at intel.com>
> 
> 
>>
>> Matt
>>
>>> mistaking it to have been idle?  Because the extra 2-entry queue depth
>>> is not being considered?
>>>
>>>
>>>>
>>>> Fixes: a043fbab7af5 ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC")
>>>> Cc: Matt Roper <matthew.d.roper at intel.com>
>>>> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura at intel.com>
>>>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_migrate.c | 14 +++++---------
>>>>    1 file changed, 5 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>>>> index 9f6e9b7f11c8..c37bb7dfcf1f 100644
>>>> --- a/drivers/gpu/drm/xe/xe_migrate.c
>>>> +++ b/drivers/gpu/drm/xe/xe_migrate.c
>>>> @@ -12,8 +12,6 @@
>>>>    #include <drm/ttm/ttm_tt.h>
>>>>    #include <drm/xe_drm.h>
>>>> -#include <generated/xe_wa_oob.h>
>>>> -
>>>>    #include "instructions/xe_mi_commands.h"
>>>>    #include "regs/xe_gpu_commands.h"
>>>>    #include "regs/xe_gtt_defs.h"
>>>> @@ -34,7 +32,6 @@
>>>>    #include "xe_sync.h"
>>>>    #include "xe_trace.h"
>>>>    #include "xe_vm.h"
>>>> -#include "xe_wa.h"
>>>>    /**
>>>>     * struct xe_migrate - migrate context.
>>>> @@ -300,10 +297,6 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
>>>>    }
>>>>    /*
>>>> - * Due to workaround 16017236439, odd instance hardware copy engines are
>>>> - * faster than even instance ones.
>>>> - * This function returns the mask involving all fast copy engines and the
>>>> - * reserved copy engine to be used as logical mask for migrate engine.
>>>>     * Including the reserved copy engine is required to avoid deadlocks due to
>>>>     * migrate jobs servicing the faults gets stuck behind the job that faulted.
>>>>     */
>>>> @@ -317,8 +310,7 @@ static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt)
>>>>            if (hwe->class != XE_ENGINE_CLASS_COPY)
>>>>                continue;
>>>> -        if (!XE_WA(gt, 16017236439) ||
>>>> -            xe_gt_is_usm_hwe(gt, hwe) || hwe->instance & 1)
>>>> +        if (xe_gt_is_usm_hwe(gt, hwe))
>>>>                logical_mask |= BIT(hwe->logical_instance);
>>>>        }
>>>> @@ -369,6 +361,10 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
>>>>            if (!hwe || !logical_mask)
>>>>                return ERR_PTR(-EINVAL);
>>>> +        /*
>>>> +         * XXX: Currently only reserving 1 (likely slow) BCS instance on
>>>> +         * PVC, may want to revisit if performance is needed.
>>>> +         */
>>>>            m->q = xe_exec_queue_create(xe, vm, logical_mask, 1, hwe,
>>>>                            EXEC_QUEUE_FLAG_KERNEL |
>>>>                            EXEC_QUEUE_FLAG_PERMANENT |
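For anyone following along, after this patch xe_migrate_usm_logical_mask()
reduces to roughly the following. This is a sketch reconstructed from the
hunk above; the for_each_hw_engine() iteration and the local declarations
are not visible in the diff and are assumed from context:

static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt)
{
	struct xe_hw_engine *hwe;	/* assumed local, not in the hunk */
	enum xe_hw_engine_id id;	/* assumed local, not in the hunk */
	u32 logical_mask = 0;

	/* Walk each hardware engine on the GT (iterator assumed from context) */
	for_each_hw_engine(hwe, gt, id) {
		if (hwe->class != XE_ENGINE_CLASS_COPY)
			continue;

		/*
		 * Only the reserved USM copy engine makes it into the mask
		 * now; the XE_WA(16017236439) / odd-instance terms are gone,
		 * so user jobs never share an engine with the migrate queue.
		 */
		if (xe_gt_is_usm_hwe(gt, hwe))
			logical_mask |= BIT(hwe->logical_instance);
	}

	return logical_mask;
}

The net effect is that the logical mask can only contain the reserved BCS
instance(s), so the usm migrate exec queue can never sit in the GuC's
2-entry scheduling queue behind a faulting user job.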

