[Intel-xe] XE sched review

Dani Liberman dliberman at habana.ai
Wed Nov 1 15:33:57 UTC 2023


Hi,

As part of the upstream review of the Xe driver, I reviewed the XE code which relates to the DRM scheduler. Here are the issues/questions:

guc exec queue states:
1. Please elaborate on the states as they are very fundamental part of the guc submission (banned, destroyed, etc..).
2. guc state is an atomic variable but there is no locking in the sched related code and the state can be changed right after getting it. Why not change it to non atomic variable?

guc_exec_queue_run_job():
1. if (!exec_queue_killed_or_banned(q) && !xe_sched_job_is_error(job)): in case this condition is false, and the job is not submitted. Why still returning fence instead of NULL?
2. I think that register_engine() shouldn't be part of run_job:
   - It is part of the init procedure so I think it is more appropriate to put it in guc_exec_queue_init().
   - Not sure if it's possible but the guc status updates asynchronously and if exec_queue got unregistered (after g2h XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE) while in the middle of run_job, we might register it again.
3. LR: AFAIU LR exec_queue will use the drm sched only once and we don't care about the result (from drm POV). Maybe in LR mode, it's better to move the submit_exec_queue() right after emit_job() in xe_exec_ioctl() and
       not using drm sched at all.

guc_exec_queue_start(): using deprecated function drm_sched_resubmit_jobs().

Thanks,
Dani
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-xe/attachments/20231101/9490a282/attachment-0001.htm>


More information about the Intel-xe mailing list