[PATCH] drm/amdgpu: keep job->vm in amdgpu_job_prepare_job
YuanShang Mao (River)
YuanShang.Mao at amd.com
Tue Jul 29 02:41:08 UTC 2025
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Alexander
> I didn't think we actually resubmitting jobs anymore. How are we ending up trying to resubmit in the first place?
It is not about resubmitting. amdgpu_vm_generation will check if the VM is valid for this job to use. For example, if a gfx job depends on an sdma job, which is already cancelled, then the gfx job should be skipped.
Perhaps the dependencies between tasks should be resolved by the drm instead of amdgpu.
uint64_t amdgpu_vm_generation(struct amdgpu_device *adev, struct amdgpu_vm *vm)
{
uint64_t result = (u64)atomic_read(&adev->vram_lost_counter) << 32;
if (!vm)
return result;
result += lower_32_bits(vm->generation);
/* Add one if the page tables will be re-generated on next CS */
if (drm_sched_entity_error(&vm->delayed))
++result;
return result;
}
Thanks
River
-----Original Message-----
From: Alex Deucher <alexdeucher at gmail.com>
Sent: Tuesday, July 29, 2025 2:10 AM
To: YuanShang Mao (River) <YuanShang.Mao at amd.com>
Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; amd-gfx at lists.freedesktop.org; cao, lin <lin.cao at amd.com>; Zhang, Tiantian (Celine) <Tiantian.Zhang at amd.com>
Subject: Re: [PATCH] drm/amdgpu: keep job->vm in amdgpu_job_prepare_job
On Mon, Jul 28, 2025 at 5:01 AM YuanShang Mao (River) <YuanShang.Mao at amd.com> wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Alexander
>
> Since Christian is on vacation. Could you help review the below patch?
> If set job->vm to null in amdgpu_job_prepare_job, the job which should be skipped in amdgpu_job_run will be submitted unexpectedly.
I think we can just remove the amdgpu_vm_generation() check in amdgpu_job_run(). I didn't think we actually resubmitting jobs anymore. How are we ending up trying to resubmit in the first place?
Alex
>
> Thanks
> River
>
>
> -----Original Message-----
> From: YuanShang Mao (River) <YuanShang.Mao at amd.com>
> Sent: Wednesday, July 23, 2025 5:06 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Koenig, Christian <Christian.Koenig at amd.com>; YuanShang Mao
> (River) <YuanShang.Mao at amd.com>
> Subject: [PATCH] drm/amdgpu: keep job->vm in amdgpu_job_prepare_job
>
> job->vm is used in function amdgpu_job_run to get the page
> table re-generation counter and decide whether the job should be skipped.
>
> Signed-off-by: YuanShang <YuanShang.Mao at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 45febdc2f349..18998343815e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -360,13 +360,6 @@ amdgpu_job_prepare_job(struct drm_sched_job *sched_job,
> dev_err(ring->adev->dev, "Error getting VM ID (%d)\n", r);
> goto error;
> }
> - /*
> - * The VM structure might be released after the VMID is
> - * assigned, we had multiple problems with people trying to use
> - * the VM pointer so better set it to NULL.
> - */
> - if (!fence)
> - job->vm = NULL;
> return fence;
> }
>
> --
> 2.25.1
>
More information about the amd-gfx
mailing list