[PATCH v2] drm/msm: Update global fault counter when faulty process has already ended
Rob Clark
rob.clark at oss.qualcomm.com
Sun Aug 17 14:43:45 UTC 2025
The patch is in msm-fixes, but I was out last week, so I haven't had a
chance to send a PR yet.
thx
BR,
-R
On Fri, Aug 15, 2025 at 1:10 PM Maíra Canal <mcanal at igalia.com> wrote:
>
> Hi,
>
> Gentle ping on this patch.
>
> Best Regards,
> - Maíra
>
> On 7/20/25 18:42, Maíra Canal wrote:
> > The global fault counter has not been used since commit 12578c075f89
> > ("drm/msm/gpu: Skip retired submits in recover worker"). However, it's
> > still needed, as we need to handle cases where a GPU fault occurs after
> > the faulting process has already ended.
> >
> > Hence, increment the global fault counter when the submitting process
> > had already ended. This way, the number of faults returned by
> > MSM_PARAM_FAULTS will stay consistent.
> >
> > While here, s/unusuable/unusable.
> >
> > Fixes: 12578c075f89 ("drm/msm/gpu: Skip retired submits in recover worker")
> > Signed-off-by: Maíra Canal <mcanal at igalia.com>
> > ---
> >
> > v1 -> v2: https://lore.kernel.org/dri-devel/20250714230813.46279-1-mcanal@igalia.com/T/
> >
> > * Don't delete the global fault counter, but instead increment it when we get
> > a fault after the faulting process has ended (Rob Clark)
> > * Rewrite the commit message based on the changes.
> >
> > drivers/gpu/drm/msm/msm_gpu.c | 11 ++++++++---
> > 1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index c317b25a8162..416d47185ef0 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -465,6 +465,7 @@ static void recover_worker(struct kthread_work *work)
> > struct msm_gem_submit *submit;
> > struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu);
> > char *comm = NULL, *cmd = NULL;
> > + struct task_struct *task;
> > int i;
> >
> > mutex_lock(&gpu->lock);
> > @@ -482,16 +483,20 @@ static void recover_worker(struct kthread_work *work)
> >
> > /* Increment the fault counts */
> > submit->queue->faults++;
> > - if (submit->vm) {
> > +
> > + task = get_pid_task(submit->pid, PIDTYPE_PID);
> > + if (!task)
> > + gpu->global_faults++;
> > + else {
> > struct msm_gem_vm *vm = to_msm_vm(submit->vm);
> >
> > vm->faults++;
> >
> > /*
> > * If userspace has opted-in to VM_BIND (and therefore userspace
> > - * management of the VM), faults mark the VM as unusuable. This
> > + * management of the VM), faults mark the VM as unusable. This
> > * matches vulkan expectations (vulkan is the main target for
> > - * VM_BIND)
> > + * VM_BIND).
> > */
> > if (!vm->managed)
> > msm_gem_vm_unusable(submit->vm);
>
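
For anyone reading along, the accounting described in the quoted commit
message can be sketched in plain user-space C. This is only a rough model
under stated assumptions, not the driver code: every model_* name is
hypothetical, the task_alive flag stands in for get_pid_task() returning a
non-NULL task, and task reference handling is not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct model_vm {
	unsigned int faults;
	bool managed;	/* false models a VM_BIND (userspace-managed) VM */
	bool unusable;
};

struct model_queue {
	unsigned int faults;
};

struct model_gpu {
	unsigned int global_faults;
};

struct model_submit {
	struct model_queue *queue;
	struct model_vm *vm;
	bool task_alive;	/* assumption: stands in for the pid lookup result */
};

/* Follows the order of the fault accounting shown in the quoted diff. */
static void model_count_fault(struct model_gpu *gpu,
			      struct model_submit *submit)
{
	assert(gpu && submit && submit->queue);

	/* The per-queue counter is incremented in every case. */
	submit->queue->faults++;

	if (!submit->task_alive) {
		/* The submitting process is gone: count the fault globally. */
		gpu->global_faults++;
		return;
	}

	/* The process still exists: count the fault against its VM. */
	assert(submit->vm);
	submit->vm->faults++;

	/* A fault marks a userspace-managed VM as unusable. */
	if (!submit->vm->managed)
		submit->vm->unusable = true;
}
```

In this model, a fault whose submitter has already exited bumps the queue
counter and the global counter but leaves the VM untouched, while a fault
from a live submitter bumps the queue and VM counters and leaves the global
counter alone.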