[PATCH 5/5] drm/msm/dpu: rate limit snapshot capture for mmu faults
Rob Clark
robdclark at gmail.com
Mon Jul 15 19:51:49 UTC 2024
On Mon, Jul 1, 2024 at 12:43 PM Dmitry Baryshkov
<dmitry.baryshkov at linaro.org> wrote:
>
> On Fri, Jun 28, 2024 at 02:48:47PM GMT, Abhinav Kumar wrote:
> > There is no recovery mechanism in place yet to recover from mmu
> > faults for DPU. We can only prevent the faults by making sure there
> > is no misconfiguration.
> >
> > Rate-limit the snapshot capture for mmu faults to once per
> > msm_kms_init_aspace(), as that should be sufficient to capture
> > the snapshot for debugging. Otherwise a lot of dpu snapshots
> > get captured for the same fault, which is redundant and might
> > also affect capturing even one snapshot accurately.
>
> Please squash this into the first patch. There is no need to add code
> with a known deficiency.
>
> Also, is there a reason why you haven't used <linux/ratelimit.h> ?
So, in some ways devcoredump is ratelimited by userspace needing to
clear an existing devcore.
What I'd suggest would be more useful is to limit the devcores to once
per atomic update, i.e. if display state hasn't changed, an
additional devcore probably isn't useful.
BR,
-R
>
> >
> > Signed-off-by: Abhinav Kumar <quic_abhinavk at quicinc.com>
> > ---
> > drivers/gpu/drm/msm/msm_kms.c | 6 +++++-
> > drivers/gpu/drm/msm/msm_kms.h | 3 +++
> > 2 files changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
> > index d5d3117259cf..90a333920c01 100644
> > --- a/drivers/gpu/drm/msm/msm_kms.c
> > +++ b/drivers/gpu/drm/msm/msm_kms.c
> > @@ -168,7 +168,10 @@ static int msm_kms_fault_handler(void *arg, unsigned long iova, int flags, void
> > {
> > struct msm_kms *kms = arg;
> >
> > - msm_disp_snapshot_state(kms->dev);
> > + if (!kms->fault_snapshot_capture) {
> > + msm_disp_snapshot_state(kms->dev);
> > + kms->fault_snapshot_capture++;
>
> When is it decremented?
>
> > + }
> >
> > return -ENOSYS;
> > }
> > @@ -208,6 +211,7 @@ struct msm_gem_address_space *msm_kms_init_aspace(struct drm_device *dev)
> > mmu->funcs->destroy(mmu);
> > }
> >
> > + kms->fault_snapshot_capture = 0;
> > msm_mmu_set_fault_handler(aspace->mmu, kms, msm_kms_fault_handler);
> >
> > return aspace;
> > diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
> > index 1e0c54de3716..240b39e60828 100644
> > --- a/drivers/gpu/drm/msm/msm_kms.h
> > +++ b/drivers/gpu/drm/msm/msm_kms.h
> > @@ -134,6 +134,9 @@ struct msm_kms {
> > int irq;
> > bool irq_requested;
> >
> > + /* rate limit the snapshot capture to once per attach */
> > + int fault_snapshot_capture;
> > +
> > /* mapper-id used to request GEM buffer mapped for scanout: */
> > struct msm_gem_address_space *aspace;
> >
> > --
> > 2.44.0
> >
>
> --
> With best wishes
> Dmitry
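
The "once per atomic update" idea from the reply above could be modelled roughly as follows. This is a userspace sketch only, not driver code and not what the patch does: struct kms_model, commit_seqno, snapshot_seqno, model_atomic_commit() and model_fault() are hypothetical names made up for illustration, and the assumption is that some counter is bumped on every atomic commit so the fault handler can tell whether display state has changed since the last snapshot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant bits of struct msm_kms. */
struct kms_model {
	uint64_t commit_seqno;   /* bumped on every atomic commit (assumption) */
	uint64_t snapshot_seqno; /* commit_seqno at the time of the last capture */
	bool snapshot_valid;     /* false until the first capture */
	unsigned int snapshots;  /* number of captures, for illustration only */
};

/* Models the atomic commit path: display state has changed. */
static void model_atomic_commit(struct kms_model *kms)
{
	assert(kms);
	kms->commit_seqno++;
}

/*
 * Models the fault handler: capture at most once per display state.
 * Returns true if a snapshot was (notionally) captured.
 */
static bool model_fault(struct kms_model *kms)
{
	assert(kms);

	/* Same state as the last snapshot: another devcore adds nothing. */
	if (kms->snapshot_valid && kms->snapshot_seqno == kms->commit_seqno)
		return false;

	kms->snapshot_seqno = kms->commit_seqno;
	kms->snapshot_valid = true;
	kms->snapshots++; /* the driver would call msm_disp_snapshot_state() here */
	return true;
}
```

With this model, repeated faults between two commits yield a single capture, and the first fault after the next commit yields one more. Unlike the counter in the patch, nothing has to be reset or decremented. A real fault handler can race with the commit path, so actual driver code would presumably need atomics or locking around these fields, which the sketch leaves out.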