[Intel-gfx] [PATCH v2] drm/i915/pcode: Give the punit time to settle before fatally failing

Andi Shyti andi.shyti at linux.intel.com
Tue Feb 7 10:40:31 UTC 2023


Hi Andrzej,

> > During module load the punit might still be busy with its booting
> > routines. During this time we try to communicate with it but we
> > fail because we don't receive any feedback from it and we return
> > immediately with a -EINVAL fatal error.
> > 
> > At this point the driver load is "dramatically" aborted. The
> > following error message notifies us about it.
> > 
> >     i915 0000:4d:00.0: drm_WARN_ON_ONCE(timeout_base_ms > 3)
> > 
> > It would be enough to wait a little in order to give the punit
> > the chance to come up bright and shiny, ready to interact with
> > the driver.
> > 
> > Wait up to 10 seconds for the punit to settle and complete any
> > outstanding transactions upon module load. If it still fails try
> > again with a longer timeout, 180s, 3 minutes. If it still fails
> > then return -EPROBE_DEFER, in order to give the punit a second
> > chance.
> > 
> > Even if these timeouts might look long, we should consider that
> > the punit, depending on the platform, might need a long time to
> > complete its routines. Besides, we want to try everything possible
> > to move forward before deciding to abort the driver's load.
> > 
> > The issue has been reported in:
> > 
> >     https://gitlab.freedesktop.org/drm/intel/-/issues/7814
> > 
> > The changes in this patch apply only during boot. The common
> > transactions with the punit during the driver's normal
> > operation are not affected.
> > 
> > Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty at intel.com>
> > Co-developed-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Signed-off-by: Andi Shyti <andi.shyti at linux.intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> 
> With the improved commit message it looks OK to me. There is still the
> question of why it takes so long for the punit to become ready.

It's the hardware: some punit operations really do take that
long. There are some documents floating around that have all
these calculations.

Some devices require even more time and, after consulting with
hardware guys, Aravind had to increase the timeout to 6 minutes!

Boot routines should not take that long, hence the 20 seconds.
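
For reference, the two-stage wait from the commit message can be
sketched roughly as below. This is only a userspace illustration and
not the patch itself: struct fake_punit, fake_punit_wait() and
punit_init_wait() are hypothetical names, the clock is simulated, and
the timeouts are passed in as parameters rather than hard-coded.
EPROBE_DEFER is defined by hand because it is a kernel-internal errno
that userspace headers do not export.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal errno (include/linux/errno.h), absent in userspace. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for the punit, driven by a simulated clock. */
struct fake_punit {
	unsigned int ready_at_ms;	/* time at which the punit starts answering */
	unsigned int now_ms;		/* current simulated time */
};

/*
 * Hypothetical helper: wait up to timeout_ms for the punit to answer.
 * Returns 0 if it became ready in time, -ETIMEDOUT otherwise, and
 * advances the simulated clock by the time spent waiting.
 */
int fake_punit_wait(struct fake_punit *p, unsigned int timeout_ms)
{
	unsigned int deadline = p->now_ms + timeout_ms;

	if (p->ready_at_ms <= deadline) {
		if (p->now_ms < p->ready_at_ms)
			p->now_ms = p->ready_at_ms;
		return 0;
	}

	p->now_ms = deadline;
	return -ETIMEDOUT;
}

/*
 * Boot-time only: try a short wait first, then a long one, and ask
 * the driver core to retry the probe later if both of them expire.
 */
int punit_init_wait(struct fake_punit *p,
		    unsigned int first_ms, unsigned int second_ms)
{
	assert(first_ms <= second_ms);

	if (fake_punit_wait(p, first_ms) == 0)
		return 0;

	if (fake_punit_wait(p, second_ms) == 0)
		return 0;

	return -EPROBE_DEFER;
}
```

With a short first timeout and a 180000 ms second one, a punit that
answers during either wait lets the load continue, and only a punit
that stays silent through both waits ends in -EPROBE_DEFER instead of
a fatal error.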

> Anyway:
> Reviewed-by: Andrzej Hajda <andrzej.hajda at intel.com>

Thanks a lot for looking into this, Andrzej!

Andi

