[Intel-xe] [PATCH 3/4] drm/xe/guc: Fix handling of GUC_HXG_TYPE_NO_RESPONSE_BUSY
Matthew Brost
matthew.brost at intel.com
Wed Nov 15 14:12:32 UTC 2023
On Wed, Nov 15, 2023 at 07:00:45PM +0100, Michal Wajdeczko wrote:
>
>
> On 15.11.2023 11:41, Matthew Brost wrote:
> > On Wed, Nov 15, 2023 at 12:37:47PM +0100, Michal Wajdeczko wrote:
> >> If GuC responds with the NO_RESPONSE_BUSY message, we extend
> >> our timeout while waiting for the actual response, but we wrongly
> >> assumed that the next message will be RESPONSE_SUCCESS, missing
> >> that we still can get RESPONSE_FAILURE.
> >>
> >> Change the condition for the expected message type, using only
> >> common bits from RESPONSE_SUCCESS and RESPONSE_FAILURE (as they
> >> differ, by ABI design, only by the last bit).
> >>
> >> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> >> ---
> >> drivers/gpu/drm/xe/xe_guc.c | 5 +++--
> >> 1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> >> index ddbe64eb0f3b..16c233f120c6 100644
> >> --- a/drivers/gpu/drm/xe/xe_guc.c
> >> +++ b/drivers/gpu/drm/xe/xe_guc.c
> >> @@ -671,9 +671,10 @@ int xe_guc_mmio_send_recv(struct xe_guc *guc, const u32 *request,
> >> header = xe_mmio_read32(gt, reply_reg);
> >> if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) ==
> >> GUC_HXG_TYPE_NO_RESPONSE_BUSY) {
> >> + u32 resp_bits = GUC_HXG_TYPE_RESPONSE_SUCCESS & GUC_HXG_TYPE_RESPONSE_FAILURE;
> >
> > I think this works but is kinda goofy. It only works because 0x7 & 0x6
> > are values. If these are defined as 0x6 & 0x1 this doesn't work.
>
> but values for SUCCESS/FAILURE (as said in the commit message) were
Didn't read the commit see that now.
> designed to work that way and the HXG spec is stable AFAIK
>
Can you make this a new define then? I had to check the values of
GUC_HXG_TYPE_RESPONSE_SUCCESS & GUC_HXG_TYPE_RESPONSE_FAILURE to
understand why this works. Or at minimum add a comment as just looking
at code it doesn't appear to work when indeed it does.
> >
> >> + u32 resp_mask = FIELD_PREP(GUC_HXG_MSG_0_TYPE, resp_bits);
> >>
> >> - ret = xe_mmio_wait32(gt, reply_reg, GUC_HXG_MSG_0_TYPE,
> >> - FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_RESPONSE_SUCCESS),
> >> + ret = xe_mmio_wait32(gt, reply_reg, resp_mask, resp_mask,
> >> 1000000, &header, false);
> >>
> >
> > Here now we'd need to check for a failure in the header field.
>
> checks for FAILURE/SUCCESS are done in "normal" path below, no need for
> extra checks here
>
Ah, yes. See that now.
> >
> > With all of this, I believe it is better just to wait for
> > GUC_HXG_TYPE_RESPONSE_SUCCESS and have xe_mmio_wait32 timeout on all
> > other returns. Getting a failure case is rare and we can live a timeout.
>
> but are writing production quality driver, no?
>
> and even for the "rare" cases there could be a need for different
> handling of timeout vs error response (as example take version handshake
> between VF and GuC - timeout means game over, while error might indicate
> to try with different ABI level)
>
Yea ok, makes a bit more sense now.
Matt
> >
> > Matt
> >
> >> if (unlikely(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, header) !=
> >> --
> >> 2.25.1
> >>
More information about the Intel-xe
mailing list