[Intel-gfx] [PATCH v5 08/10] drm/i915/guc: Plumb GuC-capture into gpu_coredump
Teres Alexis, Alan Previn
alan.previn.teres.alexis at intel.com
Sat Feb 12 00:31:00 UTC 2022
Thanks Umesh for reviewing this for me and for the R-v-b.
Responding on one comment below
On Thu, 2022-02-10 at 18:11 -0800, Umesh Nerlige Ramappa wrote:
> On Wed, Jan 26, 2022 at 02:48:20AM -0800, Alan Previn wrote:
> > Add a flags parameter through all of the coredump creation
> > functions. Add a bitmask flag to indicate if the top
> > level gpu_coredump event is triggered in response to
> > a GuC context reset notification.
> > lgtm. In general the implementation differences between submission
> backends (guc vs execlists) need to be vfuncs so we can avoid having to
> use the flags and conditions. Regardless, with the above comments
> addressed, this is:
I do agree its much cleaner and consistent with other engine related
vfuncs that differentiate guc vs execlist however, in the case of the
gpu coredump operation, we can have cases where GuC submission is enabled
but the capture_engine function may have been triggered NOT by GuC.
Consider the following scenarios:
1. A context was reset by the GuC as per the context-level execution
quanta + preemption timeout... OR... a context failure caught
by HW and routed to GuC.
2. If the user had started i915 with modparam reset == 1 and hangcheck
enabled, but the context was executed with an infinite execution
quanta, then heartbeat could trigger a full GT reset and capture
all engine states despite having GuC submission and in this case,
we do want manual HW register dumps by i915 and not rely on GuC.
That said, for now, with the existing gpu_coredump framework that,
idea may not work.
> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
More information about the Intel-gfx