[Intel-gfx] [PATCH v5 01/10] drm/i915/guc: Update GuC ADS size for error capture lists

Lucas De Marchi lucas.demarchi at intel.com
Thu Feb 3 21:40:48 UTC 2022


On Thu, Feb 03, 2022 at 12:40:54PM -0800, Teres Alexis, Alan Previn wrote:
>Apologies, please ignore last reply (wrestling with my VNC). Proper reply:
>
>On Thu, 2022-02-03 at 12:39 -0800, Alan Previn Teres Alexis wrote:
>>
>>
>> On Thu, 2022-02-03 at 12:04 -0800, Lucas De Marchi wrote:
>> > On Thu, Feb 03, 2022 at 11:03:24AM -0800, Matthew Brost wrote:
>> > > On Wed, Jan 26, 2022 at 02:46:19PM -0800, Lucas De Marchi wrote:
>> > > > On Wed, Jan 26, 2022 at 02:48:13AM -0800, Alan Previn wrote:
>> > > > > Update GuC ADS size allocation to include space for
>> > > > > the lists of error state capture register descriptors.
>> > > > >
>> > > > > Also, populate the lists of registers we want GuC to report back to
>> > > > > Host on engine reset events. This list should include global,
>> > > > > engine-class and engine-instance registers for every engine-class
>> > > > > type on the current hardware.
>> > > > >
>> > > > > NOTE: Start with a sample table of register lists to layout the
>> > > > > framework before adding real registers in subsequent patch.
>> > > > >
>> > > > > Signed-off-by: Alan Previn <alan.previn.teres.alexis at intel.com>
>> > > > > ---
>> > > >
>> > > > ...
>> > > >
>> > > > > static void __guc_ads_init(struct intel_guc *guc)
>> > > > > {
>> > > > >       struct intel_gt *gt = guc_to_gt(guc);
>> > > > > @@ -573,9 +553,9 @@ static void __guc_ads_init(struct intel_guc *guc)
>> > > > >
>> > > > >       base = intel_guc_ggtt_offset(guc, guc->ads_vma);
>> > > > >
>> > > > > -     /* Capture list for hang debug */
>> > > > > -     guc_capture_list_init(guc, blob);
>> > > > > -
>> > > > > +     /* Lists for error capture debug */
>> > > > > +     intel_guc_capture_prep_lists(guc, (struct guc_ads *)blob, base,
>> > > >
>> > > > no, please don't cast/export struct guc_ads like this. We can't really
>> > > > dereference it since it may be in IO memory.
>> > > >
>> > > > See https://patchwork.freedesktop.org/series/99378/ with the huge
>> > > > refactor in this file to make it conform to the rules of accessing IO
>> > > > memory.
>> > > >
>> > > > Maybe this list could be appended in the same reglist buffer and we just
>> > > > copy it once to its final location, like we are doing with the reglist?
>> > > >
>> > >
>> > > Agree with Lucas here, let's create the capture list on probe and store
>> > > it somewhere in the device data. On ADS population this more or less
>> > > turns into a memcpy (or after Lucas's series a dma-buf-map call). And on
>> > > fini, just free to stored data. The create capture list IMO is fine to
>> > > be done in an external file and return the data to the ADS code but
>> > > let's make sure everyone is on board with that. Maybe ping Lucas
>> > > directly about this?
>> >
>> > yeah, sounds good to me. My worry is exporting ADS struct layout to
>> > other compilation units. Asking for the entire ads blob
>> > (or what would be dma_buf_map in my patches, or iosys_map in the new
>> > version I will send soon) could quickly spread too much knowledge about it to
>> > the rest of the driver.
>> >
>>
>
>I'm in partial disagreement with above. Based on above statement, are we enforcing
>that we must always continue to only have ADS know the 2nd level blobl structure layout?

no

>Doesn't that also force that intelligence of knowing its substructure contents to
>always be in ADS only? So ADS C file continues to grow larger and larger with completely
>orthogonal domain specific knowledge? (golden context: engine info,
>error-capture: debug registers, etc..).

no, see below

>I believe ADS should really let the substructure headers be accessible to external
>modules to deal with the parsing, size determination and preparing its contents.
>
>NOTE: see my next comment specifically regarding the pointer access.
>
>
>> I think we should at most export the speficic offset inside the buffer
>>
>> > another compilation unit can fill out. In that case with the
>> > iosys_buf API we would return a shallow copy of our guc->ads_map +
>> > offset.
>> >
>> > And the other alternative would be as you suggested: save the list in a
>> > temporary buffer and when needed call intel_guc_ads() to populate that
>> > into ads in the right place.
>> >
>>
>Alan: The first part of above is already what is happening in my series today,
>we have a seperate register list  in the intel_guc_capture module

no, what you have in this patch I commented on is:

>> > > > > +     /* Lists for error capture debug */
>> > > > > +     intel_guc_capture_prep_lists(guc, (struct guc_ads *)blob, base,

you are passing the complete ads blob outside. I'd even try to avoid passing 
the second level struct depending on the case, but that would be
much more acceptable.



>that also determines (based on device and fusing) which registers
>to include or exclude. There is are static global lists and
>per-gt-allocated lists (based on fuse masks). The latter
>is not in current rev but already commented as planned change.
>
>So in response to the original review comment, I can change this
>patch so that the ADS gets a regular heap-kzalloc-allocated pointer and
>size from the error-capture module and ADS do the copying into the
>io-mem ptr but i want to ensure the layout of the error-capture
>lists are not private to ADS only.
>
>Are we okay with that?

I think that is fine.  Either of these 2 solutions would actually be
fine with me. Note that in my patch series I use both depending on the
case. The reglist uses a temporary buffer allocated on gt init, while
all the remaining read/write go directlry to the ads_map. There is even the
intel_guc_engine_usage_record_map() that exports the second level struct
and intel_guc_submission.c reads that out (see
https://patchwork.freedesktop.org/patch/471052/?series=99378&rev=1)
When doing the patch I was tempted to change that with something like

	intel_guc_ads_get_engine_usage_record(engine, &last_switch, &ctx_id, &total)

but decided that it would not be future-proof.

Lucas De Marchi

>
>
>>
>> > The integration with iosys-map I can figure out if my patch series lands
>> > after this one, or I can help rebasing this otherwise. But IMO we
>> > should not pass the plain blob pointer around regardless of iosys-map.
>> >
>> >
>> > thanks
>> > Lucas De Marchi
>> >
>> > > Specific patch Lucas's is referencing:
>> > > https://patchwork.freedesktop.org/patch/471051/?series=99378&rev=1
>> > >
>> > > Matt
>> > >
>> > > > Lucas De Marchi
>


More information about the Intel-gfx mailing list