[PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
John Harrison
john.c.harrison at intel.com
Thu Dec 12 20:06:27 UTC 2024
On 12/12/2024 11:31, Souza, Jose wrote:
> On Thu, 2024-12-12 at 10:59 -0800, John Harrison wrote:
>> On 12/12/2024 10:17, Souza, Jose wrote:
>>> On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison at Intel.com wrote:
>>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>>
>>>> The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
>>>> is definitely not a GuC CT thing. So give it its own section heading.
>>>> The snapshot itself is really a capture of the submission backend's
>>>> internal state. Although all it currently prints out is the submission
>>>> contexts. So label it as 'Contexts'. If more general state is added
>>>> later then it could be changed to 'Submission backend' or some such.
>>>>
>>>> Further, everything from the GuC CT section onwards is GT specific but
>>>> there was no indication of which GT it was related to (and that is
>>>> impossible to work out from the other fields that are given). So add a
>>>> GT section heading. Also include the tile id of the GT, because that is
>>>> again significant information.
>>>>
>>>> Lastly, drop a couple of unnecessary line feeds within sections.
>>>>
>>>> v2: Add GT section heading, add tile id to device section.
>>>>
>>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>>> Reviewed-by: Julia Filipchuk <julia.filipchuk at intel.com>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_devcoredump.c | 5 +++++
>>>> drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
>>>> drivers/gpu/drm/xe/xe_device.c | 1 +
>>>> drivers/gpu/drm/xe/xe_guc_submit.c | 2 +-
>>>> drivers/gpu/drm/xe/xe_hw_engine.c | 1 -
>>>> 5 files changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
>>>> index d23719d5c2a3..2690f1d1cde4 100644
>>>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
>>>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
>>>> @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
>>>> drm_printf(&p, "Process: %s\n", ss->process_name);
>>>> xe_device_snapshot_print(xe, &p);
>>>>
>>>> + drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
>>>> + drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
>>>> +
>>>> drm_puts(&p, "\n**** GuC CT ****\n");
>>>> xe_guc_ct_snapshot_print(ss->ct, &p);
>>>> +
>>>> + drm_puts(&p, "\n**** Contexts ****\n");
>>>> xe_guc_exec_queue_snapshot_print(ss->ge, &p);
>>> This broke the Mesa parser!
>>> It can no longer parse the exec_queue context because it expected that to be in the '**** GuC CT ****' section.
>> Then the Mesa parser needs to be updated. That was clearly a bug - exec
>> queue contexts are absolutely not GuC CT data and should not be in the
>> GuC CT section.
> It doesn't matter whether it is a bug or not, it broke the parser.
> If this is not reverted we will have older Kernel versions that don't work with newer Mesa and newer Kernel versions that don't work with older Mesa.
Debug tools cannot count as UAPI that must never change.

The devcoredump contains much information that is essentially the
internals of the kernel. It is going to change. That is about the only
guarantee that we can make about it. And saying that we must
intentionally break the output of a developer-only debug feature in
order to support older Mesa is plain wrong. End users do not care about
debug tools. All user applications will still work perfectly.

We can start adding version numbers to the devcoredump format if we
really need to. But that was already shot down as a bad idea. It is
debug information and not UAPI. So version incompatibilities are
expected from time to time.
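
As an illustration only, here is a minimal sketch of how a userspace
consumer could cope with a heading change like this one: match on the
field name ("GuC ID:") and merely record whichever "**** ... ****"
section it was found in, rather than requiring a specific section. This
is not the actual Mesa parser and not part of the patch. The helper
names parse_heading() and find_guc_id(), the buffer sizes and the -1
"not found" return value are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from Mesa or the kernel): if 'line' looks like
 * "**** Name ****", copy "Name" into 'out' and return 1, else return 0.
 */
static int parse_heading(const char *line, char *out, size_t out_len)
{
	const char *start, *end;
	size_t len;

	if (strncmp(line, "**** ", 5) != 0)
		return 0;
	start = line + 5;
	end = strstr(start, " ****");
	if (!end)
		return 0;
	len = (size_t)(end - start);
	if (len >= out_len)
		len = out_len - 1;	/* truncate over-long headings */
	memcpy(out, start, len);
	out[len] = '\0';
	return 1;
}

/*
 * Return the value of the first "GuC ID:" field in 'dump', or -1 if there
 * is none. The heading it was found under is copied into 'section'. The
 * lookup is keyed on the field name only, so it does not care whether the
 * field sits under "GuC CT" or "Contexts".
 */
static int find_guc_id(const char *dump, char *section, size_t section_len)
{
	char current[64] = "";
	const char *line = dump;

	assert(section_len > 0);
	section[0] = '\0';

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);
		char buf[256];
		size_t n = len < sizeof(buf) - 1 ? len : sizeof(buf) - 1;
		const char *p = buf;

		/* Work on a NUL-terminated copy of the current line. */
		memcpy(buf, line, n);
		buf[n] = '\0';

		/* Skip the indentation used inside sections. */
		while (*p == ' ' || *p == '\t')
			p++;

		if (!parse_heading(p, current, sizeof(current)) &&
		    strncmp(p, "GuC ID:", 7) == 0) {
			snprintf(section, section_len, "%s", current);
			return atoi(p + 7);
		}

		line = eol ? eol + 1 : line + len;
	}

	return -1;
}
```

With a parser shaped like this, a dump that has "GuC ID: 2" under
"**** GuC CT ****" and a dump that has it under "**** Contexts ****"
both yield the ID, and the caller can still see which section it came
from if that matters.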
John.
>
>> John.
>>
>>>>
>>>> drm_puts(&p, "\n**** Job ****\n");
>>>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>> index 440d05d77a5a..3cc2f095fdfb 100644
>>>> --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>> +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>> @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
>>>> /* GuC snapshots */
>>>> /** @ct: GuC CT snapshot */
>>>> struct xe_guc_ct_snapshot *ct;
>>>> - /** @ge: Guc Engine snapshot */
>>>> +
>>>> + /** @ge: GuC Submission Engine snapshot */
>>>> struct xe_guc_submit_exec_queue_snapshot *ge;
>>>>
>>>> /** @hwe: HW Engine snapshot array */
>>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>>> index 09a7ad830e69..030cf703e970 100644
>>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>>> @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
>>>>
>>>> for_each_gt(gt, xe, id) {
>>>> drm_printf(p, "GT id: %u\n", id);
>>>> + drm_printf(p, "\tTile: %u\n", gt->tile->id);
>>>> drm_printf(p, "\tType: %s\n",
>>>> gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
>>>> drm_printf(p, "\tIP ver: %u.%u.%u\n",
>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>>>> index 0ac4a19ec9cc..8690df699170 100644
>>>> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>>>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>>>> @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
>>>> if (!snapshot)
>>>> return;
>>>>
>>>> - drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
>>>> + drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
>>>> drm_printf(p, "\tName: %s\n", snapshot->name);
>>>> drm_printf(p, "\tClass: %d\n", snapshot->class);
>>>> drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
>>>> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
>>>> index ea6d9ef7fab6..6c9c27304cdc 100644
>>>> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
>>>> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
>>>> @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
>>>> if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
>>>> drm_printf(p, "\tRCU_MODE: 0x%08x\n",
>>>> snapshot->reg.rcu_mode);
>>>> - drm_puts(p, "\n");
>>>> }
>>>>
>>>> /**