[PATCH] drm/xe/guc: Add more GuC load error status codes
John Harrison
john.c.harrison at intel.com
Sat Jul 26 00:19:37 UTC 2025
On 7/25/2025 12:29 PM, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of John.C.Harrison at Intel.com
> Sent: Friday, July 25, 2025 12:12 PM
> To: Intel-Xe at Lists.FreeDesktop.Org
> Cc: Harrison, John C <john.c.harrison at intel.com>
> Subject: [PATCH] drm/xe/guc: Add more GuC load error status codes
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> The GuC load process will abort if certain status codes (which are
>> indicative of a fatal error) are reported. Otherwise, it keeps waiting
>> until the 'success' code is returned. New error codes have been added
>> in recent GuC releases, so add support for aborting on those as well.
>>
>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>> ---
>> drivers/gpu/drm/xe/abi/guc_errors_abi.h | 3 +++
>> drivers/gpu/drm/xe/xe_guc.c | 11 +++++++++++
>> 2 files changed, 14 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/abi/guc_errors_abi.h b/drivers/gpu/drm/xe/abi/guc_errors_abi.h
>> index ecf748fd87df..ad76b4baf42e 100644
>> --- a/drivers/gpu/drm/xe/abi/guc_errors_abi.h
>> +++ b/drivers/gpu/drm/xe/abi/guc_errors_abi.h
>> @@ -63,6 +63,7 @@ enum xe_guc_load_status {
>> XE_GUC_LOAD_STATUS_HWCONFIG_START = 0x05,
>> XE_GUC_LOAD_STATUS_HWCONFIG_DONE = 0x06,
>> XE_GUC_LOAD_STATUS_HWCONFIG_ERROR = 0x07,
>> + XE_GUC_LOAD_STATUS_BOOTROM_VERSION_MISMATCH = 0x08,
>> XE_GUC_LOAD_STATUS_GDT_DONE = 0x10,
>> XE_GUC_LOAD_STATUS_IDT_DONE = 0x20,
>> XE_GUC_LOAD_STATUS_LAPIC_DONE = 0x30,
>> @@ -75,6 +76,8 @@ enum xe_guc_load_status {
>> XE_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
>> XE_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
>> XE_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID = 0x74,
>> + XE_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR = 0x75,
>> + XE_GUC_LOAD_STATUS_INVALID_FTR_FLAG = 0x76,
>> XE_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
>>
>> XE_GUC_LOAD_STATUS_READY = 0xF0,
>> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>> index 1ca7f4f27e26..315ef74bb5ff 100644
>> --- a/drivers/gpu/drm/xe/xe_guc.c
>> +++ b/drivers/gpu/drm/xe/xe_guc.c
>> @@ -992,11 +992,14 @@ static int guc_load_done(u32 status)
>> case XE_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH:
>> case XE_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE:
>> case XE_GUC_LOAD_STATUS_HWCONFIG_ERROR:
>> + case XE_GUC_LOAD_STATUS_BOOTROM_VERSION_MISMATCH:
>> case XE_GUC_LOAD_STATUS_DPC_ERROR:
>> case XE_GUC_LOAD_STATUS_EXCEPTION:
>> case XE_GUC_LOAD_STATUS_INIT_DATA_INVALID:
>> case XE_GUC_LOAD_STATUS_MPU_DATA_INVALID:
>> case XE_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID:
>> + case XE_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR:
>> + case XE_GUC_LOAD_STATUS_INVALID_FTR_FLAG:
>> return -1;
>> }
>>
>> @@ -1146,6 +1149,14 @@ static void guc_wait_ucode(struct xe_guc *guc)
>> xe_gt_err(gt, "illegal register in save/restore workaround list\n");
>> break;
>>
>> + case XE_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR:
>> + xe_gt_err(gt, "illegal workaround KLV data\n");
>> + break;
>> +
>> + case XE_GUC_LOAD_STATUS_INVALID_FTR_FLAG:
>> + xe_gt_err(gt, "illegal feature flag specified\n");
>> + break;
>> +
> Given that XE_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR and
> XE_GUC_LOAD_STATUS_INVALID_FTR_FLAG are indexed after
> XE_GUC_LOAD_STATUS_HWCONFIG_START in the xe_guc_load_status enum list,
> they should also probably be placed after XE_GUC_LOAD_STATUS_HWCONFIG_START
> in this case switch branch, but I'm not going to block on it because the ordering
> probably isn't that important.
Actually in terms of enum value, HWCONFIG_START is before all the other
existing cases as well. The ordering of cases within a switch is
entirely down to coder preference - whatever makes the code more
readable/efficient. E.g. grouping logical values, allowing values to
fall into subsequent cases, etc. even though their absolute values might
be wildly scattered. However, I just realised the basic 'invalid init
data' code is missing an explanation too. So, I can move the hwconfig to
the start in the respin easily enough.
John.
>
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt
>
>> case XE_GUC_LOAD_STATUS_HWCONFIG_START:
>> xe_gt_err(gt, "still extracting hwconfig table.\n");
>> break;
>> --
>> 2.49.0
>>
>>
More information about the Intel-xe
mailing list