[PATCH] drm/i915: Add debug print about hw config table size

Daniele Ceraolo Spurio daniele.ceraolospurio at intel.com
Tue Dec 24 19:10:04 UTC 2024




On 12/24/2024 10:13 AM, John Harrison wrote:
> On 12/23/2024 15:20, Daniele Ceraolo Spurio wrote:
>> On 12/20/2024 5:19 PM, John.C.Harrison at Intel.com wrote:
>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>
>>> Add debug info to help investigate a very rare bug:
>>>    https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13385
>>>
>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
>>> index b67a15f742762..868195c33f5b3 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
>>> @@ -7,6 +7,7 @@
>>>   #include "gt/intel_hwconfig.h"
>>>   #include "i915_drv.h"
>>>   #include "i915_memcpy.h"
>>> +#include "intel_guc_print.h"
>>>   
>>>   /*
>>>    * GuC has a blob containing hardware configuration information (HWConfig).
>>> @@ -42,6 +43,8 @@ static int __guc_action_get_hwconfig(struct intel_guc *guc,
>>>   	};
>>>   	int ret;
>>>   
>>> +	guc_dbg(guc, "Querying HW config table: size = %d, offset = 0x%08X\n",
>>> +		ggtt_size, ggtt_offset);
>>
>> This seems to result in a double-log where the first print has no 
>> useful information, e.g.:
>>
>> [drm:__guc_action_get_hwconfig [i915]] GT0: GUC: Querying HW config 
>> table: size = 0, offset = 0x00000000
>> [drm:__guc_action_get_hwconfig [i915]] GT0: GUC: Querying HW config 
>> table: size = 752, offset = 0x00D05000
>>
>> Given that only the second log is useful, IMO better to move the 
>> guc_dbg to guc_hwconfig_fill_buffer(), because the info needed for 
>> the second print is available there and it is only called once.
> I disagree that the first print has no useful information. It tells us 
> that a call is being made and these are the parameters. We do not know 
> what the failure is. It seems highly unlikely that the size changes 
> from one query to the next given that the table is a fixed entity. It is 
> much more likely to be a caching type issue with GuC reading data the 
> KMD did not write. If so, GuC could potentially read non-zero data for 
> the initial size query and complain that data is invalid.
>
> The intention is to report all calls with their parameters to try to 
> narrow down exactly what calls are not working.

But we don't need both prints to know which of the 2 calls has failed: 
if the error comes before we get the second print, then we know the 
failure was in the first call; otherwise it was in the second.
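
To illustrate the reasoning above, here is a minimal user-space sketch, not
driver code. All names in it (mock_get_hwconfig, mock_fill_buffer, run_query,
fail_on_call) are hypothetical, and the assumption that a zero-size call
returns the table size is only a simplification of the two-step query being
discussed. It shows that a single print placed between the two calls is
enough to tell which call failed.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the GuC: a fixed-size table plus a fault knob. */
enum { TABLE_SIZE = 752 };

static int fail_on_call;	/* 0 = never fail, 1 = size query, 2 = copy */
static int call_count;		/* number of mock calls made so far */
static int debug_prints;	/* number of debug lines emitted so far */

/* Mock action: size == 0 asks for the table size, otherwise copy the table. */
static int mock_get_hwconfig(unsigned char *buf, unsigned int size)
{
	assert(buf != NULL || size == 0);

	if (++call_count == fail_on_call)
		return -EIO;	/* injected failure */
	if (size == 0)
		return TABLE_SIZE;

	memset(buf, 0xA5, size);
	return (int)size;
}

/* Two-step query with one debug print, placed between the two calls. */
static int mock_fill_buffer(unsigned char *buf, unsigned int buf_len)
{
	int ret = mock_get_hwconfig(NULL, 0);

	if (ret < 0)
		return ret;	/* nothing printed yet: the first call failed */
	if ((unsigned int)ret > buf_len)
		return -ENOSPC;

	debug_prints++;
	printf("Querying HW config table: size = %d\n", ret);

	/* Any error from here on comes after the print: second call failed. */
	return mock_get_hwconfig(buf, (unsigned int)ret);
}

/* Reset the mock state and run one query with the chosen failure injected. */
static int run_query(int fail_call, unsigned char *buf, unsigned int len)
{
	fail_on_call = fail_call;
	call_count = 0;
	debug_prints = 0;
	return mock_fill_buffer(buf, len);
}
```

Calling run_query() with fail_call set to 1 returns an error with
debug_prints still at 0, while fail_call set to 2 returns an error with
debug_prints at 1, so the presence or absence of the single log line
identifies the failing call.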

Daniele

>
> John.
>
>
>>
>> Daniele
>>
>>>   	ret = intel_guc_send_mmio(guc, action, ARRAY_SIZE(action), NULL, 0);
>>>   	if (ret == -ENXIO)
>>>   		return -ENOENT;
>>
>