[PATCH] drm/i915: Add debug print about hw config table size
Daniele Ceraolo Spurio
daniele.ceraolospurio at intel.com
Tue Dec 24 19:10:04 UTC 2024
On 12/24/2024 10:13 AM, John Harrison wrote:
> On 12/23/2024 15:20, Daniele Ceraolo Spurio wrote:
>> On 12/20/2024 5:19 PM, John.C.Harrison at Intel.com wrote:
>>> From: John Harrison<John.C.Harrison at Intel.com>
>>>
>>> Add debug info to help investigate a very rare bug:
>>> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13385
>>>
>>> Signed-off-by: John Harrison<John.C.Harrison at Intel.com>
>>> ---
>>> drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
>>> index b67a15f742762..868195c33f5b3 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
>>> @@ -7,6 +7,7 @@
>>> #include "gt/intel_hwconfig.h"
>>> #include "i915_drv.h"
>>> #include "i915_memcpy.h"
>>> +#include "intel_guc_print.h"
>>>
>>> /*
>>> * GuC has a blob containing hardware configuration information (HWConfig).
>>> @@ -42,6 +43,8 @@ static int __guc_action_get_hwconfig(struct intel_guc *guc,
>>> };
>>> int ret;
>>>
>>> + guc_dbg(guc, "Querying HW config table: size = %d, offset = 0x%08X\n",
>>> + ggtt_size, ggtt_offset);
>>
>> This seems to result in a double-log where the first print has no
>> useful information, e.g.:
>>
>> [drm:__guc_action_get_hwconfig [i915]] GT0: GUC: Querying HW config
>> table: size = 0, offset = 0x00000000
>> [drm:__guc_action_get_hwconfig [i915]] GT0: GUC: Querying HW config
>> table: size = 752, offset = 0x00D05000
>>
>> Given that only the second log is useful, IMO better to move the
>> guc_dbg to guc_hwconfig_fill_buffer(), because the info needed for
>> the second print is available there and it is only called once.
> I disagree that the first print has no useful information. It tells us
> that a call is being made and these are the parameters. We do not know
> what the failure is. It seems highly unlikely that the size changes
> from one query to the next given that the table is a fixed entity. It is
> much more likely to be a caching type issue with GuC reading data the
> KMD did not write. If so, GuC could potentially read non-zero data for
> the initial size query and complain that data is invalid.
>
> The intention is to report all calls with their parameters to try to
> narrow down exactly what calls are not working.
But we don't need both prints to know which of the 2 calls has failed:
if the error comes before we get the second print, then we know the
failure was in the first call; otherwise it was in the second.
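
A minimal sketch of that reasoning, using hypothetical mock functions rather
than the real i915 code (mock_get_hwconfig, mock_query_hwconfig and the
fail-injection knob are all assumptions made for illustration; only the
message format and the 752 / 0x00D05000 values come from the log above).
The print sits between the two calls, so its presence or absence in the log
identifies the failing call:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical knob: which call (1 or 2) the mock should fail, 0 = none. */
static int mock_fail_on_call;
static int mock_call_count;

/* Hypothetical stand-in for the GuC action: (0, 0) is the size query. */
static int mock_get_hwconfig(uint32_t ggtt_offset, uint32_t ggtt_size)
{
	const uint32_t table_size = 752;

	if (++mock_call_count == mock_fail_on_call)
		return -EIO;
	if (ggtt_offset == 0 && ggtt_size == 0)
		return (int)table_size;
	return ggtt_size >= table_size ? (int)table_size : -EINVAL;
}

/* Two-step query with a single debug print placed before the second call. */
static int mock_query_hwconfig(uint32_t ggtt_offset, int *failed_call)
{
	int size, ret;

	mock_call_count = 0;
	*failed_call = 0;

	/* First call: size query. A failure here means no print is emitted. */
	size = mock_get_hwconfig(0, 0);
	if (size <= 0) {
		*failed_call = 1;
		return size < 0 ? size : -ENOENT;
	}

	/* Only reached if the first call succeeded. */
	printf("Querying HW config table: size = %d, offset = 0x%08X\n",
	       size, (unsigned int)ggtt_offset);

	/* Second call: fetch the table. A failure here follows the print. */
	ret = mock_get_hwconfig(ggtt_offset, (uint32_t)size);
	if (ret < 0) {
		*failed_call = 2;
		return ret;
	}

	return ret;
}
```

With mock_fail_on_call set to 1 the error shows up with no print before it;
with it set to 2 the print appears first and the error follows.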
Daniele
>
> John.
>
>
>>
>> Daniele
>>
>>> ret = intel_guc_send_mmio(guc, action, ARRAY_SIZE(action), NULL, 0);
>>> if (ret == -ENXIO)
>>> return -ENOENT;
>>
>