[PATCH v3 2/2] drm/xe/devcoredump: Remove IS_ERR_OR_NULL check for kzalloc

Fri Feb 21 01:36:19 UTC 2025

On 2/20/2025 15:54, Lucas De Marchi wrote:
> On Thu, Feb 20, 2025 at 05:29:56PM +0100, Michal Wajdeczko wrote:
>> On 20.02.2025 01:17, Shuicheng Lin wrote:
>>> kzalloc returns a valid pointer or NULL if the allocation fails.
>>> It never returns an error pointer. It is better to check for NULL directly.
>>>
>>> Signed-off-by: Shuicheng Lin <shuicheng.lin at intel.com>
>>> Cc: John Harrison <John.C.Harrison at Intel.com>
>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>> ---
>>>  drivers/gpu/drm/xe/xe_devcoredump.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
>>> index 60d15e455017..81b9d9bb3f57 100644
>>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
>>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
>>> @@ -426,8 +426,8 @@ void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix, char suffi
>>>          drm_printf(p, "Offset not word aligned: %zu", offset);
>>>
>>>      line_buff = kzalloc(DMESG_MAX_LINE_LEN, GFP_KERNEL);
>>> -    if (IS_ERR_OR_NULL(line_buff)) {
>>> -        drm_printf(p, "Failed to allocate line buffer: %pe", line_buff);
>>> +    if (!line_buff) {
>>> +        drm_printf(p, "Failed to allocate line buffer\n");
>>
>> btw, since this line will be included in the output, where one could
>> expect ascii85 data, shouldn't we print that diagnostic message with
>> some special prefix to make it clear there is nothing to parse? like
>>
>>     "# Failed to allocate internal data\n"
>>
>> also since caller may have already provided a prefix, shouldn't we also
>> include it in this diagnostic message?
>>
>>     "%s%s# Failed to allocate internal data\n",
>>     prefix ?: "",
>>     prefix ? ": " : ""
>
> or stop printing and return an error. we are missing the `.error: ...`
> already that is used in other places.
>
> $ git grep '\.error: ' -- drivers/gpu/drm/xe
> drivers/gpu/drm/xe/xe_vm.c:             drm_printf(p, "[0].error: %li\n", PTR_ERR(snap));
> drivers/gpu/drm/xe/xe_vm.c:                     drm_printf(p, "[%llx].error: %li\n", snap->snap[i].ofs,

This is the place that should be printing an error. The whole point of
this helper is that it wraps up all the blob output. However, do we need
to distinguish between a non-capture-process error (e.g. a bad VM object)
and an error in the capture itself (e.g. running out of memory while
converting the binary data to a text string)?

I am not sure what error paths there are in the VM capture. Are they
things that are important to include in the devcoredump because they
say something significant about what caused the hang? Or are the only
possible errors related to the capture process itself, such as failing
to allocate memory to store the capture?

If the only errors are capture-related then yes, just change this line
to print "[%prefix].error: %errno\n". But if it is useful to distinguish
between bad VM objects and failed captures, then maybe this one should
be "[%prefix].capture_error: %errno\n" or something similar?
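
For illustration only, the two naming options above can be sketched in
userspace C. This is not the xe driver code: format_blob_error() is a
hypothetical name, snprintf() stands in for drm_printf(), and the
"capture" flag is an assumed way of selecting between the ".error" and
".capture_error" keys. The NULL-prefix handling mirrors the
`prefix ?: ""` suggestion earlier in the thread.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the xe driver): format the
 * diagnostic line that would replace the blob data on failure.
 *
 * capture != 0 -> "[prefix].capture_error: <errno>"
 * capture == 0 -> "[prefix].error: <errno>"
 *
 * err is expected to be a negative errno value, as is usual in the
 * kernel (e.g. -12 for -ENOMEM on Linux).
 * Returns whatever snprintf() returns.
 */
static int format_blob_error(char *out, size_t len, const char *prefix,
			     int capture, int err)
{
	const char *key = capture ? "capture_error" : "error";

	/* A NULL prefix is printed as an empty string. */
	return snprintf(out, len, "[%s].%s: %d\n",
			prefix ? prefix : "", key, err);
}
```

With a distinct key, a decoder could tell "the object itself was bad"
apart from "the dump ran out of memory" without parsing free-form text.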

John.

>
> Lucas De Marchi
>
>>
>>>          return;
>>>      }
>>>
>>