[PATCH v3 2/2] drm/xe/devcoredump: Remove IS_ERR_OR_NULL check for kzalloc

Mon Feb 24 21:22:08 UTC 2025

Hi all,
I understand we may need refine the log print.
Could we do it with another patch and focus on the change in current patchset only?
Thanks.

Shuicheng

> -----Original Message-----
> From: Harrison, John C <john.c.harrison at intel.com>
> Sent: Thursday, February 20, 2025 5:36 PM
> To: De Marchi, Lucas <lucas.demarchi at intel.com>; Wajdeczko, Michal
> <Michal.Wajdeczko at intel.com>
> Cc: Lin, Shuicheng <shuicheng.lin at intel.com>; intel-xe at lists.freedesktop.org
> Subject: Re: [PATCH v3 2/2] drm/xe/devcoredump: Remove IS_ERR_OR_NULL
> check for kzalloc
> 
> On 2/20/2025 15:54, Lucas De Marchi wrote:
> > On Thu, Feb 20, 2025 at 05:29:56PM +0100, Michal Wajdeczko wrote:
> >> On 20.02.2025 01:17, Shuicheng Lin wrote:
> >>> kzalloc returns a valid pointer or NULL if the allocation fails.
> >>> It never returns an error pointer. It is better to check for NULL
> >>> directly.
> >>>
> >>> Signed-off-by: Shuicheng Lin <shuicheng.lin at intel.com>
> >>> Cc: John Harrison <John.C.Harrison at Intel.com>
> >>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> >>> ---
> >>>  drivers/gpu/drm/xe/xe_devcoredump.c | 4 ++--
> >>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c
> >>> b/drivers/gpu/drm/xe/xe_devcoredump.c
> >>> index 60d15e455017..81b9d9bb3f57 100644
> >>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> >>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> >>> @@ -426,8 +426,8 @@ void xe_print_blob_ascii85(struct drm_printer
> >>> *p, const char *prefix, char suffi
> >>>          drm_printf(p, "Offset not word aligned: %zu", offset);
> >>>
> >>>      line_buff = kzalloc(DMESG_MAX_LINE_LEN, GFP_KERNEL);
> >>> -    if (IS_ERR_OR_NULL(line_buff)) {
> >>> -        drm_printf(p, "Failed to allocate line buffer: %pe",
> >>> line_buff);
> >>> +    if (!line_buff) {
> >>> +        drm_printf(p, "Failed to allocate line buffer\n");
> >>
> >> btw, since this line will be included in the output, where one could
> >> expect ascii85 data, shouldn't we print that diagnostic message with
> >> some special prefix to make it clear there is nothing to parse? like
> >>
> >>     "# Failed to allocate internal data\n"
> >>
> >> also since caller may have already provided a prefix, shouldn't we
> >> also include it in this diagnostic message?
> >>
> >>     "%s%s# Failed to allocate internal data\n",
> >>     prefix ?: "",
> >>     prefix ? ": " : ""
> >
> > or stop printing and return an error. we are missing the `.error: ...`
> > already that is used in other places.
> >
> > $ git grep '\.error: ' -- drivers/gpu/drm/xe
> > drivers/gpu/drm/xe/xe_vm.c:             drm_printf(p, "[0].error:
> > %li\n", PTR_ERR(snap));
> > drivers/gpu/drm/xe/xe_vm.c:                     drm_printf(p,
> > "[%llx].error: %li\n", snap->snap[i].ofs,
> This is the place that should be printing an error. The whole point of this helper
> is that it wraps up all the blob output. However, do we need to distinguish
> between a non-capture-process error (e.g. bad VM object) versus an error in
> the capture itself (e.g. out of memory converting the binary data to a text
> string)?
> 
> Not sure what error routes there are in the VM capture? Are they things that
> are important to include in the devcoredump because they have significant
> meaning about what caused the hang? Or are the only possible errors related
> to the capture process itself - failing to allocate memory to store the capture or
> such?
> 
> If the only errors are capture related then yes, just change this line to print
> "[%prefix].error: %errno\n". But if there is use to distinguish between bad VM
> objects and failed captures, then maybe this one should be
> "[%prefix].capture_error: %errno\n" or something?
> 
> John.
> 
> 
> >
> > Lucas De Marchi
> >
> >
> >
> >
> >>
> >>>          return;
> >>>      }
> >>>
> >>