[Intel-xe] [RFC] drm/xe: Introduce xe_ASSERT macros

Michal Wajdeczko michal.wajdeczko at intel.com
Fri Jul 28 20:33:29 UTC 2023



On 28.07.2023 21:47, Vivi, Rodrigo wrote:
> On Fri, 2023-07-28 at 21:25 +0200, Michal Wajdeczko wrote:
>> As we are moving away from the controversial XE_BUG_ON macro,
>> relying just on WARN_ON or drm_err does not cover the cases
>> where we want to annotate functions with additional detailed
>> debug checks to assert that all prerequisites are satisfied,
>> without paying footprint or performance penalty on non-debug
>> builds, where all misuses introduced during code integration
>> were already fixed.
>>
>> Introduce family of xe_ASSERT macros that try to follow classic
>> assert() utility and can be compiled out on non-debug builds.
> 
> This is indeed much better and clear then the XE_BUG_ON.
> 
> Also, let me be clear that my intention with reply is not to
> block the introduction of this assert macro.

thanks

> 
> But I'd like to raise my concern(s?) with such assert macros:
> 
> In many cases that we currently use these macros are actually
> very useful on production build. Whenever we receive bug reports
> we should be able to ask the reporter to get a dmesg or enable
> a drm.debug and get a message and we could see these, without
> ever having to ask the report to rebuild the kernel with a debug
> config.

you should think of xe_assert() as extension to static_assert() but
aimed to be executed during internal CI builds only, when we run basic
integration tests, and were we are able to quickly catch all misuses,
caused by rapid or multiple changes going in.

> 
> Then, we add this in the code and we start using this everywhere
> without stopping to think if that would be good in a report and
> we end up with a code full of asserts and big problem to get the
> useful information from the bug reports.

those asserts shouldn't fire beyond integration builds - if someone uses
assert() to tag unfinished work, then it's simply wrong, in such cases
true WARN and correct fallback shall be implemented - and that's the
role of the reviewers to spot it ;)

> 
> Aside from the fact that we end up going far from other drm and
> non-drm kernel drivers and starting from day 0 a driver full of
> driver-isms.

maybe because our driver(s) are bigger and more complex ? also much more
developers are working on it, with multiple refactoring going constantly
on, so we should have mechanism to allow us catching unexpected uses
without polluting production driver too much.

btw, DRM already has assert() but named as bug()

[1] https://elixir.bootlin.com/linux/latest/C/ident/DRM_MM_BUG_ON

> 
> I would prefer we have a driver where we make usage of the all
> the drm error, warning, debug, info variants on a case by case,
> carefully analyzing the needs and thinking about the bug reports
> that we might receive from end users.

+1

it would be also nice if driver is correctly handling all errors or
invalid input paths, but at the same time, it shouldn't duplicate checks
that shall be already done at higher layer/caller - with asserts() we
just express our expectations without ignoring other bugs by trying
handle that unexpected case down the stack

> 
>>
>> Macros are based on drm_WARN, but unlikely to origin, disallow
>> use in expressions since we will compile that code out.
>>
>> As we are operating on the xe pointers, we can print additional
>> information about the device, like tile or GT identifier, that
>> is not available from generic WARN report:
>>
>> [ ] xe 0000:00:02.0: [drm] Assertion `false` failed!
>>     graphics: Xe_LP 12.0 media: Xe_M 12.0 GT0
>> [ ] WARNING: CPU: 4 PID: 8442 at drivers/gpu/drm/xe/xe_device.c:280
>> xe_device_probe+0x2f5/0x4a0 [xe]
>> [ ] RIP: 0010:xe_device_probe+0x2f5/0x4a0 [xe]
>>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> Cc: Francois Dugast <francois.dugast at intel.com>
>> Cc: Matthew Brost <matthew.brost at intel.com>
>> ---
>>  drivers/gpu/drm/xe/xe_assert.h | 47
>> ++++++++++++++++++++++++++++++++++
>>  1 file changed, 47 insertions(+)
>>  create mode 100644 drivers/gpu/drm/xe/xe_assert.h
>>
>> diff --git a/drivers/gpu/drm/xe/xe_assert.h
>> b/drivers/gpu/drm/xe/xe_assert.h
>> new file mode 100644
>> index 000000000000..65306768d637
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_assert.h
>> @@ -0,0 +1,47 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +#ifndef __XE_ASSERT_H__
>> +#define __XE_ASSERT_H__
>> +
>> +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
>> +
>> +#define xe_ASSERT_MSG(xe, condition, fmt, arg...)
>> ({                                           \
>> +       struct xe_device *__xe =
>> (xe);                                                          \
>> +       int __passed =
>> !!(condition);                                                       
>>     \
>> +       drm_WARN(&__xe->drm, !__passed, "[" DRM_NAME "] Assertion
>> `%s` failed!\n"               \
>> +                "graphics: %s %u.%u media: %s %u.%u "
>> fmt,                                     \
>> +               
>> __stringify(condition),                                              
>>           \
>> +                __xe-
>>> info.graphics_name,                                                 
>>      \
>> +                __xe->info.graphics_verx100 / 100, __xe-
>>> info.graphics_verx100 % 100,          \
>> +                __xe-
>>> info.media_name,                                                    
>>      \
>> +                __xe->info.media_verx100 / 100, __xe-
>>> info.media_verx100 % 100,                \
>> +                ##
>> arg);                                                                
>>        \
>> +       (void)(__passed);                                            
>>                            \
>> +})
>> +
>> +#else
>> +
>> +#define xe_ASSERT_MSG(xe, condition, fmt, arg...)
>> ({                                           \
>> +       typecheck(struct xe_device *,
>> xe);                                                      \
>> +       BUILD_BUG_ON_INVALID(condition);                             
>>                            \
>> +})
>> +
>> +#endif
>> +
>> +#define xe_ASSERT(xe, condition) \
>> +       xe_ASSERT_MSG((xe), condition, "")
>> +
>> +#define xe_tile_ASSERT(tile, condition)
>> ({                                                     \
>> +       struct xe_tile *__tile =
>> (tile);                                                        \
>> +       xe_ASSERT_MSG(tile_to_xe(__tile), condition, "tile%u ",
>> __tile->id);                    \
>> +})
>> +
>> +#define xe_gt_ASSERT(gt, condition)
>> ({                                                         \
>> +       struct xe_gt *__gt =
>> (gt);                                                              \
>> +       xe_ASSERT_MSG(gt_to_xe(__gt), condition, "GT%u ", __gt-
>>> info.id);                       \
>> +})
>> +
>> +#endif /* __XE_ASSERT_H__ */
> 


More information about the Intel-xe mailing list