[PATCH v9 3/4] drm/xe/guc: Add capture size check in GuC log buffer

Wed Jun 19 22:56:07 UTC 2024


On 2024-06-19 6:28 p.m., Michal Wajdeczko wrote:
> 
> 
> On 19.06.2024 21:44, Dong, Zhanjun wrote:
>> Please see my comments inline below.
>>
>> Zhanjun
>>
>> On 2024-06-14 8:13 a.m., Michal Wajdeczko wrote:
>>>
>>>
>>> On 07.06.2024 02:07, Zhanjun Dong wrote:
>>>> The capture-nodes is included in GuC log buffer, add the size check
>>>> for capture region in the whole GuC log buffer.
>>>> Add capture output size check before allocating the shared buffer.
>>>>
>>>> Signed-off-by: Zhanjun Dong <zhanjun.dong at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_gt_printk.h     |   3 +
...

>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h
>>>> b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>>> index 04b03c398191..908298791c93 100644
>>>> --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
>>>> +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
>>>> @@ -250,6 +250,54 @@ struct guc_engine_usage {
>>>>        struct guc_engine_usage_record
>>>> engines[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
>>>>    } __packed;
>>>>    +/* GuC logging structures */
>>>> +
>>>> +enum guc_log_buffer_type {
>>>> +    GUC_DEBUG_LOG_BUFFER,
>>>> +    GUC_CRASH_DUMP_LOG_BUFFER,
>>>> +    GUC_CAPTURE_LOG_BUFFER,
>>>> +    GUC_MAX_LOG_BUFFER
>>>
>>> this last enumerator is not real buffer type, so better at least name it
>>> in a different way (at least add __ prefix?
>>>
>>> or best, since it looks like ABI definitions, just move it out of the
>>> enum to benefit from compiler checks that John prefers:
>>>
>>> enum guc_log_buffer_type {
>>>      GUC_LOG_BUFFER_DEBUG        = 0,
>>>      GUC_LOG_BUFFER_CRASH_DUMP    = 1,
>>>      GUC_LOG_BUFFER_CAPTURE_LOG    = 2,
>>> };
>>> #define NUM_GUC_LOG_BUFFER_TYPES    3
>>>
>> Will changed to:
>> /* GuC logging structures */
>> enum guc_log_buffer_type {
>>      GUC_DEBUG_LOG_BUFFER,
>>      GUC_CRASH_DUMP_LOG_BUFFER,
>>      GUC_CAPTURE_LOG_BUFFER,
>>
>>      GUC_LOG_BUFFER_TYPE_MAX
>> };
>> The empty line speprate real type vs MAX
> 
> empty line is not enough to let the compiler see the difference between
> real _type_ and _max_ definition
> 
>> I would prefer to stay within the enum, which make add new type easier,
>> the MAX will be updated by compiler automatically, no need to manually
>> add 1 to NUM_xxx macro.
> 
> but this is (or should be) a stable ABI so we shouldn't rely on the
> compiler to automatically making any changes to any of these defs.
> 
> and as of today we have exactly 3 types defined as:
> 
>      GUC_LOG_BUFFER_DEBUG        = 0,
>      GUC_LOG_BUFFER_CRASH_DUMP    = 1,
>      GUC_LOG_BUFFER_CAPTURE_LOG    = 2,
> 
> so if you want to use enum to get some help from the compiler, like to
> check that in your implementation you are not using bad types, then
> there should be no other enumerators in this enum, as otherwise you kill
> that feature
> 
> also note that your approach will not work if for some reason the new
> type will be defined as something different than 3, like:
> 
>      GUC_LOG_BUFFER_FUTURE    = 345
> 
> so while will have 4 TYPES, your automatic MAX will be 346, making it
> almost useless
That's true, it is a stable ABI.
Will be changed to John style.

> 
>>
>>>> +};
>>>> +
>>>> +/*
>>>> + * struct guc_log_buffer_state - GuC log buffer state
>>>
>>> this is not a kernel-doc format, intentional or typo ?
>>>
>>>> + *
>>>> + * Below state structure is used for coordination of retrieval of
>>>> GuC firmware
>>>> + * logs. Separate state is maintained for each log buffer type.
>>>> + * read_ptr points to the location where Xe read last in log buffer and
>>>> + * is read only for GuC firmware. write_ptr is incremented by GuC
>>>> with number
>>>> + * of bytes written for each log entry and is read only for Xe.
>>>> + * When any type of log buffer becomes half full, GuC sends a flush
>>>> interrupt.
>>>> + * GuC firmware expects that while it is writing to 2nd half of the
>>>> buffer,
>>>> + * first half would get consumed by Host and then get a flush completed
>>>> + * acknowledgment from Host, so that it does not end up doing any
>

...

>>>> +    }
>>>> +
>>>> +    return overflow;
>>>> +}
>>>> +
>>>> +unsigned int xe_guc_get_log_buffer_size(struct xe_guc_log *log,
>>>> +                    enum guc_log_buffer_type type)
>>>> +{
>>>> +    switch (type) {
>>>> +    case GUC_DEBUG_LOG_BUFFER:
>>>> +        return xe_guc_log_section_size_debug(log);
>>>> +    case GUC_CRASH_DUMP_LOG_BUFFER:
>>>> +        return xe_guc_log_section_size_crash(log);
>>>> +    case GUC_CAPTURE_LOG_BUFFER:
>>>> +        return xe_guc_log_section_size_capture(log);
>>>> +    default:
>>>> +        MISSING_CASE(type);
>>>
>>> there should be no need for 'default' case if you properly define your
>>> enum type
>> Compiler could check static error, while 'default' here can cover
>> run-time errors. I would prefer to have it for public funcitons.
> 
> but this is still driver only function, called only from your code,
> where you should be using only valid enumerators, so no need for any
> random runtime protection (unless you already abuse this function by
> using random integers as params)
> 
By switch to John style, this default case could be removed.

> 
>>>
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +size_t xe_guc_get_log_buffer_offset(struct xe_guc_log *log,
>>>> +                    enum guc_log_buffer_type type)
>>>> +{
>>>> +    enum guc_log_buffer_type i;
>>>> +    size_t offset = PAGE_SIZE;/* for the log_buffer_states */
>>>> +
>>>> +    for (i = GUC_DEBUG_LOG_BUFFER; i < GUC_MAX_LOG_BUFFER; ++i) {
>>>> +        if (i == type)
>>>> +            break;

...