[Intel-gfx] [RFC PATCH] drm/i915/guc: New interface files for GuC starting in Gen11
Srivatsa, Anusha
anusha.srivatsa at intel.com
Thu Jun 14 22:48:01 UTC 2018
>-----Original Message-----
>From: Mateo Lozano, Oscar
>Sent: Wednesday, June 13, 2018 3:08 PM
>To: Wajdeczko, Michal <Michal.Wajdeczko at intel.com>; intel-
>gfx at lists.freedesktop.org
>Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>; Rogovin, Kevin
><kevin.rogovin at intel.com>; Spotswood, John A <john.a.spotswood at intel.com>;
>Srivatsa, Anusha <anusha.srivatsa at intel.com>; Ceraolo Spurio, Daniele
><daniele.ceraolospurio at intel.com>; Thierry, Michel <michel.thierry at intel.com>;
>Chris Wilson <chris at chris-wilson.co.uk>; Winiarski, Michal
><michal.winiarski at intel.com>; Lis, Tomasz <tomasz.lis at intel.com>; Ewins, Jon
><jon.ewins at intel.com>; Sundaresan, Sujaritha
><sujaritha.sundaresan at intel.com>; Patel, Jalpa <jalpa.patel at intel.com>; Li,
>Yaodong <yaodong.li at intel.com>
>Subject: Re: [RFC PATCH] drm/i915/guc: New interface files for GuC starting in
>Gen11
>
>
>
>On 5/29/2018 7:59 AM, Michal Wajdeczko wrote:
>> Hi,
>>
>> On Fri, 25 May 2018 23:59:35 +0200, Oscar Mateo <oscar.mateo at intel.com>
>> wrote:
>>
>>> GuC interface has been redesigned (or cleaned up, rather) starting
>>> with Gen11, as a stepping stone towards a new branching strategy
>>> that helps maintain backwards compatibility with previous Gens, as
>>> well as sideward compatibility with other OSes.
>>>
>>> The interface is split in two header files: one for the KMD and one
>>> for clients of the GuC (which, in our case, happens to be the KMD
>>> as well). SLPC interface files will come at a later date.
>>>
>>> Could we get eyes on the new interface header files, to make sure the
>>> GuC team is moving in the right direction?
>>>
>>> Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>>> Cc: Kevin Rogovin <kevin.rogovin at intel.com>
>>> Cc: John A Spotswood <john.a.spotswood at intel.com>
>>> Cc: Anusha Srivatsa <anusha.srivatsa at intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>> Cc: Michel Thierry <michel.thierry at intel.com>
>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Michał Winiarski <michal.winiarski at intel.com>
>>> Cc: Tomasz Lis <tomasz.lis at intel.com>
>>> Cc: Jon Ewins <jon.ewins at intel.com>
>>> Cc: Sujaritha Sundaresan <sujaritha.sundaresan at intel.com>
>>> Cc: Jalpa Patel <jalpa.patel at intel.com>
>>> Cc: Jackie Li <yaodong.li at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/intel_guc_client_interface.h | 255 +++++++
>>> drivers/gpu/drm/i915/intel_guc_kmd_interface.h    | 847 ++++++++++++++++++++++
>>> 2 files changed, 1102 insertions(+)
>>> create mode 100644 drivers/gpu/drm/i915/intel_guc_client_interface.h
>>> create mode 100644 drivers/gpu/drm/i915/intel_guc_kmd_interface.h
>>
>> can we name these files as:
>>
>> drivers/gpu/drm/i915/intel_guc_interface.h
>> drivers/gpu/drm/i915/intel_guc_interface_client.h
>> or
>> drivers/gpu/drm/i915/intel_guc_defs.h
>> drivers/gpu/drm/i915/intel_guc_defs_client.h
>> or
>> drivers/gpu/drm/i915/guc/guc.h
>> drivers/gpu/drm/i915/guc/guc_client.h
>
>I'm fine with any of these names.
>
>>
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_guc_client_interface.h
>>> b/drivers/gpu/drm/i915/intel_guc_client_interface.h
>>> new file mode 100644
>>> index 0000000..1ef91a7
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/i915/intel_guc_client_interface.h
>>> @@ -0,0 +1,255 @@
>>> +/*
>>> + * SPDX-License-Identifier: MIT
>>> + *
>>> + * Copyright © 2018 Intel Corporation
>>> + */
>>> +
>>> +#ifndef _INTEL_GUC_CLIENT_INTERFACE_H_
>>> +#define _INTEL_GUC_CLIENT_INTERFACE_H_
>>> +
>>
>> need to include types.h for u32
>
>Will do.
>
>>
>>> +#pragma pack(1)
>>> +
>>> +/*****************************************************************************
>>> + ********************************** Engines **********************************
>>> + *****************************************************************************/
>>
>> no fancy markups, please
>>
>
>Ok.
>
>>> +
>>> +#define GUC_MAX_ENGINE_INSTANCE_PER_CLASS 4
>>> +#define GUC_MAX_SCHEDULABLE_ENGINE_CLASS 5
>>> +#define GUC_MAX_ENGINE_CLASS_COUNT 6
>>> +#define GUC_ENGINE_INVALID 6
>>
>> hmm, why not 7 or 127 ?
>> maybe if we need value for INVALID we should use 0 or -1 (~0)
>
>I'll pass this comment to the GuC team.
>
>>
>>> +
>>> +/* Engine Class that uKernel can schedule on. This is just a SW enumeration.
>>> + * HW configuration will depend on the Platform and SKU
>>> + */
>>> +enum uk_engine_class {
>>
>> why there is new prefix "uk" ?
>
>uk stands for uKernel. In this case, I'm guessing it is used to
>differentiate between the engine class defined by hardware vs. the one
>defined by the uKernel.
>>
>>> + UK_RENDER_ENGINE_CLASS = 0,
>>> + UK_VDECENC_ENGINE_CLASS = 1,
>>> + UK_VE_ENGINE_CLASS = 2,
>>> + UK_BLT_COPY_ENGINE_CLASS = 3,
>>> + UK_RESERVED_ENGINE_CLASS = 4,
>>> + UK_OTHER_ENGINE_CLASS = 5,
>>
>> either use valid names or drop RESERVED/OTHER as values
>> from 0 to GUC_MAX_ENGINE_CLASS_COUNT are 'reserved' by
>> definition unless explicitly defined ;)
>
>I'll drop them.
>
>>
>>> +};
>>
>> as we don't use enum in binary struct definitions, then maybe we
>> should define all engine classes with #define as:
>>
>> #define ENGINE_CLASS_INVALID 0
>> #define ENGINE_CLASS_ALL 0
>> #define ENGINE_CLASS_RENDER 1
>> #define ENGINE_CLASS_VDECENC 2
>> ...
>> #define ENGINE_CLASS_MAX 7
>>
>
>I can pass this comment to the GuC team. Or we can use defines for the
>Linux header files, but then we might start deviating again from a
>common interface naming.
>
>> what if future HW will support more than 7 engine classes
>> or they will be so different that they deserve separate id?
>> why
>
>I imagine that's what the reserved class is for. I cannot think of many more
>engine classes, but I probably lack imagination.
>
>>
>>> +
>>> +/* Engine Instance that uKernel can schedule on */
>>> +enum uk_engine_instance {
>>> + UK_ENGINE_INSTANCE_0 = 0,
>>> + UK_ENGINE_INSTANCE_1 = 1,
>>> + UK_ENGINE_INSTANCE_2 = 2,
>>> + UK_ENGINE_INSTANCE_3 = 3,
>>> + UK_INVALID_ENGINE_INSTANCE = GUC_MAX_ENGINE_INSTANCE_PER_CLASS,
>>> + UK_ENGINE_ALL_INSTANCES = UK_INVALID_ENGINE_INSTANCE
>>> +};
>>
>> I'm not sure why we would need this enum as we already have
>> GUC_MAX_ENGINE_INSTANCE_PER_CLASS and can easily identify
>> instance as [0 ... GUC_MAX_ENGINE_INSTANCE_PER_CLASS), or
>> maybe more intuitive would be use normal indexing and use 0
>> to indicate INVALID/AUTO/ALL instance ?
>>
>> #define ENGINE_INSTANCE_INVALID 0
>> #define ENGINE_INSTANCE_ALL 0
>> #define ENGINE_INSTANCE_MAX 4
>>
>
>I can pass this comment along.
>
>>> +
>>> +/* Target Engine field used in WI header and Guc2Host */
>>> +struct guc_target_engine {
>>> + union {
>>> + struct {
>>> + /* One of enum uk_engine_class */
>>> + u8 engine_class : 3;
>>
>> enum defines only 0..5 while in these 3 bits we can pass 0..7
>>
>
>So? You can pass 6 and 7 if you want, but it would be invalid (as the
>enum shows).
>
>>> + /* One of enum uk_engine_instance */
>>> + u8 engine_instance : 4;
>>
>> again, mismatch between 'enum' and bitfields.
>
>Again, I don't understand the problem.
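To make the enum-vs-bitfield mismatch concrete, here is a minimal sketch (the struct layout is copied from the patch; the validity helper and the LSB-first bitfield packing are assumptions, not part of the proposed interface). A 3-bit engine_class can encode 0..7 even though the enum only defines 0..5, so nothing in the struct itself rejects an undefined value and the receiver must range-check:

```c
#include <assert.h>

#define GUC_MAX_ENGINE_INSTANCE_PER_CLASS 4
#define GUC_MAX_ENGINE_CLASS_COUNT 6

/* Layout copied from the patch; assumes the compiler packs bitfields
 * LSB-first (as GCC does on little-endian targets). */
struct guc_target_engine {
	union {
		struct {
			unsigned char engine_class : 3;    /* can hold 0..7  */
			unsigned char engine_instance : 4; /* can hold 0..15 */
			unsigned char all_engines : 1;
		};
		unsigned char value;
	};
};

/* Hypothetical receiver-side check: the struct cannot reject encodings
 * outside the enums, so validation has to be explicit. */
static int guc_target_engine_is_valid(unsigned char raw)
{
	struct guc_target_engine te = { .value = raw };

	return te.engine_class < GUC_MAX_ENGINE_CLASS_COUNT &&
	       te.engine_instance <= GUC_MAX_ENGINE_INSTANCE_PER_CLASS;
}
```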
>
>>
>>> + /* All enabled engine classes and instances */
>>> + u8 all_engines : 1;
>>
>> inconsistency as there is dedicated value UK_ENGINE_ALL_INSTANCES
>> for engine_instance but for engine_class there is separate bit.
>
>I'll pass this comment along.
>
>>
>>> + };
>>> + u8 value;
>>> + };
>>> +};
>>> +
>>> +struct guc_engine_class_bit_map {
>>> + union {
>>> + /* Bit positions must match enum uk_engine_class value */
>>> + struct {
>>> + u32 render_engine_class : 1;
>>> + u32 vdecenc_engine_class : 1;
>>> + u32 ve_engine_class : 1;
>>> + u32 blt_copy_engine_class : 1;
>>> + u32 reserved_engine_class : 1;
>>> + u32 other_engine_class : 1;
>>> + u32 : 26;
>>
>> btw, iirc nameless bit fields may not correctly initialized on some
>> compilers, so better to avoid them.
>> also, do we want to use bitfields or stay with _SHIFT/_MASK macros?
>
>At some point we agreed that we were trying to keep the header files
>from different OSes as close as possible.
>IIRC, I raised the point that _SHIFT/_MASK macros are the preferred
>approach in i915, and I was told this is an exceptional circumstance.
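For comparison, a sketch of what the usual i915 _SHIFT/_MASK style would look like for the target-engine byte (the macro names are made up for illustration and not taken from the patch). Unlike bitfields, the bit positions here are fixed regardless of compiler or endianness:

```c
#include <assert.h>

/* Illustrative _SHIFT/_MASK encoding of the same 3/4/1-bit layout as
 * struct guc_target_engine in the patch. */
#define GUC_ENGINE_CLASS_SHIFT		0
#define GUC_ENGINE_CLASS_MASK		(0x7 << GUC_ENGINE_CLASS_SHIFT)
#define GUC_ENGINE_INSTANCE_SHIFT	3
#define GUC_ENGINE_INSTANCE_MASK	(0xf << GUC_ENGINE_INSTANCE_SHIFT)
#define GUC_TARGET_ALL_ENGINES		(1u << 7)

static unsigned int guc_make_target_engine(unsigned int engine_class,
					   unsigned int instance)
{
	return ((engine_class << GUC_ENGINE_CLASS_SHIFT) &
		GUC_ENGINE_CLASS_MASK) |
	       ((instance << GUC_ENGINE_INSTANCE_SHIFT) &
		GUC_ENGINE_INSTANCE_MASK);
}
```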
>
>>
>>> + };
>>> + u32 value;
>>> + };
>>> +};
>>> +
>>> +struct guc_engine_instance_bit_map {
>>> + union {
>>> + struct {
>>> + u32 engine0 : 1;
>>> + u32 engine1 : 1;
>>> + u32 engine2 : 1;
>>> + u32 engine3 : 1;
>>> + u32 engine4 : 1;
>>> + u32 engine5 : 1;
>>> + u32 engine6 : 1;
>>> + u32 engine7 : 1;
>>
>> "engine" ? maybe they mean "instance" ?
>> and if instance, then enum above defines only 0..3
>>
>
>I'll pass the comment along.
>
>>> + u32 : 24;
>>> + };
>>> + u32 value;
>>> + };
>>> +};
>>> +
>>> +struct guc_engine_bit_map {
>>> + struct guc_engine_class_bit_map engine_class_bit_map;
>>> + struct guc_engine_instance_bit_map
>>> + engine_instance_bit_map[GUC_MAX_ENGINE_CLASS_COUNT];
>>> +};
>>> +
>>> +/*****************************************************************************
>>> + ********************* Process Descriptor and Work Queue *********************
>>> + *****************************************************************************/
>>> +
>>> +/* Status of a Work Queue */
>>> +enum guc_queue_status {
>>> + GUC_QUEUE_STATUS_ACTIVE = 1,
>>> + GUC_QUEUE_STATUS_SUSPENDED = 2,
>>> + GUC_QUEUE_STATUS_CMD_ERROR = 3,
>>> + GUC_QUEUE_STATUS_ENGINE_ID_NOT_USED = 4,
>>> + GUC_QUEUE_STATUS_SUSPENDED_FROM_ENGINE_RESET = 5,
>>> + GUC_QUEUE_STATUS_INVALID_STATUS = 6,
>>
>> why not 0 ? what 0 means ?
>
>IPTCA (I'll pass the comment along)
>
>>
>>> +};
>>> +
>>> +/* Priority of guc_context_descriptor */
>>> +enum guc_context_priority {
>>> + GUC_CONTEXT_PRIORITY_KMD_HIGH = 0,
>>> + GUC_CONTEXT_PRIORITY_HIGH = 1,
>>> + GUC_CONTEXT_PRIORITY_KMD_NORMAL = 2,
>>> + GUC_CONTEXT_PRIORITY_NORMAL = 3,
>>> + GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT = 4,
>>> + GUC_CONTEXT_PRIORITY_INVALID = GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT
Is this correct? Should the priority of an invalid context really be the same value as GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT?
>>> +};
>>> +/* A shared structure between app and uKernel for communication */
>>> +struct guc_sched_process_descriptor {
>>> + /* Index in the GuC Context Descriptor Pool */
>>> + u32 context_id;
>>> +
>>> + /* Pointer to doorbell cacheline. BSpec: 1116 */
>>> + u64 p_doorbell;
>>
>> pointer ? maybe s/p_doorbell/doorbell_addr as this likely is phy addr
>>
>
>IPTCA
>
>>> +
>>> + /* WQ Head Byte Offset - Client must not write here */
>>> + u32 head_offset;
s/head_offset/wq_head_offset/?
>>> + /* WQ Tail Byte Offset - uKernel will not write here */
>>> + u32 tail_offset;
s/tail_offset/wq_tail_offset/?
>>> + /* WQ Error Byte offset */
>>> + u32 error_offset_byte;
Same here.
>>> + /* WQ pVirt base address in Client. For use only by Client */
>>> + u64 wqv_base_address;
>>
>> client virtual address shall not be part of the GuC ABI
>
>Hmmm... if I understand this correctly, this is intended for Direct
>Submission.
>
>>
>>> +
>>> + /* WQ Size in Bytes */
>>> + u32 wq_size_bytes;
>>> +
>>> + /* WQ Status. Read by Client. Written by uKernel/KMD */
>>> + enum guc_queue_status wq_status;
>>
>> we have many discussions about not using enums in packed structs
>
>Yes, and they were noted and tickets opened. They simply haven't
>happened yet.
>
>>
>>> +
>>> + /* Context priority. Read only by Client */
>>> + enum guc_context_priority priority_assigned;
>>
>> same here
>
>Same here.
>
>>
>>> +
>>> + u32 future;
Looks confusing. Is this field reserved for future use?
>> if we really need this, can it be merged with reserved0 ?
>
>It could, but this helps keep header files from different OSes as close
>as possible.
>
>>
>>> +
>>> + struct guc_engine_class_bit_map queue_engine_error;
>>> +
>>> + u32 reserved0[3];
>>> +
>>> + /* uKernel side tracking for debug */
>>
>> this should be separate struct definition
>
>IPTCA
>
>>
>>> +
>>> + /* Written by uKernel at the time of parsing and successful removal
>>> + * from WQ (implies ring tail was updated)
>>> + */
>>> + u32 total_work_items_parsed_by_guc;
>>> +
>>> + /* Written by uKernel if a WI was collapsed if next WI is the same
>>> + * LRCA (optimization applies only to Secure/KMD contexts)
>>> + */
>>> + u32 total_work_items_collapsed_by_guc;
>>> +
>>> + /* Tells if the context is affected by Engine Reset. UMD needs to
>>> + * clear it after taking appropriate Action (TBD)
>>> + */
>>> + u32 is_context_in_engine_reset;
>>> +
>>> + /* WQ Sampled tail at Engine Reset Time. Valid only if
>>> + * is_context_in_engine_reset = true
>>> + */
>>> + u32 engine_reset_sampled_wq_tail;
>>> +
>>> + /* Valid from engine reset until all the affected Work Items are
>>> + * processed
>>> + */
>>> + u32 engine_reset_sampled_wq_tail_valid;
>>> +
>>> + u32 reserved1[15];
>>> +};
>>> +
>>> +/* Work Item for submitting KMD workloads into Work Queue for GuC */
>>> +struct guc_sched_work_queue_kmd_element_info {
>>> + /* Execlist context descriptor's lower DW. BSpec: 12254 */
>>> + u32 element_low_dw;
>>> + union {
>>> + struct {
>>> + /* SW Context ID. BSpec: 12254 */
>>> + u32 sw_context_index : 11;
>>> + /* SW Counter. BSpec: 12254 */
>>> + u32 sw_context_counter : 6;
>>> + /* If this workload needs to be synced prior
>>> + * to submission use context_submit_sync_value and
>>> + * context_submit_sync_address
>>> + */
>>> + u32 needs_sync : 1;
>>> + /* QW Aligned, TailValue <= 2048
>>> + * (addresses 4 pages max)
>>> + */
>>> + u32 ring_tail_qw_index : 11;
>>> + u32 : 2;
>>> + /* Bit to indicate if POCS needs to be in FREEZE state
>>> + * for this WI submission
>>> + */
>>> + u32 wi_freeze_pocs : 1;
>>> + };
>>> + u32 value;
>>> + } element_high_dw;
>>> +};
>>> +
>>> +/* Work Item instruction type */
>>> +enum guc_sched_instruction_type {
>>> + GUC_SCHED_INSTRUCTION_BATCH_BUFFER_START = 0x1,
>>> + GUC_SCHED_INSTRUCTION_GUC_CMD_PSEUDO = 0x2,
>>> + GUC_SCHED_INSTRUCTION_GUC_CMD_KMD = 0x3,
>>> + GUC_SCHED_INSTRUCTION_GUC_CMD_NOOP = 0x4,
>>> + GUC_SCHED_INSTRUCTION_RESUME_ENGINE_WQ_PARSING = 0x5,
>>> + GUC_SCHED_INSTRUCTION_INVALID = 0x6,
>>> +};
>>> +
>>> +/* Header for every Work Item put in the Work Queue */
>>> +struct guc_sched_work_queue_item_header {
>>> + union {
>>> + struct {
>>> + /* One of enum guc_sched_instruction_type */
>>> + u32 work_instruction_type : 8;
>>> + /* struct guc_target_engine */
>>> + u32 target_engine : 8;
>>> + /* Length in number of dwords following this header */
>>> + u32 command_length_dwords : 11;
>>> + u32 : 5;
>>> + };
>>> + u32 value;
>>> + };
>>> +};
>>> +
>>> +/* Work item for submitting KMD workloads into work queue for GuC */
>>> +struct guc_sched_work_queue_item {
>>> + struct guc_sched_work_queue_item_header header;
>>> + struct guc_sched_work_queue_kmd_element_info kmd_submit_element_info;
>>> + /* Debug only */
>>> + u32 fence_id;
>>> +};
>>> +
>>> +struct km_gen11_resume_work_queue_processing_item {
>>> + struct guc_sched_work_queue_item header;
>>> +};
>>> +
>>> +#pragma pack()
>>> +
>>> +#endif
>>> diff --git a/drivers/gpu/drm/i915/intel_guc_kmd_interface.h
>>> b/drivers/gpu/drm/i915/intel_guc_kmd_interface.h
>>> new file mode 100644
>>> index 0000000..4c643ad
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/i915/intel_guc_kmd_interface.h
>>> @@ -0,0 +1,847 @@
>>> +/*
>>> + * SPDX-License-Identifier: MIT
>>> + *
>>> + * Copyright © 2018 Intel Corporation
>>> + */
>>> +
>>> +#ifndef _INTEL_GUC_KMD_INTERFACE_H_
>>> +#define _INTEL_GUC_KMD_INTERFACE_H_
>>> +
>>> +#include "intel_guc_client_interface.h"
>>> +
>>> +#pragma pack(1)
>>> +
>>> +/* Maximum number of entries in the GuC Context Descriptor Pool. Upper limit
>>> + * restricted by number of 'SW Context ID' bits in the Context Descriptor
>>> + * (BSpec: 12254) minus some reserved entries
>>> + */
>>> +#define GUC_MAX_GUC_CONTEXT_DESCRIPTOR_ENTRIES 2032
>>> +
>>> +/* Limited by 'SW Counter' bits. BSpec: 12254 */
>>> +#define GUC_MAX_SW_CONTEXT_COUNTER 64
>>> +
>>> +/* Maximum depth of HW Execlist Submission Queue. BSpec: 18934 */
>>> +#define GUC_MAX_SUBMISSION_Q_DEPTH 8
>>> +
>>> +/* Minimum depth of HW Execlist Submission Queue. BSpec: 18934 */
>>> +#define GUC_MIN_SUBMISSION_Q_DEPTH 2
>>> +
>>> +/* Default depth of HW Execlist Submission Queue. BSpec: 18934 */
>>> +#define GUC_DEFAULT_ELEM_IN_SUBMISSION_Q GUC_MIN_SUBMISSION_Q_DEPTH
>>> +
>>> +/* 1 Cacheline = 64 Bytes */
>>> +#define GUC_DMA_CACHELINE_SIZE_BYTES 64
>>> +
>>> +/*****************************************************************************
>>> + ************************** Engines and System Info **************************
>>> + *****************************************************************************/
>>> +
>>> +/* GT system info passed down by KMD after reading fuse registers */
>>> +struct guc_gt_system_info {
>>> + u32 slice_enabled;
>>> + u32 rcs_enabled;
>>> + u32 future0;
>>> + u32 bcs_enabled;
>>> + u32 vd_box_enable_mask;
>>> + u32 future1;
>>> + u32 ve_box_enable_mask;
>>> + u32 future2;
>>> + u32 reserved[8];
>>
>> what's the format of these u32 ?
>> maybe guc_engine_bit_mask can be reused here ?
>
>Most of them are just booleans. Yes, guc_engine_bit_mask can probably be
>reused. IPTCA
>
>>
>>> +};
>>> +
>>> +/*****************************************************************************
>>> + ************************ GuC Context Descriptor Pool ************************
>>> + *****************************************************************************/
>>> +
>>> +/* State of the context */
>>> +struct guc_engine_context_state {
>>> + union {
>>> + struct {
>>> + u32 wait_for_display_event : 1;
>>> + u32 wait_for_semaphore : 1;
>>> + u32 re_enqueue_to_submit_queue : 1;
>>> + u32 : 29;
>>> + };
>>> + u32 wait_value;
>>> + };
>>> + u32 reserved;
>>> +};
>>> +
>>> +/* To describe status and access information of current ring buffer for
>>> + * a given guc_execlist_context
>>> + */
>>> +struct guc_execlist_ring_buffer {
>>> + u32 p_execlist_ring_context;
>>> +
>>> + /* uKernel address for the ring buffer */
>>> + u32 p_ring_begin;
>>> + /* uKernel final byte address that is valid for this ring */
>>> + u32 p_ring_end;
>>> + /* uKernel address for next location in ring */
>>> + u32 p_next_free_location;
>>
>> hmm, uKernel private addresses are part of the ABI ?
>
>This simply means GGTT addresses (there is nothing really private to the
>uKernel)
>
>>
>>> +
>>> + /* Last value written by software for tracking (just in case
>>> + * HW corrupts the tail in its context)
>>> + */
>>> + u32 current_tail_pointer_value;
>>> +};
>>> +
>>> +/* The entire execlist context including software and HW information */
>>> +struct guc_execlist_context {
>>> + /* 2 DWs of Context Descriptor. BSpec: 12254 */
>>> + u32 hw_context_desc_dw[2];
>>> + u32 reserved0;
>>> +
>>> + struct guc_execlist_ring_buffer ring_buffer_obj;
>>> + struct guc_engine_context_state state;
>>> +
>>> + /* Flag to track if execlist context exists in submit queue
>>> + * Valid values 0 or 1
>>> + */
>>> + u32 is_present_in_sq;
>>> +
>>> + /* If needs_sync is set in WI, sync *context_submit_sync_address ==
>>> + * context_submit_sync_value before submitting the context to HW
>>> + */
>>> + u32 context_submit_sync_value;
>>> + u32 context_submit_sync_address;
>>> +
>>> + /* Reserved for SLPC hints (currently used for GT throttle modes) */
>>> + u32 slpc_context_hints;
>>
>> you said that SLPC will be later ... is it necessary to put part of it
>> here?
>
>"u32 future;" then?
>
>>
>>> +
>>> + u32 reserved1[4];
>>> +};
>>> +
>>> +/* Bitmap to track allocated and free contexts
>>> + * context_alloct_bit_map[n] = 0; Context 'n' free
>>> + * context_alloct_bit_map[n] = 1; Context 'n' allocated
>>> + */
>>> +struct guc_execlist_context_alloc_map {
>>> + /* Bit map for execlist contexts, bits 0 to
>>> + * (GUC_MAX_SW_CONTEXT_COUNTER - 1) are valid
>>> + */
>>> + u64 context_alloct_bit_map;
>>
>> maybe we should use GUC_MAX_SW_CONTEXT_COUNTER here:
>>
>> u32 bitmap[GUC_MAX_SW_CONTEXT_COUNTER / sizeof(u32)];
>>
>
>IPTCA
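One arithmetic nit on the suggested sizing: sizeof(u32) is 4 (bytes), so GUC_MAX_SW_CONTEXT_COUNTER / sizeof(u32) yields 16 words (512 bits) where only a 64-bit map is needed; dividing by the bit width gives the intended 2 words. A quick check (names are illustrative, and a 4-byte unsigned int is assumed):

```c
#include <assert.h>

#define GUC_MAX_SW_CONTEXT_COUNTER 64
#define BITS_PER_U32 32

/* sizeof() counts bytes, not bits: 64 / 4 = 16 u32 words = 512 bits. */
#define BITMAP_WORDS_BY_SIZEOF \
	(GUC_MAX_SW_CONTEXT_COUNTER / sizeof(unsigned int))

/* Dividing by the bit width gives the intended 64-bit map: 2 words. */
#define BITMAP_WORDS_BY_BITS \
	(GUC_MAX_SW_CONTEXT_COUNTER / BITS_PER_U32)
```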
>
>>> + u32 reserved;
>>> +};
>>> +
>>> +enum guc_context_descriptor_type {
>>> + /* Work will be submitted through doorbell and WQ of a
>>> + * Proxy Submission descriptor in the context desc pool
>>> + */
>>> + GUC_CONTEXT_DESCRIPTOR_TYPE_PROXY_ENTRY = 0x00,
>>> +
>>> + /* Work will be submitted using doorbell and workqueue
>>> + * of this descriptor on behalf of other proxy Entries
>>> + * in the context desc pool
>>> + */
>>> + GUC_CONTEXT_DESCRIPTOR_TYPE_PROXY_SUBMISSION = 0x01,
>>> +
>>> + /* Work is submitted through its own doorbell and WQ */
>>> + GUC_CONTEXT_DESCRIPTOR_TYPE_REAL = 0x02,
>>> +};
>>> +
>>> +/* CPU, Graphics and physical addresses */
>>> +struct guc_address {
>>> + /* Cpu address (virtual) */
>>> + u64 p_cpu_address;
>>> + /* uKernel address (gfx) */
>>> + u32 p_uk_address;
>>
>> do we need to share cpu/uk addresses over ABI ?
>
>What is the alternative?
>
>>
>>> + /* Physical address */
>>> + u64 p_address_gpa;
>>
>> please drop p_ prefix
>
>IPTCA
>
>>
>>> +};
>>> +
>>> +/* Context descriptor for communication between uKernel and KMD */
>>> +struct guc_context_descriptor {
>>> + /* CPU back pointer for general KMD usage */
>>> + u64 assigned_guc_gpu_desc;
>>
>> private pointers shall not be part of the ABI
>
>Agreed in this particular case. Would you be OK with something like "u64
>private_field" here?
>
>>
>>> +
>>> + /* Index in the pool */
>>> + u32 guc_context_desc_pool_index;
>>> +
>>> + /* For a Proxy Entry, this is the index of its proxy submission entry.
>>> + * For others this is the same as guc_context_desc_pool_index above
>>> + */
>>> + u32 proxy_submission_guc_context_desc_pool_index;
>>> +
>>> + /* The doorbell page's trigger cacheline */
>>> + struct guc_address doorbell_trigger_address;
>>> +
>>> + /* Assigned doorbell */
>>> + u32 doorbell_id;
>>> +
>>> + /* Array of execlist contexts */
>>> + struct guc_execlist_context
>>> + uk_exec_list_context[GUC_MAX_SCHEDULABLE_ENGINE_CLASS]
>>> + [GUC_MAX_SW_CONTEXT_COUNTER];
>>> +
>>> + /* Allocation map to track which execlist contexts are in use */
>>> + struct guc_execlist_context_alloc_map
>>> + uk_execlist_context_alloc_map[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
>>
>> hmm, maybe to make it future proof, we should define array as last
>> member?
>
>That would help very little, since this whole struct is also stored in
>an array (something that we still have to improve), but IPTCA.
>
>>
>>> +
>>> + /* Number of active execlist contexts */
>>> + u32 uk_execlist_context_alloc_count;
>>> +
>>> + /* Optimization to reduce the maximum execlist context count for
>>> + * this GuC context descriptor. Should be less than
>>> + * GUC_MAX_SW_CONTEXT_COUNTER
>>> + */
>>> + u32 max_uk_execlist_context_per_engine_class;
>>> +
>>> + union {
>>> + struct {
>>> + /* Is this context actively assigned to an app? */
>>> + u32 is_context_active : 1;
>>> +
>>> + /* Is this a proxy entry, principal or real entry? */
>>> + u32 context_type : 2;
>>> +
>>> + u32 is_kmd_created_context : 1;
>>
>> can it be modified from/or the kmd ?
>
>The KMD sets this. I think this only makes sense when Direct Submission
>is used.
>
>>
>>> +
>>> + /* Context was part of an engine reset. KMD must take
>>> + * appropriate action (this context will not be
>>> + * resubmitted until this bit is cleared)
>>> + */
>>> + u32 is_context_eng_reset : 1;
>>> +
>>> + /* Set it to 1 to prevent other code paths to do work
>>> + * queue processing as we use sampled values for WQ
>>> + * processing. Allowing multiple code paths to do WQ
>>> + * processing will cause same workload to execute
>>> + * multiple times
>>> + */
>>> + u32 wq_processing_locked : 1;
>>> +
>>> + u32 future : 1;
>>> +
>>> + /* If set to 1, the context is terminated by GuC. All
>>> + * the pending work is dropped, its doorbell is evicted
>>> + * and eventually this context will be removed
>>> + */
>>> + u32 is_context_terminated : 1;
>>> +
>>> + u32 : 24;
>>> + };
>>> + u32 bool_values;
>>> + };
>>> +
>>> + enum guc_context_priority priority;
>>
>> again, enum in packed struct
>
>Yup, we already knew about this.
>
>>
>>> +
>>> + /* WQ tail Sampled and set during doorbell ISR handler */
>>> + u32 wq_sampled_tail_offset;
>>> +
>>> + /* Global (across all submit queues). For principals
>>> + * (proxy entry), this will be zero and true count
>>> + * will be reflected in its proxy (proxy submission)
>>> + */
>>> + u32 total_submit_queue_enqueues;
>>> +
>>> + /* Pointer to struct guc_sched_process_descriptor */
>>> + u32 p_process_descriptor;
>>> +
>>> + /* Secure copy of WQ address and size. uKernel can not
>>> + * trust data in struct guc_sched_process_descriptor
>>> + */
>>> + u32 p_work_queue_address;
>>> + u32 work_queue_size_bytes;
>>> +
>>> + u32 future0;
>>> + u32 future1;
Same comment applies to all the "future" fields: what is their purpose?
Anusha
>> drop or keep together with reserved
>
>Again, it was done like this to keep differences to a minimum.
>
>>
>>> +
>>> + struct guc_engine_class_bit_map queue_engine_error;
>>
>> is this error map per context? is there global error map available?
>
>I'm not very familiar with the GuC reset procedure, but this seems to be
>per-context and I don't see any global equivalent thing.
>
>>
>>> +
>>> + u32 reserved0[3];
>>> + u64 reserved1[12];
>>> +};
>>> +
>>> +/*****************************************************************************
>>> + *************************** Host2GuC and GuC2Host ***************************
>>> + *****************************************************************************/
>>> +
>>> +/* Host 2 GuC actions */
>>> +enum guc_host2guc_action {
>>
>> to be future ready:
>>
>> s/guc_host2guc_action/guc_action
>> s/GUC_HOST2GUC_ACTION/GUC_ACTION
>> or
>> s/guc_host2guc_action/guc_msg_action
>> s/GUC_HOST2GUC_ACTION/GUC_MSG_ACTION
>> or
>> s/guc_host2guc_action/guc_request_action
>> s/GUC_HOST2GUC_ACTION/GUC_REQUEST_ACTION
>
>IPTCA
>
>>
>>
>>> + GUC_HOST2GUC_ACTION_DEFAULT = 0x0,
>>> + GUC_HOST2GUC_ACTION_REQUEST_INIT_DONE_INTERRUPT = 0x1,
>>> + GUC_HOST2GUC_ACTION_REQUEST_PREEMPTION = 0x2,
>>> + GUC_HOST2GUC_ACTION_REQUEST_ENGINE_RESET = 0x3,
>>> + GUC_HOST2GUC_ACTION_PAUSE_SCHEDULING = 0x4,
>>> + GUC_HOST2GUC_ACTION_RESUME_SCHEDULING = 0x5,
>>> +
>>> + GUC_HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
>>> + GUC_HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
>>> + GUC_HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
>>> + GUC_HOST2GUC_ACTION_ENABLE_LOGGING = 0x40,
>>> + GUC_HOST2GUC_ACTION_CACHE_CRASH_DUMP = 0x200,
>>> + GUC_HOST2GUC_ACTION_DEBUG_RING_DB = 0x300,
>>> + GUC_HOST2GUC_ACTION_PERFORM_GLOBAL_DEBUG_ACTIONS = 0x301,
>>> + GUC_HOST2GUC_ACTION_FORCE_LOGBUFFERFLUSH = 0x302,
>>> + GUC_HOST2GUC_ACTION_LOG_VERBOSITY_LOGOUTPUT_SELECT = 0x400,
>>> + GUC_HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
>>> + GUC_HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
>>> + GUC_HOST2GUC_ACTION_SET_SCHEDULING_MODE = 0x504,
>>> + GUC_HOST2GUC_ACTION_SCHED_POLICY_CHANGE = 0x506,
>>> +
>>> + /* Actions for Power Conservation : 0x3000-0x3FFF */
>>> + GUC_HOST2GUC_ACTION_PC_SLPM_REQUEST = 0x3003,
>>> + GUC_HOST2GUC_ACTION_PC_SETUP_GUCRC = 0x3004,
>>> + GUC_HOST2GUC_ACTION_SAMPLE_FORCEWAKE_FEATURE_REGISTER = 0x3005,
>>> + GUC_HOST2GUC_ACTION_SETUP_GUCRC = 0x3006,
>>> +
>>> + GUC_HOST2GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
>>> +
>>> + GUC_HOST2GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
>>> + GUC_HOST2GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
>>> +
>>> + GUC_HOST2GUC_ACTION_MAX = 0xFFFF
>>> +};
>>> +
>>> +enum guc_host2guc_response_status {
>>> + GUC_HOST2GUC_RESPONSE_STATUS_SUCCESS = 0x0,
>>> + GUC_HOST2GUC_RESPONSE_STATUS_UNKNOWN_ACTION = 0x30,
>>> + GUC_HOST2GUC_RESPONSE_STATUS_LOG_HOST_ADDRESS_NOT_VALID = 0x80,
>>> + GUC_HOST2GUC_RESPONSE_STATUS_GENERIC_FAIL = 0xF000,
>>
>> s/guc_host2guc_response_status/guc_status
>> s/GUC_HOST2GUC_RESPONSE_STATUS/GUC_STATUS
>> or
>> s/guc_host2guc_response_status/guc_msg_status
>> s/GUC_HOST2GUC_RESPONSE_STATUS/GUC_MSG_STATUS
>> or
>> s/guc_host2guc_response_status/guc_response_status
>> s/GUC_HOST2GUC_RESPONSE_STATUS/GUC_RESPONSE_STATUS
>>
>
>IPTCA
>
>>> +};
>>> +
>>> +enum {
>>> + /* Host originating types */
>>> + GUC_MSG_TYPE_HOST2GUC_REQUEST = 0x0,
>>> +
>>> + /* GuC originating types */
>>> + GUC_MSG_TYPE_HOST2GUC_RESPONSE = 0xF,
>>> +} GUC_GUC_MSG_TYPE;
>> ^^^^^^^
>> typo?
>
>Problem with the automatic conversion, I'll fix it.
>
>> and HOST2GUC should be dropped
>>
>> GUC_MSG_TYPE_REQUEST = 0x0,
>> GUC_MSG_TYPE_RESPONSE = 0xF,
>>
>> and it should defined before ACTION/STATUS
>>
>>> +
>>> +/* This structure represents the various formats of values put in
>>> + * SOFT_SCRATCH_0. The "Type" field is to determine which register
>>> + * definition to use, so it must be common among all unioned
>>> + * structs.
>>> + */
>>> +struct guc_msg_format {
>>> + union {
>>> + struct {
>>> + u32 action : 16; /* enum guc_host2guc_action */
>>> + u32 reserved : 12; /* MBZ */
>>> + u32 type : 4; /* GUC_MSG_TYPE_HOST2GUC_REQUEST */
>>> + } host2guc_action;
>>> +
>>> + struct {
>>> + u32 status : 16; /* enum guc_host2guc_response_status */
>>> + u32 return_data : 12;
>>> + u32 type : 4; /* GUC_MSG_TYPE_HOST2GUC_RESPONSE */
>>> + } host2guc_response;
>>> +
>>> + u32 dword_value;
>>> + };
>>> +};
>>
>> well, we need just single struct definition:
>>
>> union guc_msg_header {
>> struct {
>> u32 code : 16;
>> u32 data : 12;
>> u32 type : 4; /* GUC_MSG_TYPE */
>> };
>> u32 value;
>> };
>>
>> as value of 'type' field indicates how to interpret 'code/data' fields:
>>
>> if (msg.type == GUC_MSG_TYPE_REQUEST) {
>> GUC_REQUEST_ACTION action = msg.code;
>> } else if (type == GUC_MSG_TYPE_RESPONSE) {
>> GUC_RESPONSE_STATUS status = msg.code;
>> }
>>
>
>IPTCA
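A compilable version of the single-header union sketched above (the field and constant names follow Michal's proposal, not the patch; the LSB-first bitfield packing is the same compiler assumption the rest of the interface already makes):

```c
#include <assert.h>

#define GUC_MSG_TYPE_REQUEST	0x0
#define GUC_MSG_TYPE_RESPONSE	0xF

/* One header definition; the 'type' field selects how code/data are
 * interpreted. Assumes LSB-first bitfield packing (GCC, little-endian). */
union guc_msg_header {
	struct {
		unsigned int code : 16; /* action or status */
		unsigned int data : 12;
		unsigned int type : 4;  /* GUC_MSG_TYPE_* */
	};
	unsigned int value;
};

/* Extract the type field from a raw SOFT_SCRATCH_0 dword. */
static unsigned int guc_msg_type(unsigned int raw)
{
	union guc_msg_header msg = { .value = raw };

	return msg.type;
}
```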
>
>>> +
>>> +#define GUC_MAKE_HOST2GUC_RESPONSE(_status, _return_data) \
>>> + ((GUC_MSG_TYPE_HOST2GUC_RESPONSE << 28) | \
>>> + ((_return_data & 0xFFF) << 16) | \
>>> + (_status & 0xFFFF))
>>> +#define GUC_MAKE_HOST2GUC_STATUS(a) (GUC_MAKE_HOST2GUC_RESPONSE(a, 0))
>>
>> I'm not sure we need helper macros to be part of the ABI definition
>
>Agreed, I will remove this.
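For reference, the packing this macro performs (status in bits [15:0], return data in [27:16], type in [31:28]) can be checked in isolation; this restates the macro from the patch with the arguments parenthesized defensively:

```c
#include <assert.h>

#define GUC_MSG_TYPE_HOST2GUC_RESPONSE 0xF

/* Restatement of GUC_MAKE_HOST2GUC_RESPONSE from the patch:
 * type -> [31:28], return data -> [27:16], status -> [15:0]. */
#define GUC_MAKE_HOST2GUC_RESPONSE(_status, _return_data) \
	(((unsigned int)GUC_MSG_TYPE_HOST2GUC_RESPONSE << 28) | \
	 (((_return_data) & 0xFFF) << 16) | \
	 ((_status) & 0xFFFF))
```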
>
>>
>>> +
>>> +enum guc_cmd_transport_buffer_type {
>>> + GUC_CMD_TRANSPORT_BUFFER_HOST2GUC = 0x00,
>>> + GUC_CMD_TRANSPORT_BUFFER_GUC2HOST = 0x01,
>>> + GUC_CMD_TRANSPORT_BUFFER_MAX_TYPE = 0x02,
>>
>> where would we need MAX_TYPE ?
>
>No idea, you are the cmd transport buffer expert :)
>IPTCA
>
>>
>>> +};
>>> +
>>> +struct guc_cmd_transport_buffer_desc {
>>> + u32 buffer_begin_gfx_address;
>>> + u64 buffer_begin_virtual_address;
>>
>> virtual address in ABI again ?
>>
>>> + u32 buffer_size_in_bytes;
>>> + /* GuC uKernel updates this */
>>> + u32 head_offset;
>>> + /* GuC client updates this */
>>> + u32 tail_offset;
>>> + u32 is_in_error;
>>> + /* A DW provided by H2G item that was requested to be written */
>>> + u32 fence_report_dw;
>>> + /* Status associated with above fence_report_dw */
>>> + u32 status_report_dw;
>>
>> hmm, as we usually setup both H2G and G2H buffers for req/resp, why do
>> we need
>> this additional mechanism to let GuC write fence/status into descriptor ?
>
>IPTCA
>
>>
>>> + /* ID associated with this buffer (assigned by GuC master) */
>>> + u32 client_id;
>>> + /* Used & set by the client for further tracking of internal clients */
>>> + u32 client_sub_tracking_id;
>>> + u32 reserved[5];
>>> +};
>>> +
>>> +/* Per client command transport buffer allocated by GuC master */
>>> +struct guc_master_cmd_transport_buffer_alloc {
>>> + /* This is the copy that GuC trusts */
>>> + struct guc_cmd_transport_buffer_desc buffer_desc;
>>> + u32 future;
>>> + u64 reserved0;
>>
>> please keep future/reserved fields together at the end of struct
>
>I'll pass a general comment about this
>>
>>> + u32 usage_special_info;
>>
>> what's the format of this ?
>
>No idea. IPTCA
>
>>
>>> + u32 valid;
>>
>> 32b for single flag ?
>
>IPTCA
>
>>
>>> + u32 associated_g2h_index;
>>> + u32 reserved1;
>>> +};
>>> +
>>> +/* Host 2 GuC Work Item
>>> + * V-------------------------------------------------V
>>> + * ***************************************************
>>> + * *                 *   DW0/   *     *     *     *
>>> + * * H2G Item Header * ReturnDW * DW1 * ... * DWn *
>>> + * ***************************************************
>>> + */
>>> +
>>> +/* Command buffer header */
>>> +struct guc_cmd_buffer_item_header {
>>> + union {
>>> + struct {
>>> + /* Number of dwords that are parameters of this
>>> + * Host2GuC action. Max of 31. E.g.: if there are 2 DWs
>>> + * following this header, this field is set to 2
>>> + */
>>> + u32 num_dwords : 5;
>>> +
>>> + u32 : 3;
>>> +
>>> + /* The uKernel will write the value from DW0 (aka
>>> + * ReturnDW) to fence_report_dw in struct
>>> + * guc_cmd_transport_buffer_desc
>>> + */
>>> + u32 write_fence_from_dw0_to_descriptor : 1;
>>> +
>>> + /* Write the status of the action to DW0 following this
>>> + * header
>>> + */
>>> + u32 write_status_to_dw0 : 1;
>>> +
>>> + /* Send a GuC2Host with Status of the action and the
>>> + * fence ID found in DW0 via the buffer used for GuC to
>>> + * Host communication
>>> + */
>>> + u32 send_status_with_dw0_via_guc_to_host : 1;
>>> +
>>> + u32 : 5;
>>> +
>>> + /* This is the value of the enum guc_host2guc_action
>>> + * that needs to be done by the uKernel
>>> + */
>>> + u32 host2guc_action : 16;
>>> + } h2g_cmd_buffer_item_header;
>>> +
>>> + struct {
>>> + /* Number of dwords that are parameters of this GuC2Host
>>> + * action
>>> + */
>>> + u32 num_dwords : 5;
>>> +
>>> + u32 : 3;
>>> +
>>> + /* Indicates that this GuC2Host action is a response to
>>> + * a Host2GuC request
>>> + */
>>> + u32 host2_guc_response : 1;
>>> +
>>> + u32 reserved : 7;
>>> +
>>> + /* struct guc_to_host_message */
>>> + u32 guc2host_action : 16;
>>> + } g2h_cmd_buffer_item_header;
>>> +
>>> + struct {
>>> + u32 num_dwords : 5;
>>> + u32 reserved : 3;
>>> + u32 free_for_client_use : 24;
>>> + } generic_cmd_buffer_item_header;
>>> +
>>> + u32 header_value;
>>> + };
>>
>> hmm, there was a long discussion about unifying CTB H2G and G2H messages,
>> and final proposal/agreement was to use:
>>
>> struct guc_ctb_header {
>> u32 num_dwords : 8;
>> u32 flags : 8;
>> u32 action_code : 16;
>> };
>> and
>> #define CTB_FLAGS_NONE 0
>> #define CTB_FLAG_FENCE_PRESENT 1
>> and
>> #define GUC_ACTION_RESPONSE GUC_ACTION_DEFAULT
>>
>> then
>> to send H2G request:
>> .action = ACTION (!= RESPONSE)
>> .flags |= FENCE_PRESENT
>> to send H2G notification:
>> .action = ACTION (!= RESPONSE)
>> .flags = FLAGS_NONE
>> to send G2H notification:
>> .action = ACTION (!= RESPONSE)
>> .flags = FLAGS_NONE
>> to send G2H response:
>> .action = ACTION_RESPONSE
>> .flags |= FENCE_PRESENT
>>
>
>IPTCA
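For illustration, the unified header proposed above would compose like this. The union wrapper is added here only to make the sketch self-contained; bitfield ordering is compiler-dependent, so a real ABI definition would pin the layout with explicit masks and shifts:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the proposed unified CTB header. Flag values are the ones
 * quoted above; per that scheme an H2G request carries a fence, while
 * notifications use FLAGS_NONE.
 */
#define CTB_FLAGS_NONE         0
#define CTB_FLAG_FENCE_PRESENT 1

union guc_ctb_header {
	struct {
		uint32_t num_dwords  : 8;
		uint32_t flags       : 8;
		uint32_t action_code : 16;
	};
	uint32_t value;
};

static union guc_ctb_header ctb_h2g_request(uint16_t action,
					    uint8_t num_dwords)
{
	union guc_ctb_header h = { .value = 0 };

	h.num_dwords = num_dwords;
	h.flags = CTB_FLAG_FENCE_PRESENT; /* request, so fence present */
	h.action_code = action;
	return h;
}
```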
>
>>> +
>>> +};
>>> +
>>> +struct guc_to_host_message {
>>> + union {
>>> + struct {
>>> + u32 uk_init_done : 1;
>>> + u32 crash_dump_posted : 1;
>>> + u32 : 1;
>>> + u32 flush_log_buffer_to_file : 1;
>>> + u32 preempt_request_old_preempt_pending : 1;
>>> + u32 preempt_request_target_context_bad : 1;
>>> + u32 : 1;
>>> + u32 sleep_entry_in_progress : 1;
>>> + u32 guc_in_debug_halt : 1;
>>> + u32 guc_report_engine_reset_context_id : 1;
>>> + u32 : 1;
>>> + u32 host_preemption_complete : 1;
>>> + u32 reserved0 : 4;
>>> + u32 gpa_to_hpa_xlation_error : 1;
>>> + u32 doorbell_id_allocation_error : 1;
>>> + u32 doorbell_id_allocation_invalid_ctx_id : 1;
>>> + u32 : 1;
>>> + u32 force_wake_timed_out : 1;
>>> + u32 force_wake_time_out_counter : 2;
>>> + u32 : 1;
>>> + u32 iommu_cat_page_faulted : 1;
>>> + u32 host2guc_engine_reset_complete : 1;
>>> + u32 reserved1 : 2;
>>> + u32 doorbell_selection_error : 1;
>>> + u32 doorbell_id_release_error : 1;
>>> + u32 uk_exception : 1;
>>> + u32 : 1;
>>> + };
>>> + u32 dw;
>>> + };
>>
>> this error bitmap definition is more applicable to MMIO based
>> notification, for CTB we could introduce something more flexible
>>
>> btw, even for MMIO based notification we can reuse:
>>
>> union guc_msg_header {
>> struct {
>> u32 code : 16;
>> u32 data : 12;
>> u32 type : 4; /* GUC_MSG_TYPE */
>> };
>> u32 value;
>> };
>>
>> in SCRATCH(15) with new:
>>
>> type = GUC_MSG_TYPE_NOTIF = 0x1,
>> type = GUC_MSG_TYPE_ERROR = 0xE,
>>
>> where code/data can hold extra details, for NOTIF:
>> uk_init_done = 1,
>> crash_dump_posted,
>> preempt_request_old_preempt_pending,
>> host_preemption_complete,
>> for ERROR:
>> preempt_request_target_context_bad,
>> doorbell_selection_error,
>> btw, some of these errors seem to be action specific,
>> so maybe they should be reported back in STATUS ?
>
>IPTCA
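A sketch of what the suggested reuse would look like on the receiving side, splitting NOTIF from ERROR on the type field. The dispatch helper and its name are illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* guc_msg_header layout quoted above: code [15:0], data [27:16],
 * type [31:28], with the suggested new type values.
 */
#define GUC_MSG_TYPE_NOTIF 0x1u
#define GUC_MSG_TYPE_ERROR 0xEu

static uint32_t hdr_type(uint32_t v) { return v >> 28; }
static uint32_t hdr_data(uint32_t v) { return (v >> 16) & 0xFFFu; }
static uint32_t hdr_code(uint32_t v) { return v & 0xFFFFu; }

/* Returns 1 for a notification, -1 for an error, 0 otherwise */
static int classify_scratch15(uint32_t v)
{
	switch (hdr_type(v)) {
	case GUC_MSG_TYPE_NOTIF:
		return 1;
	case GUC_MSG_TYPE_ERROR:
		return -1;
	default:
		return 0;
	}
}
```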
>
>>
>>> +
>>> +};
>>> +
>>> +/* Size of the buffer to save GuC's state before S3. The address of
>>> + * the buffer goes in guc_additional_data_structs
>>> + */
>>> +#define GUC_MAX_GUC_S3_SAVE_SPACE_PAGES 10
>>> +
>>> +/* MMIO Offset for status of sleep state enter request */
>>> +#define GUC_SLEEP_STATE_ENTER_STATUS 0xC1B8
>>
>> hmm, we used SCRATCH(14) for H2G, good to know that it will be used
>> for G2H
>>
>>> +
>>> +/* Status of sleep request. Value updated in GUC_SLEEP_STATE_ENTER_STATUS */
>>> +enum guc_sleep_state_enter_status {
>>> + GUC_SLEEP_STATE_ENTER_SUCCESS = 1,
>>> + GUC_SLEEP_STATE_ENTER_PREEMPT_TO_IDLE_FAILED = 2,
>>> + GUC_SLEEP_STATE_ENTER_ENG_RESET_FAILED = 3,
>>> +};
>>> +
>>> +
>>> +/* Enum to determine what mode the scheduler is in */
>>> +enum guc_scheduler_mode {
>>> + GUC_SCHEDULER_MODE_NORMAL,
>>> + GUC_SCHEDULER_MODE_STALL_IMMEDIATE,
>>> +};
>>> +
>>> +
>>> +/*****************************************************************************
>>> + ********************************** Logging **********************************
>>> + *****************************************************************************/
>>> +
>>
>> maybe log interface can be defined in separate file ?
>
>IPTCA
>
>>
>>> +enum guc_log_buffer_type {
>>> + GUC_LOG_BUFFER_TYPE_ISR = 0x0,
>>> + GUC_LOG_BUFFER_TYPE_DPC = 0x1,
>>> + GUC_LOG_BUFFER_TYPE_CRASH = 0x2,
>>> + GUC_LOG_BUFFER_TYPE_MAX = 0x3,
>>> +};
>>> +
>>> +enum guc_log_verbosity {
>>> + GUC_LOG_VERBOSITY_LOW = 0x0,
>>> + GUC_LOG_VERBOSITY_MED = 0x1,
>>> + GUC_LOG_VERBOSITY_HIGH = 0x2,
>>> + GUC_LOG_VERBOSITY_ULTRA = 0x3,
>>> + GUC_LOG_VERBOSITY_MAX = 0x4,
>>> +};
>>> +
>>> +/* This enum controls the type of logging output. Can be changed dynamically
>>> + * using GUC_HOST2GUC_ACTION_LOG_VERBOSITY_LOGOUTPUT_SELECT
>>> + */
>>> +enum guc_logoutput_selection {
>>> + GUC_LOGOUTPUT_LOGBUFFER_ONLY = 0x0,
>>> + GUC_LOGOUTPUT_NPK_ONLY = 0x1,
>>> + GUC_LOGOUTPUT_LOGBUFFER_AND_NPK = 0x2,
>>> + GUC_LOGOUTPUT_MAX
>>> +};
>>> +
>>> +/* Filled by KMD except version and marker that are initialized by uKernel */
>>> +struct guc_km_log_buffer_state {
>>> + /* Marks the beginning of Buffer Flush (set by uKernel at Log
>>> + * Buffer Init)
>>> + */
>>> + u32 marker[2];
>>> +
>>> + /* This is the last byte offset location that was read by KMD. KMD
>>> + * will write to this and uKernel will read it
>>> + */
>>> + u32 log_buf_rd_ptr;
>>> +
>>> + /* This is the byte offset location that will be written by uKernel */
>>> + u32 log_buf_wr_ptr;
>>> +
>>> + u32 log_buf_size;
>>> +
>>> + /* This is written by uKernel when it sees the log buffer becoming
>>> + * half full. KMD writes this value in the log file to avoid stale data
>>> + */
>>> + u32 sampled_log_buf_wrptr;
>>> +
>>> + union {
>>> + struct {
>>> + /* uKernel sets this when log buffer is half full or
>>> + * when a forced flush has been requested through
>>> + * Host2Guc. uKernel will send Guc2Host only if this
>>> + * bit is cleared. This is to avoid unnecessary
>>> + * interrupts from GuC
>>> + */
>>> + u32 log_buf_flush_to_file : 1;
>>> +
>>> + /* uKernel increments this when log buffer overflows */
>>> + u32 buffer_full_count : 4;
>>> +
>>> + u32 : 27;
>>> + };
>>> + u32 log_buf_flags;
>>> + };
>>> +
>>> + u32 version;
>>> +};
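Given the rd/wr byte offsets above, the amount the KMD still has to copy out follows the usual ring arithmetic; a small sketch (the wrap-around convention is assumed here, the patch does not spell it out):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative: bytes the KMD has yet to read from the log ring, given
 * log_buf_rd_ptr (rd), log_buf_wr_ptr (wr) and log_buf_size (size)
 * from guc_km_log_buffer_state. Assumes offsets wrap modulo size.
 */
static uint32_t log_unread_bytes(uint32_t rd, uint32_t wr, uint32_t size)
{
	return (wr >= rd) ? (wr - rd) : (size - rd + wr);
}
```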
>>> +
>>> +/* Logging Parameters sent via struct sched_control_data. Maintained as a
>>> + * separate structure to allow debug tools to access logs without contacting
>>> + * GuC (for when GuC is stuck in ISR)
>>> + */
>>> +struct guc_log_init_params {
>>> + union {
>>> + struct {
>>> + u32 is_log_buffer_valid : 1;
>>> + /* Raise GuC2Host interrupt when buffer is half full */
>>> + u32 notify_on_log_half_full : 1;
>>> + u32 : 1;
>>> + /* 0 = Pages, 1 = Megabytes */
>>> + u32 allocated_count_units : 1;
>>> + /* No. of units allocated -1 (MAX 4 Units) */
>>> + u32 crash_dump_log_allocated_count : 2;
>>> + /* No. of units allocated -1 (MAX 8 Units) */
>>> + u32 dpc_log_allocated_count : 3;
>>> + /* No. of units allocated -1 (MAX 8 Units) */
>>> + u32 isr_log_allocated_count : 3;
>>> + /* Page aligned address for log buffer */
>>> + u32 log_buffer_gfx_address : 20;
>>> + };
>>> + u32 log_dword_value;
>>> + };
>>> +};
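The 20-bit log_buffer_gfx_address field presumably holds bits [31:12] of a 4K-page-aligned graphics address; a sketch of the conversion, with that alignment assumption made explicit:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative: pack/unpack the 20-bit page-aligned address field.
 * Assumes the field carries bits [31:12] of a 4K-aligned address.
 */
static uint32_t log_addr_field(uint32_t gfx_addr)
{
	assert((gfx_addr & 0xFFFu) == 0); /* must be page aligned */
	return gfx_addr >> 12;
}

static uint32_t log_addr_from_field(uint32_t field)
{
	return field << 12;
}
```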
>>> +
>>> +/* Pass info for doing a Host2GuC request (GUC_HOST2GUC_ACTION_ENABLE_LOGGING)
>>> + * in order to enable/disable GuC logging
>>> + */
>>> +struct guc_log_enable_params {
>>> + union {
>>> + struct {
>>> + u32 logging_enabled : 1;
>>> + u32 profile_logging_enabled : 1;
>>> + u32 log_output_selection : 2;
>>> + u32 log_verbosity : 4;
>>> + u32 default_logging_enabled : 1;
>>> + u32 : 23;
>>> + };
>>> + u32 log_enable_dword_value;
>>> + };
>>> +
>>> +};
>>> +
>>> +
>>> +/*****************************************************************************
>>> + ************* Sched Control Data and Additional Data Structures *************
>>> + *****************************************************************************/
>>> +
>>> +/* Holds the init values of various parameters used by the uKernel */
>>> +struct guc_sched_control_data {
>>
>> is this good name? I can see log related params below...
>
>Agreed, the name choice is terrible. IPTCA.
>
>>
>>> + /* Dword 0 */
>>> + union {
>>> + struct {
>>> + /* Num of contexts in pool in blocks of 16,
>>> + * E.g.: num_contexts_in_pool16_blocks = 1 if 16
>>> + * contexts, 64 if 1024 contexts allocated
>>> + */
>>> + u32 num_contexts_in_pool16_blocks : 12;
>>> +
>>> + /* Aligned bits [31:12] of the GFX address where the
>>> + * pool begins
>>> + */
>>> + u32 context_pool_gfx_address_begin : 20;
>>> + };
>>> + };
>>> +
>>> + /* Dword 1 */
>>> + struct guc_log_init_params log_init_params;
>>
>> can we keep related data together ? dw4 also has log params
>
>I already made this same comment in the past. It simply hasn't been
>incorporated yet.
>>
>>> +
>>> +
>>> + /* Dword 2 */
>>> + union {
>>> + struct {
>>> + u32 reserved : 1;
>>> + u32 wa_disable_dummy_all_engine_fault_fix : 1;
>>> + u32 : 30;
>>> + };
>>> + u32 workaround_dw;
>>> + };
>>> +
>>> + /* Dword 3 */
>>> + union {
>>> + struct {
>>> + u32 ftr_enable_preemption_data_logging : 1;
>>> + u32 ftr_enable_guc_pavp_control : 1;
>>> + u32 ftr_enable_guc_slpm : 1;
>>> + u32 ftr_enable_engine_reset_on_preempt_failure : 1;
>>> + u32 ftr_lite_restore : 1;
>>> + u32 ftr_driver_flr : 1;
>>> + u32 future : 1;
>>> + u32 ftr_enable_psmi_logging : 1;
>>> + u32 : 1;
>>> + u32 : 1;
>>> + u32 : 1;
>>> + u32 : 1;
>>> + u32 : 1;
>>> + u32 : 1;
>>> + u32 : 18;
>>> + };
>>> + u32 feature_dword;
>>> + };
>>> +
>>> + /* Dword 4 */
>>> + union {
>>> + struct {
>>> + /* One of enum guc_log_verbosity */
>>> + u32 logging_verbosity : 4;
>>> + /* One of enum guc_logoutput_selection */
>>> + u32 log_output_selection : 2;
>>> + u32 logging_disabled : 1;
>>> + u32 profile_logging_enabled : 1;
>>> + u32 : 24;
>>> + };
>>> + };
>>> +
>>> + /* Dword 5 */
>>> + union {
>>> + struct {
>>> + u32 : 1;
>>> + u32 gfx_address_additional_data_structs : 21;
>>> + u32 : 10;
>>> + };
>>> + };
>>> +
>>> +};
>>> +
>>> +/* Structure to pass additional information and structure pointers to the
>>> + * uKernel
>>> + */
>>> +struct guc_additional_data_structs {
>>> + /* Gfx ptr to struct guc_mmio_save_restore_list (persistent) */
>>> + u32 gfx_address_mmio_save_restore_list;
>>> +
>>> + /* Buffer of size GUC_MAX_GUC_S3_SAVE_SPACE_PAGES (persistent) */
>>> + u32 gfx_ptr_to_gucs_state_save_buffer;
>>> +
>>> + /* Gfx address of struct guc_scheduling_policies (non-persistent,
>>> + * may be released after initial load); NULL or valid = 0 flag value
>>> + * will cause default policies to be loaded
>>> + */
>>> + u32 gfx_scheduler_policies;
>>> +
>>> + /* Gfx address of struct guc_gt_system_info */
>>> + u32 gt_system_info;
>>> +
>>> + u32 future;
>>> +
>>> + u32 gfx_ptr_to_psmi_log_control_data;
>>> +
>>> + /* LRCA addresses and sizes of golden contexts (persistent) */
>>> + u32 gfx_golden_context_lrca[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
>>> + u32 golden_context_eng_state_size_in_bytes[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
>>
>> maybe this is could be more future friendly if we define it as the last
>> member:
>>
>> struct {
>> u32 lrca_addrs;
>> u32 eng_state_size_in_bytes;
>> } gfx_golden_context[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
>
>IPTCA
>
>>
>>> +
>>> + u32 reserved[16];
>>> +};
>>> +
>>> +/* Max number of mmio per engine class per engine instance */
>>> +#define GUC_MAX_MMIO_PER_SET 64
>>> +
>>> +struct guc_mmio_flags {
>>> + union {
>>> + struct {
>>> + u32 masked : 1;
>>> + u32 : 31;
>>> + };
>>> + u32 flags_value;
>>> + };
>>> +};
>>> +
>>> +struct guc_mmio {
>>> + u32 offset;
>>> + u32 value;
>>> + struct guc_mmio_flags flags;
>>> +};
>>> +
>>> +struct guc_mmio_set {
>>> + /* Array of mmio to be saved/restored */
>>> + struct guc_mmio mmio[GUC_MAX_MMIO_PER_SET];
>>> + /* Set after saving mmio value, cleared after restore. */
>>> + u32 mmio_values_valid;
>>> + /* Number of mmio in the set */
>>> + u32 number_of_mmio;
>>> +};
>>> +
>>> +struct guc_mmio_save_restore_list {
>>> + struct guc_mmio_set
>>> + node_mmio_set[GUC_MAX_SCHEDULABLE_ENGINE_CLASS]
>>> + [GUC_MAX_ENGINE_INSTANCE_PER_CLASS];
>>> + u32 reserved[98];
>>> +};
>>> +
>>> +/* Policy flags to control scheduling decisions */
>>> +struct guc_scheduling_policy_flags {
>>> + union {
>>> + struct {
>>> + /* Should we reset engine when preemption failed within
>>> + * its time quantum
>>> + */
>>> + u32 reset_engine_upon_preempt_failure : 1;
>>> +
>>> + /* Should we preempt to idle unconditionally for the
>>> + * execution quantum expiry
>>> + */
>>> + u32 preempt_to_idle_on_quantum_expiry : 1;
>>> +
>>> + u32 : 30;
>>> + };
>>> + u32 policy_dword;
>>> + };
>>> +};
>>> +
>>> +/* Per-engine class and per-priority struct for scheduling policy */
>>> +struct guc_scheduling_policy {
>>> + /* Time for one workload to execute (micro seconds) */
>>> + u32 execution_quantum;
>>> +
>>> + /* Time to wait for a preemption request to complete before
>>> + * issuing a reset (micro seconds)
>>> + */
>>> + u32 wait_for_preemption_completion_time;
>>> +
>>> + /* How much time to allow to run after the first fault is observed.
>>> + * Then preempt afterwards (micro seconds)
>>> + */
>>> + u32 quantum_upon_first_fault_time;
>>> +
>>> + struct guc_scheduling_policy_flags policy_flags;
>>> +
>>> + u32 reserved[8];
>>> +};
>>> +
>>> +/* KMD should populate this struct and pass info through struct
>>> + * guc_additional_data_structs. If KMD does not set the scheduler
>>> + * policy, uKernel will fall back to default scheduling policies
>>> + */
>>> +struct guc_scheduling_policies {
>>> + struct guc_scheduling_policy
>>> + per_submit_queue_policy[GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT]
>>> + [GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
>>> +
>>> + /* Submission queue depth, min 2, max 8. If outside the valid
>>> + * range, default value is used
>>> + */
>>> + u32 submission_queue_depth[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
>>
>> as it looks that most of the arrays are indexed by
>> GUC_MAX_SCHEDULABLE_ENGINE_CLASS then maybe all data structures
>> should be designed around engine like:
>>
>> struct guc_engine {
>> u8 class;
>> u8 instance;
>>
>> u32 submission_queue_depth;
>> struct guc_scheduling_policy
>> policy[GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT];
>> };
>
>IPTCA
>
>>> +
>>> + /* How much time to allow before DPC processing is called back via
>>> + * interrupt (to prevent DPC queue drain starving), in micro seconds.
>>> + * Typically in the 1000s (example only, not granularity)
>>> + */
>>> + u32 dpc_promote_time;
>>> +
>>> + /* Must be set to take these new values */
>>> + u32 is_valid;
>>> +
>>> + /* Number of WIs to process per call to process single. Process
>>> + * single could have a large Max Tail value which may keep CS idle.
>>> + * Process max_num_work_items_per_dpc_call WIs and try fast schedule
>>> + */
>>> + u32 max_num_work_items_per_dpc_call;
>>> +
>>> + u32 reserved[4];
>>> +};
>>> +
>>> +#pragma pack()
>>> +
>>> +#endif
>>
>>
>> Thanks,
>> Michal