[Intel-gfx] [RFC PATCH] drm/i915/guc: New interface files for GuC starting in Gen11

Thu Jun 21 15:13:30 UTC 2018

On Thu, Jun 14, 2018 at 10:48:01PM +0000, Srivatsa, Anusha wrote:

Overall I don't see any biggest issue with this file and I agree with
most of Michal's comments, specially with the nameless bitfields.

Well, maybe the nameless is a big issue if it breaks compilers out there.

But I'm responding here to some comments and questions that was pending
and hopefully put this thread back to top of people's inboxes ;)

more inline below:

> 
> 
> >-----Original Message-----
> >From: Mateo Lozano, Oscar
> >Sent: Wednesday, June 13, 2018 3:08 PM
> >To: Wajdeczko, Michal <Michal.Wajdeczko at intel.com>; intel-
> >gfx at lists.freedesktop.org
> >Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>; Rogovin, Kevin
> ><kevin.rogovin at intel.com>; Spotswood, John A <john.a.spotswood at intel.com>;
> >Srivatsa, Anusha <anusha.srivatsa at intel.com>; Ceraolo Spurio, Daniele
> ><daniele.ceraolospurio at intel.com>; Thierry, Michel <michel.thierry at intel.com>;
> >Chris Wilson <chris at chris-wilson.co.uk>; Winiarski, Michal
> ><michal.winiarski at intel.com>; Lis, Tomasz <tomasz.lis at intel.com>; Ewins, Jon
> ><jon.ewins at intel.com>; Sundaresan, Sujaritha
> ><sujaritha.sundaresan at intel.com>; Patel, Jalpa <jalpa.patel at intel.com>; Li,
> >Yaodong <yaodong.li at intel.com>
> >Subject: Re: [RFC PATCH] drm/i915/guc: New interface files for GuC starting in
> >Gen11
> >
> >
> >
> >On 5/29/2018 7:59 AM, Michal Wajdeczko wrote:
> >> Hi,
> >>
> >> On Fri, 25 May 2018 23:59:35 +0200, Oscar Mateo <oscar.mateo at intel.com>
> >> wrote:
> >>
> >>> GuC interface has been redesigned (or cleaned up, rather) starting
> >>> with Gen11, as a stepping stone towards a new branching strategy
> >>> that helps maintain backwards compatibility with previous Gens, as
> >>> well as sideward compatibility with other OSes.
> >>>
> >>> The interface is split in two header files: one for the KMD and one
> >>> for clients of the GuC (which, in our case, happens to be the KMD
> >>> as well). SLPC interface files will come at a later date.
> >>>
> >>> Could we get eyes on the new interface header files, to make sure the
> >>> GuC team is moving in the right direction?
> >>>
> >>> Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
> >>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >>> Cc: Kevin Rogovin <kevin.rogovin at intel.com>
> >>> Cc: John A Spotswood <john.a.spotswood at intel.com>
> >>> Cc: Anusha Srivatsa <anusha.srivatsa at intel.com>
> >>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> >>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> >>> Cc: Michel Thierry <michel.thierry at intel.com>
> >>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Michał Winiarski <michal.winiarski at intel.com>
> >>> Cc: Tomasz Lis <tomasz.lis at intel.com>
> >>> Cc: Jon Ewins <jon.ewins at intel.com>
> >>> Cc: Sujaritha Sundaresan <sujaritha.sundaresan at intel.com>
> >>> Cc: Jalpa Patel <jalpa.patel at intel.com>
> >>> Cc: Jackie Li <yaodong.li at intel.com>
> >>> ---
> >>>  drivers/gpu/drm/i915/intel_guc_client_interface.h | 255 +++++++
> >>>  drivers/gpu/drm/i915/intel_guc_kmd_interface.h    | 847
> >>> ++++++++++++++++++++++
> >>>  2 files changed, 1102 insertions(+)
> >>>  create mode 100644 drivers/gpu/drm/i915/intel_guc_client_interface.h
> >>>  create mode 100644 drivers/gpu/drm/i915/intel_guc_kmd_interface.h
> >>
> >> can we name these files as:
> >>
> >>     drivers/gpu/drm/i915/intel_guc_interface.h
> >>     drivers/gpu/drm/i915/intel_guc_interface_client.h
> >> or
> >>     drivers/gpu/drm/i915/intel_guc_defs.h
> >>     drivers/gpu/drm/i915/intel_guc_defs_client.h
> >> or
> >>     drivers/gpu/drm/i915/guc/guc.h
> >>     drivers/gpu/drm/i915/guc/guc_client.h
> >
> >I'm fine with any of these names.
> >
> >>
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_guc_client_interface.h
> >>> b/drivers/gpu/drm/i915/intel_guc_client_interface.h
> >>> new file mode 100644
> >>> index 0000000..1ef91a7
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/i915/intel_guc_client_interface.h
> >>> @@ -0,0 +1,255 @@
> >>> +/*
> >>> + * SPDX-License-Identifier: MIT
> >>> + *
> >>> + * Copyright © 2018 Intel Corporation
> >>> + */
> >>> +
> >>> +#ifndef _INTEL_GUC_CLIENT_INTERFACE_H_
> >>> +#define _INTEL_GUC_CLIENT_INTERFACE_H_
> >>> +
> >>
> >> need to include types.h for u32
> >
> >Will do.
> >
> >>
> >>> +#pragma pack(1)
> >>> +
> >>>
> >+/****************************************************************
> >*************
> >>>
> >>> + ********************************** Engines
> >>> **********************************
> >>> +
> >>>
> >******************************************************************
> >***********/
> >>
> >> no fancy markups, please
> >>
> >
> >Ok.
> >
> >>> +
> >>> +#define GUC_MAX_ENGINE_INSTANCE_PER_CLASS    4
> >>> +#define GUC_MAX_SCHEDULABLE_ENGINE_CLASS    5
> >>> +#define GUC_MAX_ENGINE_CLASS_COUNT        6
> >>> +#define GUC_ENGINE_INVALID            6
> >>
> >> hmm, why not 7 or 127 ?
> >> maybe if we need value for INVALID we should use 0 or -1 (~0)
> >
> >I'll pass this comment to the GuC team.

Any response from them on this and other raised items?

> >
> >>
> >>> +
> >>> +/* Engine Class that uKernel can schedule on. This is just a SW
> >>> enumeration.
> >>> + * HW configuration will depend on the Platform and SKU
> >>> + */
> >>> +enum uk_engine_class {
> >>
> >> why there is new prefix "uk" ?
> >
> >uk stands for uKernel. In this case, I'm guessing it is used to
> >differentiate between the engine class defined by hardware vs. the one
> >defined by the uKernel.
> >>
> >>> +    UK_RENDER_ENGINE_CLASS = 0,
> >>> +    UK_VDECENC_ENGINE_CLASS = 1,
> >>> +    UK_VE_ENGINE_CLASS = 2,
> >>> +    UK_BLT_COPY_ENGINE_CLASS = 3,
> >>> +    UK_RESERVED_ENGINE_CLASS = 4,
> >>> +    UK_OTHER_ENGINE_CLASS = 5,
> >>
> >> either use valid names or drop RESERVED/OTHER as values
> >>   from 0 to GUC_MAX_ENGINE_CLASS_COUNT are 'reserved' by
> >> definition unless explicitly defined ;)
> >
> >I'll drop them.
> >
> >>
> >>> +};
> >>
> >> as we don't use enum in binary struct definitions, then maybe we
> >> should define all engine classes with #define as:
> >>
> >>     #define ENGINE_CLASS_INVALID  0
> >>     #define ENGINE_CLASS_ALL      0
> >>     #define ENGINE_CLASS_RENDER   1
> >>     #define ENGINE_CLASS_VDECENC  2
> >>     ...
> >>     #define ENGINE_CLASS_MAX      7
> >>
> >
> >I can pass this comment to the GuC team. Or we can use defines for the
> >Linux header files, but then we might start deviating again from a
> >common interface naming.

yeap, better to keep in sync... indeed...

> >
> >> what if future HW will support more than 7 engine classes
> >> or they will be so different that they deserve separate id?
> >> why
> >
> >I imagine that's what the reserved is for. I cannot think of many more
> >engine classes, but I probably lack imagination.
> >
> >>
> >>> +
> >>> +/* Engine Instance that uKernel can schedule on */
> >>> +enum uk_engine_instance {
> >>> +    UK_ENGINE_INSTANCE_0 = 0,
> >>> +    UK_ENGINE_INSTANCE_1 = 1,
> >>> +    UK_ENGINE_INSTANCE_2 = 2,
> >>> +    UK_ENGINE_INSTANCE_3 = 3,
> >>> +    UK_INVALID_ENGINE_INSTANCE =
> >GUC_MAX_ENGINE_INSTANCE_PER_CLASS,
> >>> +    UK_ENGINE_ALL_INSTANCES = UK_INVALID_ENGINE_INSTANCE
> >>> +};
> >>
> >> I'm not sure why we would need this enum as we already have
> >> GUC_MAX_ENGINE_INSTANCE_PER_CLASS and can easily identify
> >> instance as [0 ... GUC_MAX_ENGINE_INSTANCE_PER_CLASS), or
> >> maybe more intuitive would be use normal indexing and use 0
> >> to indicate INVALID/AUTO/ALL instance ?
> >>
> >>     #define ENGINE_INSTANCE_INVALID  0
> >>     #define ENGINE_INSTANCE_ALL      0
> >>     #define ENGINE_INSTANCE_MAX      4
> >>
> >
> >I can pass this comment along.
> >
> >>> +
> >>> +/* Target Engine field used in WI header and Guc2Host */
> >>> +struct guc_target_engine {
> >>> +    union {
> >>> +        struct {
> >>> +            /* One of enum uk_engine_class */
> >>> +            u8 engine_class : 3;
> >>
> >> enum defines only 0..5 while in these 3 bits we can pass 0..7
> >>
> >
> >So? You can pass 6 and 7 if you want, but it would be invalid (as the
> >enum shows).
> >
> >>> +            /* One of enum uk_engine_instance */
> >>> +            u8 engine_instance : 4;
> >>
> >> again, mismatch between 'enum' and bitfields.
> >
> >Again, I don't understand the problem.
> >
> >>
> >>> +            /* All enabled engine classes and instances */
> >>> +            u8 all_engines : 1;
> >>
> >> inconsistency as there is dedicated value UK_ENGINE_ALL_INSTANCES
> >> for engine_instance but for engine_class there is separate bit.
> >
> >I'll pass this comment along.
> >
> >>
> >>> +        };
> >>> +        u8 value;
> >>> +    };
> >>> +};
> >>> +
> >>> +struct guc_engine_class_bit_map {
> >>> +    union {
> >>> +        /* Bit positions must match enum uk_engine_class value */
> >>> +        struct {
> >>> +            u32 render_engine_class : 1;
> >>> +            u32 vdecenc_engine_class : 1;
> >>> +            u32 ve_engine_class : 1;
> >>> +            u32 blt_copy_engine_class : 1;
> >>> +            u32 reserved_engine_class : 1;
> >>> +            u32 other_engine_class : 1;
> >>> +            u32 : 26;
> >>
> >> btw, iirc nameless bit fields may not correctly initialized on some
> >> compilers, so better to avoid them.
> >> also, do we want to use bitfields or stay with _SHIFT/_MASK macros?
> >
> >At some point we agreed that we were trying to keep the header files
> >from different OSes as close as possible.
> >IIRC, I raised the point that _SHIFT/_MASK macros are the preferred
> >approach in i915, and I was told this is an exceptional circumstance.

But even with the bitfields, the actual problem here are the nameless ones
not just this one here, but there are more below.

Could we at least add a "rsvd" name or something like that?

> >
> >>
> >>> +        };
> >>> +        u32 value;
> >>> +    };
> >>> +};
> >>> +
> >>> +struct guc_engine_instance_bit_map {
> >>> +    union {
> >>> +        struct {
> >>> +            u32 engine0 : 1;
> >>> +            u32 engine1 : 1;
> >>> +            u32 engine2 : 1;
> >>> +            u32 engine3 : 1;
> >>> +            u32 engine4 : 1;
> >>> +            u32 engine5 : 1;
> >>> +            u32 engine6 : 1;
> >>> +            u32 engine7 : 1;
> >>
> >> "engine" ? maybe they mean "instance" ?
> >> and if instance, then enum above defines only 0..3
> >>
> >
> >I'll pass the comment along.
> >
> >>> +            u32 : 24;
> >>> +        };
> >>> +        u32 value;
> >>> +    };
> >>> +};
> >>> +
> >>> +struct guc_engine_bit_map {
> >>> +    struct guc_engine_class_bit_map engine_class_bit_map;
> >>> +    struct guc_engine_instance_bit_map
> >>> +        engine_instance_bit_map[GUC_MAX_ENGINE_CLASS_COUNT];
> >>> +};
> >>> +
> >>>
> >+/****************************************************************
> >*************
> >>>
> >>> + ********************* Process Descriptor and Work Queue
> >>> *********************
> >>> +
> >>>
> >******************************************************************
> >***********/
> >>> +
> >>> +/* Status of a Work Queue */
> >>> +enum guc_queue_status {
> >>> +    GUC_QUEUE_STATUS_ACTIVE = 1,
> >>> +    GUC_QUEUE_STATUS_SUSPENDED = 2,
> >>> +    GUC_QUEUE_STATUS_CMD_ERROR = 3,
> >>> +    GUC_QUEUE_STATUS_ENGINE_ID_NOT_USED = 4,
> >>> +    GUC_QUEUE_STATUS_SUSPENDED_FROM_ENGINE_RESET = 5,
> >>> +    GUC_QUEUE_STATUS_INVALID_STATUS = 6,
> >>
> >> why not 0 ? what 0 means ?
> >
> >IPTCA (I'll pass the comment along)
> >
> >>
> >>> +};
> >>> +
> >>> +/* Priority of guc_context_descriptor */
> >>> +enum guc_context_priority {
> >>> +    GUC_CONTEXT_PRIORITY_KMD_HIGH = 0,
> >>> +    GUC_CONTEXT_PRIORITY_HIGH = 1,
> >>> +    GUC_CONTEXT_PRIORITY_KMD_NORMAL = 2,
> >>> +    GUC_CONTEXT_PRIORITY_NORMAL = 3,
> >>> +    GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT = 4,
> >>> +    GUC_CONTEXT_PRIORITY_INVALID =
> >>> GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT
> 	Is this correct?  priority of invalid context being same as GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT ?

yeap... it probably is right if handled correctly.

> 
> >>> +};
> 
> >>> +/* A shared structure between app and uKernel for communication */
> >>> +struct guc_sched_process_descriptor {
> >>> +    /* Index in the GuC Context Descriptor Pool */
> >>> +    u32 context_id;
> >>> +
> >>> +    /* Pointer to doorbell cacheline. BSpec: 1116 */
> >>> +    u64 p_doorbell;
> >>
> >> pointer ? maybe s/p_doorbell/doorbell_addr as this likely is phy addr
> >>
> >
> >IPTCA
> >
> >>> +
> >>> +    /* WQ Head Byte Offset - Client must not write here */
> >>> +    u32 head_offset;
> 	s/ head_offset/wq_head_offset?
> 
> >>> +    /* WQ Tail Byte Offset - uKernel will not write here */
> >>> +    u32 tail_offset;
> 	s/tail_offset/wq_tail_offset?

Because of the sync it is better to not touch and change the names here...
Whatever they decide if it doesn't conflict with anything else that
we already have in place.

> 
> >>> +    /* WQ Error Byte offset */
> >>> +    u32 error_offset_byte;
> 	Same here.
> 
> >>> +    /* WQ pVirt base address in Client. For use only by Client */
> >>> +    u64 wqv_base_address;
> >>
> >> client virtual address shall not be part of the GuC ABI
> >
> >Hmmm... if I understand this correctly, this is intended for Direct
> >Submission.
> >
> >>
> >>> +
> >>> +    /* WQ Size in Bytes */
> >>> +    u32 wq_size_bytes;
> >>> +
> >>> +    /* WQ Status. Read by Client. Written by uKernel/KMD */
> >>> +    enum guc_queue_status wq_status;
> >>
> >> we have many discussions about not using enums in packed structs
> >
> >Yes, and they were noted and tickets opened. They simply haven't
> >happened yet.
> >
> >>
> >>> +
> >>> +    /* Context priority. Read only by Client */
> >>> +    enum guc_context_priority priority_assigned;
> >>
> >> same here
> >
> >Same here.
> >
> >>
> >>> +
> >>> +    u32 future;
> 	Looks confusing. Is this to be used in future?

future, reserved, rsvd... whatever they decide is okay I believe... 

> 
> >> if we really need this, can it be merged with reserved0 ?
> >
> >It could, but this helps keep header files from different OSes as close
> >as possible.
> >
> >>
> >>> +
> >>> +    struct guc_engine_class_bit_map queue_engine_error;
> >>> +
> >>> +    u32 reserved0[3];
> >>> +
> >>> +    /* uKernel side tracking for debug */
> >>
> >> this should be separate struct definition
> >
> >IPTCA
> >
> >>
> >>> +
> >>> +    /* Written by uKernel at the time of parsing and successful removal
> >>> +     * from WQ (implies ring tail was updated)
> >>> +     */
> >>> +    u32 total_work_items_parsed_by_guc;
> >>> +
> >>> +    /* Written by uKernel if a WI was collapsed if next WI is the same
> >>> +     * LRCA (optimization applies only to Secure/KMD contexts)
> >>> +     */
> >>> +    u32 total_work_items_collapsed_by_guc;
> >>> +
> >>> +    /* Tells if the context is affected by Engine Reset. UMD needs to
> >>> +     * clear it after taking appropriate Action (TBD)
> >>> +     */
> >>> +    u32 is_context_in_engine_reset;
> >>> +
> >>> +    /* WQ Sampled tail at Engine Reset Time. Valid only if
> >>> +     * is_context_in_engine_reset = true
> >>> +     */
> >>> +    u32 engine_reset_sampled_wq_tail;
> >>> +
> >>> +    /* Valid from engine reset until all the affected Work Items are
> >>> +     * processed
> >>> +     */
> >>> +    u32 engine_reset_sampled_wq_tail_valid;
> >>> +
> >>> +    u32 reserved1[15];
> >>> +};
> >>> +
> >>> +/* Work Item for submitting KMD workloads into Work Queue for GuC */
> >>> +struct guc_sched_work_queue_kmd_element_info {
> >>> +    /* Execlist context descriptor's lower DW. BSpec: 12254 */
> >>> +    u32 element_low_dw;
> >>> +    union {
> >>> +        struct {
> >>> +            /* SW Context ID. BSpec: 12254 */
> >>> +            u32 sw_context_index : 11;
> >>> +            /* SW Counter. BSpec: 12254 */
> >>> +            u32 sw_context_counter : 6;
> >>> +            /* If this workload needs to be synced prior
> >>> +             * to submission use context_submit_sync_value and
> >>> +             * context_submit_sync_address
> >>> +             */
> >>> +            u32 needs_sync : 1;
> >>> +            /* QW Aligned, TailValue <= 2048
> >>> +             * (addresses 4 pages max)
> >>> +             */
> >>> +            u32 ring_tail_qw_index : 11;
> >>> +            u32 : 2;
> >>> +            /* Bit to indicate if POCS needs to be in FREEZE state
> >>> +             * for this WI submission
> >>> +             */
> >>> +            u32 wi_freeze_pocs : 1;
> >>> +        };
> >>> +        u32 value;
> >>> +    } element_high_dw;
> >>> +};
> >>> +
> >>> +/* Work Item instruction type */
> >>> +enum guc_sched_instruction_type {
> >>> +    GUC_SCHED_INSTRUCTION_BATCH_BUFFER_START = 0x1,
> >>> +    GUC_SCHED_INSTRUCTION_GUC_CMD_PSEUDO = 0x2,
> >>> +    GUC_SCHED_INSTRUCTION_GUC_CMD_KMD = 0x3,
> >>> +    GUC_SCHED_INSTRUCTION_GUC_CMD_NOOP = 0x4,
> >>> +    GUC_SCHED_INSTRUCTION_RESUME_ENGINE_WQ_PARSING = 0x5,
> >>> +    GUC_SCHED_INSTRUCTION_INVALID = 0x6,
> >>> +};
> >>> +
> >>> +/* Header for every Work Item put in the Work Queue */
> >>> +struct guc_sched_work_queue_item_header {
> >>> +    union {
> >>> +        struct {
> >>> +            /* One of enum guc_sched_instruction_type */
> >>> +            u32 work_instruction_type : 8;
> >>> +            /* struct guc_target_engine */
> >>> +            u32 target_engine : 8;
> >>> +            /* Length in number of dwords following this header */
> >>> +            u32 command_length_dwords : 11;
> >>> +            u32 : 5;
> >>> +        };
> >>> +        u32 value;
> >>> +    };
> >>> +};
> >>> +
> >>> +/* Work item for submitting KMD workloads into work queue for GuC */
> >>> +struct guc_sched_work_queue_item {
> >>> +    struct guc_sched_work_queue_item_header header;
> >>> +    struct guc_sched_work_queue_kmd_element_info
> >>> kmd_submit_element_info;
> >>> +    /* Debug only */
> >>> +    u32 fence_id;
> >>> +};
> >>> +
> >>> +struct km_gen11_resume_work_queue_processing_item {
> >>> +    struct guc_sched_work_queue_item header;
> >>> +};
> >>> +
> >>> +#pragma pack()
> >>> +
> >>> +#endif
> >>> diff --git a/drivers/gpu/drm/i915/intel_guc_kmd_interface.h
> >>> b/drivers/gpu/drm/i915/intel_guc_kmd_interface.h
> >>> new file mode 100644
> >>> index 0000000..4c643ad
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/i915/intel_guc_kmd_interface.h
> >>> @@ -0,0 +1,847 @@
> >>> +/*
> >>> + * SPDX-License-Identifier: MIT
> >>> + *
> >>> + * Copyright © 2018 Intel Corporation
> >>> + */
> >>> +
> >>> +#ifndef _INTEL_GUC_KMD_INTERFACE_H_
> >>> +#define _INTEL_GUC_KMD_INTERFACE_H_
> >>> +
> >>> +#include "intel_guc_client_interface.h"
> >>> +
> >>> +#pragma pack(1)
> >>> +
> >>> +/* Maximum number of entries in the GuC Context Descriptor Pool.
> >>> Upper limit
> >>> + * restricted by number of 'SW Context ID' bits in the Context
> >>> Descriptor
> >>> + * (BSpec: 12254) minus some reserved entries
> >>> + */
> >>> +#define GUC_MAX_GUC_CONTEXT_DESCRIPTOR_ENTRIES    2032
> >>> +
> >>> +/* Limited by 'SW Counter' bits. BSpec: 12254 */
> >>> +#define GUC_MAX_SW_CONTEXT_COUNTER    64
> >>> +
> >>> +/* Maximum depth of HW Execlist Submission Queue. BSpec: 18934 */
> >>> +#define GUC_MAX_SUBMISSION_Q_DEPTH    8
> >>> +
> >>> +/* Minimum depth of HW Execlist Submission Queue. BSpec: 18934 */
> >>> +#define GUC_MIN_SUBMISSION_Q_DEPTH    2
> >>> +
> >>> +/* Default depth of HW Execlist Submission Queue. BSpec: 18934 */
> >>> +#define GUC_DEFAULT_ELEM_IN_SUBMISSION_Q
> >GUC_MIN_SUBMISSION_Q_DEPTH
> >>> +
> >>> +/* 1 Cacheline = 64 Bytes */
> >>> +#define GUC_DMA_CACHELINE_SIZE_BYTES    64
> >>> +
> >>>
> >+/****************************************************************
> >*************
> >>>
> >>> + ************************** Engines and System Info
> >>> **************************
> >>> +
> >>>
> >******************************************************************
> >***********/
> >>> +
> >>> +/* GT system info passed down by KMD after reading fuse registers */
> >>> +struct guc_gt_system_info {
> >>> +    u32 slice_enabled;
> >>> +    u32 rcs_enabled;
> >>> +    u32 future0;
> >>> +    u32 bcs_enabled;
> >>> +    u32 vd_box_enable_mask;
> >>> +    u32 future1;
> >>> +    u32 ve_box_enable_mask;
> >>> +    u32 future2;
> >>> +    u32 reserved[8];
> >>
> >> what's the format of these u32 ?
> >> maybe guc_engine_bit_mask can be reused here ?
> >
> >Most of them are just booleans. Yes, guc_engine_bit_mask can probably be
> >reused. IPTCA
> >
> >>
> >>> +};
> >>> +
> >>>
> >+/****************************************************************
> >*************
> >>>
> >>> + ************************ GuC Context Descriptor Pool
> >>> ************************
> >>> +
> >>>
> >******************************************************************
> >***********/
> >>> +
> >>> +/* State of the context */
> >>> +struct guc_engine_context_state {
> >>> +    union {
> >>> +        struct {
> >>> +            u32 wait_for_display_event : 1;
> >>> +            u32 wait_for_semaphore : 1;
> >>> +            u32 re_enqueue_to_submit_queue : 1;
> >>> +            u32 : 29;
> >>> +        };
> >>> +        u32 wait_value;
> >>> +    };
> >>> +    u32 reserved;
> >>> +};
> >>> +
> >>> +/* To describe status and access information of current ring buffer for
> >>> + * a given guc_execlist_context
> >>> + */
> >>> +struct guc_execlist_ring_buffer {
> >>> +    u32 p_execlist_ring_context;
> >>> +
> >>> +    /* uKernel address for the ring buffer */
> >>> +    u32 p_ring_begin;
> >>> +    /* uKernel final byte address that is valid for this ring */
> >>> +    u32 p_ring_end;
> >>> +    /* uKernel address for next location in ring */
> >>> +    u32 p_next_free_location;
> >>
> >> hmm, uKernel private addresses are part of the ABI ?
> >
> >This simply means GGTT addresses (there is nothing really private to the
> >uKernel)
> >
> >>
> >>> +
> >>> +    /* Last value written by software for tracking (just in case
> >>> +     * HW corrupts the tail in its context)
> >>> +     */
> >>> +    u32 current_tail_pointer_value;
> >>> +};
> >>> +
> >>> +/* The entire execlist context including software and HW information */
> >>> +struct guc_execlist_context {
> >>> +    /* 2 DWs of Context Descriptor. BSpec: 12254 */
> >>> +    u32 hw_context_desc_dw[2];
> >>> +    u32 reserved0;
> >>> +
> >>> +    struct guc_execlist_ring_buffer ring_buffer_obj;
> >>> +    struct guc_engine_context_state state;
> >>> +
> >>> +    /* Flag to track if execlist context exists in submit queue
> >>> +     * Valid values 0 or 1
> >>> +     */
> >>> +    u32 is_present_in_sq;
> >>> +
> >>> +    /* If needs_sync is set in WI, sync *context_submit_sync_address ==
> >>> +     * context_submit_sync_value before submitting the context to HW
> >>> +     */
> >>> +    u32 context_submit_sync_value;
> >>> +    u32 context_submit_sync_address;
> >>> +
> >>> +    /* Reserved for SLPC hints (currently used for GT throttle
> >>> modes) */
> >>> +    u32 slpc_context_hints;
> >>
> >> you said that SLPC will be later ... is it necessary to put part of it
> >> here?
> >
> >"u32 future;" then?
> >
> >>
> >>> +
> >>> +    u32 reserved1[4];
> >>> +};
> >>> +
> >>> +/* Bitmap to track allocated and free contexts
> >>> + * context_alloct_bit_map[n] = 0; Context 'n' free
> >>> + * context_alloct_bit_map[n] = 1; Context 'n' allocated
> >>> + */
> >>> +struct guc_execlist_context_alloc_map {
> >>> +    /* Bit map for execlist contexts, bits 0 to
> >>> +     * (GUC_MAX_SW_CONTEXT_COUNTER - 1) are valid
> >>> +     */
> >>> +    u64 context_alloct_bit_map;
> >>
> >> maybe we should use GUC_MAX_SW_CONTEXT_COUNTER here:
> >>
> >>     u32 bitmap[GUC_MAX_SW_CONTEXT_COUNTER / sizeof(u32)];
> >>
> >
> >IPTCA
> >
> >>> +    u32 reserved;
> >>> +};
> >>> +
> >>> +enum guc_context_descriptor_type {
> >>> +    /* Work will be submitted through doorbell and WQ of a
> >>> +     * Proxy Submission descriptor in the context desc pool
> >>> +     */
> >>> +    GUC_CONTEXT_DESCRIPTOR_TYPE_PROXY_ENTRY = 0x00,
> >>> +
> >>> +    /* Work will be submitted using doorbell and workqueue
> >>> +     * of this descriptor on behalf of other proxy Entries
> >>> +     * in the context desc pool
> >>> +     */
> >>> +    GUC_CONTEXT_DESCRIPTOR_TYPE_PROXY_SUBMISSION = 0x01,
> >>> +
> >>> +    /* Work is submitted through its own doorbell and WQ */
> >>> +    GUC_CONTEXT_DESCRIPTOR_TYPE_REAL = 0x02,
> >>> +};
> >>> +
> >>> +/* CPU, Graphics and physical addresses */
> >>> +struct guc_address {
> >>> +    /* Cpu address (virtual) */
> >>> +    u64 p_cpu_address;
> >>> +    /* uKernel address (gfx) */
> >>> +    u32 p_uk_address;
> >>
> >> do we need to share cpu/uk addresses over ABI ?
> >
> >What is the alternative?
> >
> >>
> >>> +    /* Physical address */
> >>> +    u64 p_address_gpa;
> >>
> >> please drop p_ prefix
> >
> >IPTCA
> >
> >>
> >>> +};
> >>> +
> >>> +/* Context descriptor for communication between uKernel and KMD */
> >>> +struct guc_context_descriptor {
> >>> +    /* CPU back pointer for general KMD usage */
> >>> +    u64 assigned_guc_gpu_desc;
> >>
> >> private pointers shall not be part of the ABI
> >
> >Agreed in this particular case. Would you be OK with something like "u64
> >private_field" here?
> >
> >>
> >>> +
> >>> +    /* Index in the pool */
> >>> +    u32 guc_context_desc_pool_index;
> >>> +
> >>> +    /* For a Proxy Entry, this is the index of it's proxy submission
> >>> entry.
> >>> +     * For others this is the same as guc_context_desc_pool_index above
> >>> +     */
> >>> +    u32 proxy_submission_guc_context_desc_pool_index;
> >>> +
> >>> +    /* The doorbell page's trigger cacheline */
> >>> +    struct guc_address doorbell_trigger_address;
> >>> +
> >>> +    /* Assigned doorbell */
> >>> +    u32 doorbell_id;
> >>> +
> >>> +    /* Array of execlist contexts */
> >>> +    struct guc_execlist_context
> >>> +        uk_exec_list_context[GUC_MAX_SCHEDULABLE_ENGINE_CLASS]
> >>> +                    [GUC_MAX_SW_CONTEXT_COUNTER];
> >>> +
> >>> +    /* Allocation map to track which execlist contexts are in use */
> >>> +    struct guc_execlist_context_alloc_map
> >>> + uk_execlist_context_alloc_map[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
> >>
> >> hmm, maybe to make it future proof, we should define array as last
> >> member?
> >
> >That would help very little, since this whole struct is also stored in
> >an array (something that we still have to improve), but IPTCA.
> >
> >>
> >>> +
> >>> +    /* Number of active execlist contexts */
> >>> +    u32 uk_execlist_context_alloc_count;
> >>> +
> >>> +    /* Optimization to reduce the maximum execlist context count for
> >>> +     * this GuC context descriptor. Should be less than
> >>> +     * GUC_MAX_SW_CONTEXT_COUNTER
> >>> +     */
> >>> +    u32 max_uk_execlist_context_per_engine_class;
> >>> +
> >>> +    union {
> >>> +        struct {
> >>> +            /* Is this context actively assigned to an app? */
> >>> +            u32 is_context_active : 1;
> >>> +
> >>> +            /* Is this a proxy entry, principal or real entry? */
> >>> +            u32 context_type : 2;
> >>> +
> >>> +            u32 is_kmd_created_context : 1;
> >>
> >> can it be modified from/or the kmd ?
> >
> >The KMD sets this. I think this only makes sense when Direct Submission
> >is used.
> >
> >>
> >>> +
> >>> +            /* Context was part of an engine reset. KMD must take
> >>> +             * appropriate action (this context will not be
> >>> +             * resubmitted until this bit is cleared)
> >>> +             */
> >>> +            u32 is_context_eng_reset : 1;
> >>> +
> >>> +             /* Set it to 1 to prevent other code paths to do work
> >>> +              * queue processing as we use sampled values for WQ
> >>> +              * processing. Allowing multiple code paths to do WQ
> >>> +              * processing will cause same workload to execute
> >>> +              * multiple times
> >>> +              */
> >>> +            u32 wq_processing_locked : 1;
> >>> +
> >>> +            u32 future : 1;
> >>> +
> >>> +            /* If set to 1, the context is terminated by GuC. All
> >>> +             * the pending work is dropped, its doorbell is evicted
> >>> +             * and eventually this context will be removed
> >>> +             */
> >>> +            u32 is_context_terminated : 1;
> >>> +
> >>> +            u32 : 24;
> >>> +        };
> >>> +        u32 bool_values;
> >>> +    };
> >>> +
> >>> +    enum guc_context_priority priority;
> >>
> >> again, enum in packed struct
> >
> >Yup, we already knew about this.
> >
> >>
> >>> +
> >>> +    /* WQ tail Sampled and set during doorbell ISR handler */
> >>> +    u32 wq_sampled_tail_offset;
> >>> +
> >>> +    /* Global (across all submit queues). For principals
> >>> +     * (proxy entry), this will be zero and true count
> >>> +     * will be reflected in its proxy (proxy submission)
> >>> +     */
> >>> +    u32 total_submit_queue_enqueues;
> >>> +
> >>> +    /* Pointer to struct guc_sched_process_descriptor */
> >>> +    u32 p_process_descriptor;
> >>> +
> >>> +    /* Secure copy of WQ address and size. uKernel can not
> >>> +     * trust data in struct guc_sched_process_descriptor
> >>> +     */
> >>> +    u32 p_work_queue_address;
> >>> +    u32 work_queue_size_bytes;
> >>> +
> >>> +    u32 future0;
> >>> +    u32 future1;
> Same comment on all "future x" fields. What is the purpose? 
> 
> 
> Anusha 
> >> drop or keep together with reserved
> >
> >Again, it was done like this to keep differences to a minimum.
> >
> >>
> >>> +
> >>> +    struct guc_engine_class_bit_map queue_engine_error;
> >>
> >> is this error map per context? is there global error map available?
> >
> >I'm not very familiar with the GuC reset procedure, but this since to be
> >per-context and I don't see any global equivalent thing.
> >
> >>
> >>> +
> >>> +    u32 reserved0[3];
> >>> +    u64 reserved1[12];
> >>> +};
> >>> +
> >>>
> >+/****************************************************************
> >*************
> >>>
> >>> + *************************** Host2GuC and GuC2Host
> >>> ***************************
> >>> +
> >>>
> >******************************************************************
> >***********/
> >>> +
> >>> +/* Host 2 GuC actions */
> >>> +enum guc_host2guc_action {
> >>
> >> to be future ready:
> >>
> >> s/guc_host2guc_action/guc_action
> >> s/GUC_HOST2GUC_ACTION/GUC_ACTION
> >> or
> >> s/guc_host2guc_action/guc_msg_action
> >> s/GUC_HOST2GUC_ACTION/GUC_MSG_ACTION
> >> or
> >> s/guc_host2guc_action/guc_request_action
> >> s/GUC_HOST2GUC_ACTION/GUC_REQUEST_ACTION
> >
> >IPTCA
> >
> >>
> >>
> >>> +    GUC_HOST2GUC_ACTION_DEFAULT = 0x0,
> >>> +    GUC_HOST2GUC_ACTION_REQUEST_INIT_DONE_INTERRUPT = 0x1,
> >>> +    GUC_HOST2GUC_ACTION_REQUEST_PREEMPTION = 0x2,
> >>> +    GUC_HOST2GUC_ACTION_REQUEST_ENGINE_RESET = 0x3,
> >>> +    GUC_HOST2GUC_ACTION_PAUSE_SCHEDULING = 0x4,
> >>> +    GUC_HOST2GUC_ACTION_RESUME_SCHEDULING = 0x5,
> >>> +
> >>> +    GUC_HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
> >>> +    GUC_HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
> >>> +    GUC_HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
> >>> +    GUC_HOST2GUC_ACTION_ENABLE_LOGGING = 0x40,
> >>> +    GUC_HOST2GUC_ACTION_CACHE_CRASH_DUMP = 0x200,
> >>> +    GUC_HOST2GUC_ACTION_DEBUG_RING_DB = 0x300,
> >>> +    GUC_HOST2GUC_ACTION_PERFORM_GLOBAL_DEBUG_ACTIONS = 0x301,
> >>> +    GUC_HOST2GUC_ACTION_FORCE_LOGBUFFERFLUSH = 0x302,
> >>> +    GUC_HOST2GUC_ACTION_LOG_VERBOSITY_LOGOUTPUT_SELECT =
> >0x400,
> >>> +    GUC_HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
> >>> +    GUC_HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
> >>> +    GUC_HOST2GUC_ACTION_SET_SCHEDULING_MODE = 0x504,
> >>> +    GUC_HOST2GUC_ACTION_SCHED_POLICY_CHANGE = 0x506,
> >>> +
> >>> +    /* Actions for Powr Conservation : 0x3000-0x3FFF */
> >>> +    GUC_HOST2GUC_ACTION_PC_SLPM_REQUEST = 0x3003,
> >>> +    GUC_HOST2GUC_ACTION_PC_SETUP_GUCRC = 0x3004,
> >>> +    GUC_HOST2GUC_ACTION_SAMPLE_FORCEWAKE_FEATURE_REGISTER =
> >0x3005,
> >>> +    GUC_HOST2GUC_ACTION_SETUP_GUCRC = 0x3006,
> >>> +
> >>> +    GUC_HOST2GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
> >>> +
> >>> +    GUC_HOST2GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER =
> >0x4505,
> >>> +    GUC_HOST2GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER
> >= 0x4506,
> >>> +
> >>> +    GUC_HOST2GUC_ACTION_MAX = 0xFFFF
> >>> +};
> >>> +
> >>> +enum guc_host2guc_response_status {
> >>> +    GUC_HOST2GUC_RESPONSE_STATUS_SUCCESS = 0x0,
> >>> +    GUC_HOST2GUC_RESPONSE_STATUS_UNKNOWN_ACTION = 0x30,
> >>> +    GUC_HOST2GUC_RESPONSE_STATUS_LOG_HOST_ADDRESS_NOT_VALID =
> >0x80,
> >>> +    GUC_HOST2GUC_RESPONSE_STATUS_GENERIC_FAIL = 0xF000,
> >>
> >> s/guc_host2guc_response_status/guc_status
> >> s/GUC_HOST2GUC_RESPONSE_STATUS/GUC_STATUS
> >> or
> >> s/guc_host2guc_response_status/guc_msg_status
> >> s/GUC_HOST2GUC_RESPONSE_STATUS/GUC_MSG_STATUS
> >> or
> >> s/guc_host2guc_response_status/guc_response_status
> >> s/GUC_HOST2GUC_RESPONSE_STATUS/GUC_RESPONSE_STATUS
> >>
> >
> >IPTCA
> >
> >>> +};
> >>> +
> >>> +enum {
> >>> +    /* Host originating types */
> >>> +    GUC_MSG_TYPE_HOST2GUC_REQUEST = 0x0,
> >>> +
> >>> +    /* GuC originating types */
> >>> +    GUC_MSG_TYPE_HOST2GUC_RESPONSE = 0xF,
> >>> +} GUC_GUC_MSG_TYPE;
> >>        ^^^^^^^
> >> typo?
> >
> >Problem with the automatic conversion, I'll fix it.
> >
> >> and HOST2GUC should be dropped
> >>
> >>     GUC_MSG_TYPE_REQUEST = 0x0,
> >>     GUC_MSG_TYPE_RESPONSE = 0xF,
> >>
> >> and it should defined before ACTION/STATUS
> >>
> >>> +
> >>> +/* This structure represents the various formats of values put in
> >>> + * SOFT_SCRATCH_0. The "Type" field is to determine which register
> >>> + * definition to use, so it must be common among all unioned
> >>> + * structs.
> >>> + */
> >>> +struct guc_msg_format {
> >>> +    union {
> >>> +        struct {
> >>> +            u32 action : 16; /* enum guc_host2guc_action */
> >>> +            u32 reserved : 12; /* MBZ */
> >>> +            u32 type : 4; /* GUC_MSG_TYPE_HOST2GUC_REQUEST */
> >>> +        } host2guc_action;
> >>> +
> >>> +        struct {
> >>> +            u32 status : 16; /* enum guc_host2guc_response_status */
> >>> +            u32 return_data : 12;
> >>> +            u32 type : 4; /* GUC_MSG_TYPE_HOST2GUC_RESPONSE */
> >>> +        } host2guc_response;
> >>> +
> >>> +        u32 dword_value;
> >>> +    };
> >>> +};
> >>
> >> well, we need just single struct definition:
> >>
> >>     union guc_msg_header {
> >>         struct {
> >>             u32 code : 16;
> >>             u32 data : 12;
> >>             u32 type : 4; /* GUC_MSG_TYPE */
> >>         };
> >>         u32 value;
> >>     };
> >>
> >> as value of 'type' field indicates how to interpret 'code/data' fields:
> >>
> >> if (msg.type == GUC_MSG_TYPE_REQUEST) {
> >>     GUC_REQUEST_ACTION action = msg.code;
> >> } else if (type == GUC_MSG_TYPE_RESPONSE) {
> >>     GUC_RESPONSE_STATUS status = msg.code;
> >> }
> >>
> >
> >IPTCA
> >
> >>> +
> >>> +#define GUC_MAKE_HOST2GUC_RESPONSE(_status, _return_data)    \
> >>> +    ((GUC_MSG_TYPE_HOST2GUC_RESPONSE << 28) |        \
> >>> +    ((_return_data & 0xFFF) << 16) |            \
> >>> +    (_status & 0xFFFF))
> >>> +#define GUC_MAKE_HOST2GUC_STATUS(a)
> >(GUC_MAKE_HOST2GUC_RESPONSE(a, 0))
> >>
> >> I'm not sure we need helper macros to be part of the ABI definition
> >
> >Agreed, I will remove this.
> >
> >>
> >>> +
> >>> +enum guc_cmd_transport_buffer_type {
> >>> +    GUC_CMD_TRANSPORT_BUFFER_HOST2GUC = 0x00,
> >>> +    GUC_CMD_TRANSPORT_BUFFER_GUC2HOST = 0x01,
> >>> +    GUC_CMD_TRANSPORT_BUFFER_MAX_TYPE = 0x02,
> >>
> >> where would we need MAX_TYPE ?
> >
> >No idea, you are the cmd transport buffer expert :)
> >IPTCA
> >
> >>
> >>> +};
> >>> +
> >>> +struct guc_cmd_transport_buffer_desc {
> >>> +    u32 buffer_begin_gfx_address;
> >>> +    u64 buffer_begin_virtual_address;
> >>
> >> virtual address in ABI again ?
> >>
> >>> +    u32 buffer_size_in_bytes;
> >>> +    /* GuC uKernel updates this */
> >>> +    u32 head_offset;
> >>> +    /* GuC client updates this */
> >>> +    u32 tail_offset;
> >>> +    u32 is_in_error;
> >>> +    /* A DW provided by H2G item that was requested to be written */
> >>> +    u32 fence_report_dw;
> >>> +    /* Status associated with above fence_report_dw */
> >>> +    u32 status_report_dw;
> >>
> >> hmm, as we usually setup both H2G and G2H buffers for req/resp, why do
> >> we need
> >> this additional mechanism to let GuC write fence/status into descriptor ?
> >
> >IPTCA
> >
> >>
> >>> +    /* ID associated with this buffer (assigned by GuC master) */
> >>> +    u32 client_id;
> >>> +    /* Used & set by the client for further tracking of internal
> >>> clients */
> >>> +    u32 client_sub_tracking_id;
> >>> +    u32 reserved[5];
> >>> +};
> >>> +
> >>> +/* Per client command transport buffer allocated by GuC master */
> >>> +struct guc_master_cmd_transport_buffer_alloc {
> >>> +    /* This is the copy that GuC trusts */
> >>> +    struct guc_cmd_transport_buffer_desc buffer_desc;
> >>> +    u32 future;
> >>> +    u64 reserved0;
> >>
> >> please keep future/reserved fields together at the end of struct
> >
> >I'll pass a general comment about this
> >>
> >>> +    u32 usage_special_info;
> >>
> >> what's the format of this ?
> >
> >No idea. IPTCA
> >
> >>
> >>> +    u32 valid;
> >>
> >> 32b for single flag ?
> >
> >IPTCA
> >
> >>
> >>> +    u32 associated_g2h_index;
> >>> +    u32 reserved1;
> >>> +};
> >>> +
> >>> +/*                             Host 2 GuC Work Item
> >>> + *
> >>> V-----------------------------------------------------------------------V
> >>> + *
> >>>
> >******************************************************************
> >*******
> >>> + * *                   *    DW0/   *           * *           *
> >>> + * * H2G Item Header   *  ReturnDW *  DW1      *      ... *  DWn      *
> >>> + *
> >>>
> >******************************************************************
> >*******
> >>> + */
> >>> +
> >>> +/* Command buffer header */
> >>> +struct guc_cmd_buffer_item_header {
> >>> +    union {
> >>> +        struct {
> >>> +            /* Number of dwords that are parameters of this
> >>> +             * Host2GuC action. Max of 31. E.g.: if there are 2 DWs
> >>> +             * following this header, this field is set to 2
> >>> +             */
> >>> +            u32 num_dwords : 5;
> >>> +
> >>> +            u32 : 3;
> >>> +
> >>> +            /* The uKernel will write the value from DW0 (aka
> >>> +             * ReturnDW) to fence_report_dw in struct
> >>> +             * guc_cmd_transport_buffer_desc
> >>> +             */
> >>> +            u32 write_fence_from_dw0_to_descriptor : 1;
> >>> +
> >>> +            /* Write the status of the action to DW0 following this
> >>> +             * header
> >>> +             */
> >>> +            u32 write_status_to_dw0 : 1;
> >>> +
> >>> +            /* Send a GuC2Host with Status of the action and the
> >>> +             * fence ID found in DW0 via the buffer used for GuC to
> >>> +             * Host communication
> >>> +             */
> >>> +            u32 send_status_with_dw0_via_guc_to_host : 1;
> >>> +
> >>> +            u32 : 5;
> >>> +
> >>> +            /*  This is the value of the enum guc_host2guc_action
> >>> +             * that needs to be done by the uKernel
> >>> +             */
> >>> +            u32 host2guc_action : 16;
> >>> +        } h2g_cmd_buffer_item_header;
> >>> +
> >>> +        struct {
> >>> +            /* Number of dwords that are parameters of this GuC2Host
> >>> +             * action
> >>> +             */
> >>> +            u32 num_dwords : 5;
> >>> +
> >>> +            u32 : 3;
> >>> +
> >>> +            /* Indicates that this GuC2Host action is a response of
> >>> +             * a Host2Guc request
> >>> +             */
> >>> +            u32 host2_guc_response : 1;
> >>> +
> >>> +            u32 reserved : 7;
> >>> +
> >>> +            /* struct guc_to_host_message */
> >>> +            u32 guc2host_action : 16;
> >>> +        } g2h_cmd_buffer_item_header;
> >>> +
> >>> +        struct {
> >>> +            u32 num_dwords : 5;
> >>> +            u32 reserved : 3;
> >>> +            u32 free_for_client_use : 24;
> >>> +        } generic_cmd_buffer_item_header;
> >>> +
> >>> +        u32 header_value;
> >>> +    };
> >>
> >> hmm, there was a long discussion about unifying CTB H2G and G2H messages,
> >> and final proposal/agreement was to use:
> >>
> >>     struct guc_ctb_header {
> >>         u32 num_dwords : 8;
> >>         u32 flags : 8;
> >>         u32 action_code : 16;
> >>     };
> >> and
> >>     #define CTB_FLAGS_NONE          0
> >>     #define CTB_FLAG_FENCE_PRESENT 1
> >> and
> >>     #define GUC_ACTION_RESPONSE    GUC_ACTION_DEFAULT
> >>
> >> then
> >> to send H2G request:
> >>     .action = ACTION (!= RESPONSE)
> >>     .flags |= FENCE_PRESENT
> >> to send H2G notification:
> >>     .action = ACTION (!= RESPONSE)
> >>     .flags = FLAGS_NONE
> >> to send G2H notification:
> >>     .action = ACTION (!= RESPONSE)
> >>     .flags = FLAGS_NONE
> >> to send G2H response:
> >>     .action = ACTION_RESPONSE
> >>     .flags |= FENCE_PRESENT
> >>
> >
> >IPTCA
> >
> >>> +
> >>> +};
> >>> +
> >>> +struct guc_to_host_message {
> >>> +    union {
> >>> +        struct {
> >>> +            u32 uk_init_done : 1;
> >>> +            u32 crash_dump_posted : 1;
> >>> +            u32 : 1;
> >>> +            u32 flush_log_buffer_to_file : 1;
> >>> +            u32 preempt_request_old_preempt_pending : 1;
> >>> +            u32 preempt_request_target_context_bad : 1;
> >>> +            u32 : 1;
> >>> +            u32 sleep_entry_in_progress : 1;
> >>> +            u32 guc_in_debug_halt : 1;
> >>> +            u32 guc_report_engine_reset_context_id : 1;
> >>> +            u32 : 1;
> >>> +            u32 host_preemption_complete : 1;
> >>> +            u32 reserved0 : 4;
> >>> +            u32 gpa_to_hpa_xlation_error : 1;
> >>> +            u32 doorbell_id_allocation_error : 1;
> >>> +            u32 doorbell_id_allocation_invalid_ctx_id : 1;
> >>> +            u32 : 1;
> >>> +            u32 force_wake_timed_out : 1;
> >>> +            u32 force_wake_time_out_counter : 2;
> >>> +            u32 : 1;
> >>> +            u32 iommu_cat_page_faulted : 1;
> >>> +            u32 host2guc_engine_reset_complete : 1;
> >>> +            u32 reserved1 : 2;
> >>> +            u32 doorbell_selection_error : 1;
> >>> +            u32 doorbell_id_release_error : 1;
> >>> +            u32 uk_exception : 1;
> >>> +            u32 : 1;
> >>> +        };
> >>> +        u32 dw;
> >>> +    };
> >>
> >> this error bitmap definition is more applicable to MMIO based
> >> notification, for CTB we could introduce something more flexible
> >>
> >> btw, even for MMIO based notification we can reuse:
> >>
> >>     union guc_msg_header {
> >>         struct {
> >>             u32 code : 16;
> >>             u32 data : 12;
> >>             u32 type : 4; /* GUC_MSG_TYPE */
> >>         };
> >>         u32 value;
> >>     };
> >>
> >> in SCRATCH(15) with new:
> >>
> >>     type = GUC_MSG_TYPE_NOTIF = 0x1,
> >>     type = GUC_MSG_TYPE_ERROR = 0xE,
> >>
> >> where code/data can hold extra details, for NOTIF:
> >>     uk_init_done = 1,
> >>     crash_dump_posted,
> >>     preempt_request_old_preempt_pending,
> >>     host_preemption_complete,
> >> for ERROR:
> >>     preempt_request_target_context_bad,
> >>     doorbell_selection_error,
> >> btw, some of these errors seem to be action specific,
> >> so maybe they should be reported back in STATUS ?
> >
> >IPTCA
> >
> >>
> >>> +
> >>> +};
> >>> +
> >>> +/* Size of the buffer to save GuC's state before S3. The ddress of
> >>> the buffer
> >>> + * goes in guc_additional_data_structs
> >>> + */
> >>> +#define GUC_MAX_GUC_S3_SAVE_SPACE_PAGES    10
> >>> +
> >>> +/* MMIO Offset for status of sleep state enter request */
> >>> +#define GUC_SLEEP_STATE_ENTER_STATUS    0xC1B8
> >>
> >> hmm, we used SCRATCH(14) for H2G, good to know that it will be used
> >> for G2H
> >>
> >>> +
> >>> +/* Status of sleep request. Value updated in
> >>> GUC_SLEEP_STATE_ENTER_STATUS */
> >>> +enum guc_sleep_state_enter_status {
> >>> +    GUC_SLEEP_STATE_ENTER_SUCCESS = 1,
> >>> +    GUC_SLEEP_STATE_ENTER_PREEMPT_TO_IDLE_FAILED = 2,
> >>> +    GUC_SLEEP_STATE_ENTER_ENG_RESET_FAILED = 3,
> >>> +};
> >>> +
> >>> +
> >>> +/* Enum to determine what mode the scheduler is in */
> >>> +enum guc_scheduler_mode {
> >>> +    GUC_SCHEDULER_MODE_NORMAL,
> >>> +    GUC_SCHEDULER_MODE_STALL_IMMEDIATE,
> >>> +};
> >>> +
> >>>
> >+/****************************************************************
> >*************
> >>>
> >>> + ********************************** Logging
> >>> **********************************
> >>> +
> >>>
> >******************************************************************
> >***********/
> >>> +
> >>
> >> maybe log interface can be defined in separate file ?
> >
> >IPTCA
> >
> >>
> >>> +enum guc_log_buffer_type {
> >>> +    GUC_LOG_BUFFER_TYPE_ISR = 0x0,
> >>> +    GUC_LOG_BUFFER_TYPE_DPC = 0x1,
> >>> +    GUC_LOG_BUFFER_TYPE_CRASH = 0x2,
> >>> +    GUC_LOG_BUFFER_TYPE_MAX = 0x3,
> >>> +};
> >>> +
> >>> +enum guc_log_verbosity {
> >>> +    GUC_LOG_VERBOSITY_LOW = 0x0,
> >>> +    GUC_LOG_VERBOSITY_MED = 0x1,
> >>> +    GUC_LOG_VERBOSITY_HIGH = 0x2,
> >>> +    GUC_LOG_VERBOSITY_ULTRA = 0x3,
> >>> +    GUC_LOG_VERBOSITY_MAX = 0x4,
> >>> +};
> >>> +
> >>> +/* This enum controls the type of logging output. Can be changed
> >>> dynamically
> >>> + * using GUC_HOST2GUC_ACTION_LOG_VERBOSITY_LOGOUTPUT_SELECT
> >>> + */
> >>> +enum guc_logoutput_selection {
> >>> +    GUC_LOGOUTPUT_LOGBUFFER_ONLY = 0x0,
> >>> +    GUC_LOGOUTPUT_NPK_ONLY = 0x1,
> >>> +    GUC_LOGOUTPUT_LOGBUFFER_AND_NPK = 0x2,
> >>> +    GUC_LOGOUTPUT_MAX
> >>> +};
> >>> +
> >>> +/* Filled by KMD except version and marker that are initialized by
> >>> uKernel */
> >>> +struct guc_km_log_buffer_state {
> >>> +    /* Marks the beginning of Buffer Flush (set by uKernel at Log
> >>> Buffer
> >>> +     * Init)
> >>> +     */
> >>> +    u32 marker[2];
> >>> +
> >>> +    /* This is the last byte offset location that was read by KMD.
> >>> KMD will
> >>> +     * write to this and uKernel will read it
> >>> +     */
> >>> +    u32 log_buf_rd_ptr;
> >>> +
> >>> +    /* This is the byte offset location that will be written by
> >>> uKernel */
> >>> +    u32 log_buf_wr_ptr;
> >>> +
> >>> +    u32 log_buf_size;
> >>> +
> >>> +    /* This is written by uKernel when it sees the log buffer
> >>> becoming half
> >>> +     * full. KMD writes this value in the log file to avoid stale data
> >>> +     */
> >>> +    u32 sampled_log_buf_wrptr;
> >>> +
> >>> +    union {
> >>> +        struct {
> >>> +            /* uKernel sets this when log buffer is half full or
> >>> +             * when a forced flush has been requested through
> >>> +             * Host2Guc. uKernel will send Guc2Host only if this
> >>> +             * bit is cleared. This is to avoid unnecessary
> >>> +             * interrupts from GuC
> >>> +             */
> >>> +            u32 log_buf_flush_to_file : 1;
> >>> +
> >>> +            /* uKernel increments this when log buffer overflows */
> >>> +            u32 buffer_full_count : 4;
> >>> +
> >>> +            u32 : 27;
> >>> +        };
> >>> +        u32 log_buf_flags;
> >>> +    };
> >>> +
> >>> +    u32 version;
> >>> +};
> >>> +
> >>> +/* Logging Parameters sent via struct sched_control_data. Maintained
> >>> as separate
> >>> + * structure to allow debug tools to access logs without contacting
> >>> GuC (for
> >>> + * when GuC is stuck in ISR)
> >>> + */
> >>> +struct guc_log_init_params {
> >>> +    union {
> >>> +        struct {
> >>> +            u32 is_log_buffer_valid : 1;
> >>> +            /* Raise GuC2Host interrupt when buffer is half full */
> >>> +            u32 notify_on_log_half_full : 1;
> >>> +            u32 : 1;
> >>> +            /* 0 = Pages, 1 = Megabytes */
> >>> +            u32 allocated_count_units : 1;
> >>> +            /* No. of units allocated -1 (MAX 4 Units) */
> >>> +            u32 crash_dump_log_allocated_count : 2;
> >>> +            /* No. of units allocated -1 (MAX 8 Units) */
> >>> +            u32 dpc_log_allocated_count : 3;
> >>> +            /* No. of units allocated -1 (MAX 8 Units) */
> >>> +            u32 isr_log_allocated_count : 3;
> >>> +            /* Page aligned address for log buffer */
> >>> +            u32 log_buffer_gfx_address : 20;
> >>> +        };
> >>> +        u32 log_dword_value;
> >>> +    };
> >>> +};
> >>> +
> >>> +/* Pass info for doing a Host2GuC request
> >>> (GUC_HOST2GUC_ACTION_ENABLE_LOGGING)
> >>> + * in order to enable/disable GuC logging
> >>> + */
> >>> +struct guc_log_enable_params {
> >>> +    union {
> >>> +        struct {
> >>> +            u32 logging_enabled : 1;
> >>> +            u32 profile_logging_enabled : 1;
> >>> +            u32 log_output_selection : 2;
> >>> +            u32 log_verbosity : 4;
> >>> +            u32 default_logging_enabled : 1;
> >>> +            u32 : 23;
> >>> +        };
> >>> +        u32 log_enable_dword_value;
> >>> +    };
> >>> +
> >>> +};
> >>> +
> >>>
> >+/****************************************************************
> >*************
> >>>
> >>> + ************** Sched Control Data and Addtional Data Structures
> >>> *************
> >>> +
> >>>
> >******************************************************************
> >***********/
> >>> +
> >>> +/* Holds the init values of various parameters used by the uKernel */
> >>> +struct guc_sched_control_data {
> >>
> >> is this good name? I can see log related params below...
> >
> >Agreed, the name choice is terrible. IPTCA.
> >
> >>
> >>> +    /* Dword 0 */
> >>> +    union {
> >>> +        struct {
> >>> +            /* Num of contexts in pool in blocks of 16,
> >>> +             * E.g.: num_contexts_in_pool16_blocks = 1 if 16
> >>> +             * contexts, 64 if 1024 contexts allocated
> >>> +             */
> >>> +            u32 num_contexts_in_pool16_blocks : 12;
> >>> +
> >>> +            /* Aligned bits [31:12] of the GFX address where the
> >>> +             * pool begins
> >>> +             */
> >>> +            u32 context_pool_gfx_address_begin : 20;
> >>> +        };
> >>> +    };
> >>> +
> >>> +    /* Dword 1 */
> >>> +    struct guc_log_init_params log_init_params;
> >>
> >> can we keep related data together ? dw4 also has log params
> >
> >I already made this same comment in the past. It simply hasn't been
> >incorporated yet.
> >>
> >>> +
> >>> +
> >>> +    /* Dword 2 */
> >>> +    union {
> >>> +        struct {
> >>> +            u32 reserved : 1;
> >>> +            u32 wa_disable_dummy_all_engine_fault_fix : 1;
> >>> +            u32 : 30;
> >>> +        };
> >>> +        u32 workaround_dw;
> >>> +    };
> >>> +
> >>> +    /* Dword 3 */
> >>> +    union {
> >>> +        struct {
> >>> +            u32 ftr_enable_preemption_data_logging : 1;
> >>> +            u32 ftr_enable_guc_pavp_control : 1;
> >>> +            u32 ftr_enable_guc_slpm : 1;
> >>> +            u32 ftr_enable_engine_reset_on_preempt_failure : 1;
> >>> +            u32 ftr_lite_restore : 1;
> >>> +            u32 ftr_driver_flr : 1;
> >>> +            u32 future : 1;
> >>> +            u32 ftr_enable_psmi_logging : 1;
> >>> +            u32 : 1;
> >>> +            u32 : 1;
> >>> +            u32 : 1;
> >>> +            u32 : 1;
> >>> +            u32 : 1;
> >>> +            u32 : 1;
> >>> +            u32 : 18;
> >>> +        };
> >>> +        u32 feature_dword;
> >>> +    };
> >>> +
> >>> +    /* Dword 4 */
> >>> +    union {
> >>> +        struct {
> >>> +            /* One of enum guc_log_verbosity */
> >>> +            u32 logging_verbosity : 4;
> >>> +            /* One of enum guc_logoutput_selection */
> >>> +            u32 log_output_selection : 2;
> >>> +            u32 logging_disabled : 1;
> >>> +            u32 profile_logging_enabled : 1;
> >>> +            u32 : 24;
> >>> +        };
> >>> +    };
> >>> +
> >>> +    /* Dword 5 */
> >>> +    union {
> >>> +        struct {
> >>> +            u32 : 1;
> >>> +            u32 gfx_address_additional_data_structs : 21;
> >>> +            u32 : 10;
> >>> +        };
> >>> +    };
> >>> +
> >>> +};
> >>> +
> >>> +/* Structure to pass additional information and structure pointers
> >>> to */
> >>> +struct guc_additional_data_structs {
> >>> +    /* Gfx ptr to struct guc_mmio_save_restore_list (persistent) */
> >>> +    u32 gfx_address_mmio_save_restore_list;
> >>> +
> >>> +    /* Buffer of size GUC_MAX_GUC_S3_SAVE_SPACE_PAGES (persistent) */
> >>> +    u32 gfx_ptr_to_gucs_state_save_buffer;
> >>> +
> >>> +    /* Gfx addresses of struct guc_scheduling_policies
> >>> (non-persistent, may
> >>> +     * be released after initial load), NULL or valid = 0 flag value
> >>> will
> >>> +     * cause default policies to be loaded
> >>> +     */
> >>> +    u32 gfx_scheduler_policies;
> >>> +
> >>> +    /* Gfx address of struct guc_gt_system_info */
> >>> +    u32 gt_system_info;
> >>> +
> >>> +    u32 future;
> >>> +
> >>> +    u32 gfx_ptr_to_psmi_log_control_data;
> >>> +
> >>> +    /* LRCA addresses and sizes of golden contexts (persistent) */
> >>> +    u32 gfx_golden_context_lrca[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
> >>> +    u32
> >>>
> >golden_context_eng_state_size_in_bytes[GUC_MAX_SCHEDULABLE_ENGINE_CL
> >ASS];
> >>
> >> maybe this is could be more future friendly if we define it as the last
> >> member:
> >>
> >>     struct {
> >>         u32 lrca_addrs;
> >>         u32 eng_state_size_in_bytes;
> >>     } gfx_golden_context[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
> >
> >IPTCA
> >
> >>
> >>> +
> >>> +    u32 reserved[16];
> >>> +};
> >>> +
> >>> +/* Max number of mmio per engine class per engine instance */
> >>> +#define GUC_MAX_MMIO_PER_SET    64
> >>> +
> >>> +struct guc_mmio_flags {
> >>> +    union {
> >>> +        struct {
> >>> +            u32 masked : 1;
> >>> +            u32 : 31;
> >>> +        };
> >>> +        u32 flags_value;
> >>> +    };
> >>> +};
> >>> +
> >>> +struct guc_mmio {
> >>> +    u32 offset;
> >>> +    u32 value;
> >>> +    struct guc_mmio_flags flags;
> >>> +};
> >>> +
> >>> +struct guc_mmio_set {
> >>> +    /* Array of mmio to be saved/restored */
> >>> +    struct guc_mmio mmio[GUC_MAX_MMIO_PER_SET];
> >>> +    /* Set after saving mmio value, cleared after restore. */
> >>> +    u32 mmio_values_valid;
> >>> +     /* Number of mmio in the set */
> >>> +    u32 number_of_mmio;
> >>> +};
> >>> +
> >>> +struct guc_mmio_save_restore_list {
> >>> +    struct guc_mmio_set
> >>> +        node_mmio_set[GUC_MAX_SCHEDULABLE_ENGINE_CLASS]
> >>> +                 [GUC_MAX_ENGINE_INSTANCE_PER_CLASS];
> >>> +    u32 reserved[98];
> >>> +};
> >>> +
> >>> +/* Policy flags to control scheduling decisions */
> >>> +struct guc_scheduling_policy_flags {
> >>> +    union {
> >>> +        struct {
> >>> +            /* Should we reset engine when preemption failed within
> >>> +             * its time quantum
> >>> +             */
> >>> +            u32 reset_engine_upon_preempt_failure : 1;
> >>> +
> >>> +            /* Should we preempt to idle unconditionally for the
> >>> +             * execution quantum expiry
> >>> +             */
> >>> +            u32 preempt_to_idle_on_quantum_expiry : 1;
> >>> +
> >>> +            u32 : 30;
> >>> +        };
> >>> +        u32 policy_dword;
> >>> +    };
> >>> +};
> >>> +
> >>> +/* Per-engine class and per-priority struct for scheduling policy  */
> >>> +struct guc_scheduling_policy {
> >>> +    /* Time for one workload to execute (micro seconds) */
> >>> +    u32 execution_quantum;
> >>> +
> >>> +    /* Time to wait for a preemption request to completed before
> >>> issuing a
> >>> +     * reset (micro seconds)
> >>> +     */
> >>> +    u32 wait_for_preemption_completion_time;
> >>> +
> >>> +    /* How much time to allow to run after the first fault is observed.
> >>> +     * Then preempt afterwards (micro seconds)
> >>> +     */
> >>> +    u32 quantum_upon_first_fault_time;
> >>> +
> >>> +    struct guc_scheduling_policy_flags policy_flags;
> >>> +
> >>> +    u32 reserved[8];
> >>> +};
> >>> +
> >>> +/* KMD should populate this struct and pass info through struct
> >>> + * guc_additional_data_structs- If KMD does not set the scheduler
> >>> policy,
> >>> + * uKernel will fall back to default scheduling policies
> >>> + */
> >>> +struct guc_scheduling_policies {
> >>> +    struct guc_scheduling_policy
> >>> +
> >per_submit_queue_policy[GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT]
> >>> +                       [GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
> >>> +
> >>> +    /* Submission queue depth, min 2, max 8. If outside the valid
> >>> range,
> >>> +     * default value is used
> >>> +     */
> >>> +    u32 submission_queue_depth[GUC_MAX_SCHEDULABLE_ENGINE_CLASS];
> >>
> >> as it looks that most of the arrays are indexed by
> >> GUC_MAX_SCHEDULABLE_ENGINE_CLASS then maybe all data structures
> >> should be designed around engine like:
> >>
> >>     struct guc_engine {
> >>         u8 class;
> >>         u8 instance;
> >>
> >>         u32 submission_queue_depth;
> >>         struct guc_scheduling_policy
> >> policy[GUC_CONTEXT_PRIORITY_ABSOLUTE_MAX_COUNT];
> >>     };
> >
> >IPTCA
> >
> >>> +
> >>> +    /* How much time to allow before DPC processing is called back via
> >>> +     * interrupt (to prevent DPC queue drain starving) IN micro
> >>> seconds.
> >>> +     * Typically in the 1000s (example only, not granularity)
> >>> +     */
> >>> +    u32 dpc_promote_time;
> >>> +
> >>> +    /* Must be set to take these new values */
> >>> +    u32 is_valid;
> >>> +
> >>> +    /* Number of WIs to process per call to process single. Process
> >>> single
> >>> +     * could have a large Max Tail value which may keep CS idle.
> >>> Process
> >>> +     * max_num_work_items_per_dpc_call WIs and try fast schedule
> >>> +     */
> >>> +    u32 max_num_work_items_per_dpc_call;
> >>> +
> >>> +    u32 reserved[4];
> >>> +};
> >>> +
> >>> +#pragma pack()
> >>> +
> >>> +#endif
> >>
> >>
> >> Thanks,
> >> Michal
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx