[PATCH v3 1/6] drm/xe: Add LRC ctx timestamp support functions

Lucas De Marchi lucas.demarchi at intel.com
Mon Jun 10 17:42:46 UTC 2024


On Mon, Jun 10, 2024 at 02:10:38PM GMT, Matthew Brost wrote:
>On Mon, Jun 10, 2024 at 08:49:57AM -0500, Lucas De Marchi wrote:
>> On Fri, Jun 07, 2024 at 05:20:58PM GMT, Matthew Brost wrote:
>> > LRC ctx timestamp support functions are used to determine how long a job
>> > has run on the hardware.
>> >
>> > v2:
>> > - Don't use static inlines (Jani)
>> > - Kernel doc
>> > - s/ctx_timestamp_job/ctx_job_timestamp
>> >
>> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>> > ---
>> > drivers/gpu/drm/xe/xe_lrc.c | 66 +++++++++++++++++++++++++++++++++++++
>> > drivers/gpu/drm/xe/xe_lrc.h |  5 +++
>> > 2 files changed, 71 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>> > index c1bb85d2e243..0fef354c6489 100644
>> > --- a/drivers/gpu/drm/xe/xe_lrc.c
>> > +++ b/drivers/gpu/drm/xe/xe_lrc.c
>> > @@ -652,6 +652,7 @@ u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc)
>> >
>> > #define LRC_SEQNO_PPHWSP_OFFSET 512
>> > #define LRC_START_SEQNO_PPHWSP_OFFSET (LRC_SEQNO_PPHWSP_OFFSET + 8)
>> > +#define LRC_CTX_JOB_TIMESTAMP_OFFSET (LRC_START_SEQNO_PPHWSP_OFFSET + 8)
>> > #define LRC_PARALLEL_PPHWSP_OFFSET 2048
>> > #define LRC_PPHWSP_SIZE SZ_4K
>> >
>> > @@ -680,6 +681,12 @@ static inline u32 __xe_lrc_start_seqno_offset(struct xe_lrc *lrc)
>> > 	return xe_lrc_pphwsp_offset(lrc) + LRC_START_SEQNO_PPHWSP_OFFSET;
>> > }
>> >
>> > +static u32 __xe_lrc_ctx_job_timestamp_offset(struct xe_lrc *lrc)
>> > +{
>> > +	/* The ctx job timestamp is stored in the driver-defined portion of PPHWSP */
>> > +	return xe_lrc_pphwsp_offset(lrc) + LRC_CTX_JOB_TIMESTAMP_OFFSET;
>> > +}
>> > +
>> > static inline u32 __xe_lrc_parallel_offset(struct xe_lrc *lrc)
>> > {
>> > 	/* The parallel is stored in the driver-defined portion of PPHWSP */
>> > @@ -691,6 +698,11 @@ static inline u32 __xe_lrc_regs_offset(struct xe_lrc *lrc)
>> > 	return xe_lrc_pphwsp_offset(lrc) + LRC_PPHWSP_SIZE;
>> > }
>> >
>> > +static u32 __xe_lrc_ctx_timestamp_offset(struct xe_lrc *lrc)
>> > +{
>> > +	return __xe_lrc_regs_offset(lrc) + CTX_TIMESTAMP * sizeof(u32);
>> > +}
>> > +
>> > static inline u32 __xe_lrc_indirect_ring_offset(struct xe_lrc *lrc)
>> > {
>> > 	/* Indirect ring state page is at the very end of LRC */
>> > @@ -716,11 +728,65 @@ DECL_MAP_ADDR_HELPERS(pphwsp)
>> > DECL_MAP_ADDR_HELPERS(seqno)
>> > DECL_MAP_ADDR_HELPERS(regs)
>> > DECL_MAP_ADDR_HELPERS(start_seqno)
>> > +DECL_MAP_ADDR_HELPERS(ctx_job_timestamp)
>> > +DECL_MAP_ADDR_HELPERS(ctx_timestamp)
>> > DECL_MAP_ADDR_HELPERS(parallel)
>> > DECL_MAP_ADDR_HELPERS(indirect_ring)
>> >
>> > #undef DECL_MAP_ADDR_HELPERS
>> >
>> > +/**
>> > + * xe_lrc_ctx_timestamp_ggtt_addr() - Get ctx timestamp GGTT address
>> > + * @lrc: Pointer to the lrc.
>> > + *
>> > + * Returns: ctx timestamp GGTT address
>> > + */
>> > +u32 xe_lrc_ctx_timestamp_ggtt_addr(struct xe_lrc *lrc)
>> > +{
>> > +	return __xe_lrc_ctx_timestamp_ggtt_addr(lrc);
>> > +}
>> > +
>> > +/**
>> > + * xe_lrc_ctx_timestamp() - Read ctx timestamp value
>> > + * @lrc: Pointer to the lrc.
>> > + *
>> > + * Returns: ctx timestamp value
>> > + */
>> > +u32 xe_lrc_ctx_timestamp(struct xe_lrc *lrc)
>> > +{
>> > +	struct xe_device *xe = lrc_to_xe(lrc);
>> > +	struct iosys_map map;
>> > +
>> > +	map = __xe_lrc_ctx_timestamp_map(lrc);
>> > +	return xe_map_read32(xe, &map);
>> > +}
>> > +
>> > +/**
>> > + * xe_lrc_ctx_job_timestamp_ggtt_addr() - Get ctx job timestamp GGTT address
>> > + * @lrc: Pointer to the lrc.
>> > + *
>> > + * Returns: ctx timestamp job GGTT address
>> > + */
>> > +u32 xe_lrc_ctx_job_timestamp_ggtt_addr(struct xe_lrc *lrc)
>> > +{
>> > +	return __xe_lrc_ctx_job_timestamp_ggtt_addr(lrc);
>> > +}
>> > +
>> > +/**
>> > + * xe_lrc_ctx_job_timestamp() - Read ctx job timestamp value
>> > + * @lrc: Pointer to the lrc.
>> > + *
>> > + * Returns: ctx timestamp job value
>> > + */
>> > +u32 xe_lrc_ctx_job_timestamp(struct xe_lrc *lrc)
>> > +{
>> > +	struct xe_device *xe = lrc_to_xe(lrc);
>> > +	struct iosys_map map;
>> > +
>> > +	map = __xe_lrc_ctx_job_timestamp_map(lrc);
>> > +	return xe_map_read32(xe, &map);
>> > +}
>> > +
>> > u32 xe_lrc_ggtt_addr(struct xe_lrc *lrc)
>> > {
>> > 	return __xe_lrc_pphwsp_ggtt_addr(lrc);
>> > diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>> > index 882c3437ba5c..001af6c79454 100644
>> > --- a/drivers/gpu/drm/xe/xe_lrc.h
>> > +++ b/drivers/gpu/drm/xe/xe_lrc.h
>> > @@ -94,6 +94,11 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot *snapshot);
>> > void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer *p);
>> > void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot);
>> >
>> > +u32 xe_lrc_ctx_timestamp_ggtt_addr(struct xe_lrc *lrc);
>> > +u32 xe_lrc_ctx_timestamp(struct xe_lrc *lrc);
>>
>
>Bad timing, just sent v4 at the same time.
>
>> I think we have some clash here. See the function below where we
>> read the timestamp and cache it in the LRC. That one does something
>> slightly different, but apparently achieves the same thing. Why are
>> these functions not using a similar approach with xe_lrc_read_ctx_reg()
>> rather than defining new helper functions? And why would we return the
>> address and then have another function to read the value?
>
>xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP); is the same as: xe_lrc_ctx_timestamp(lrc);
>
>I used 'DECL_MAP_ADDR_HELPERS' to implement the 4 exported functions
>mainly as I need the GGTT address and that macro spits out a helper
>with the GGTT address.

I was still missing the reason to export the address, but after applying
it locally and grepping I see: emit_copy_timestamp(), so it's accessed
from the GPU side.
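
For anyone following along, here is a rough userspace model of why both an
address helper and a read helper fall out of the same offset. It is only a
sketch: struct mock_lrc, the mock_* functions and the value used for
MOCK_CTX_TIMESTAMP are assumptions made for illustration, not the driver's
definitions. The PPHWSP offsets are the ones from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets copied from the quoted patch. */
#define LRC_SEQNO_PPHWSP_OFFSET       512
#define LRC_START_SEQNO_PPHWSP_OFFSET (LRC_SEQNO_PPHWSP_OFFSET + 8)
#define LRC_CTX_JOB_TIMESTAMP_OFFSET  (LRC_START_SEQNO_PPHWSP_OFFSET + 8)
#define LRC_PPHWSP_SIZE               4096

/* Assumption: dword index of the timestamp in the context register state. */
#define MOCK_CTX_TIMESTAMP 0x22

/* Hypothetical stand-in for struct xe_lrc. */
struct mock_lrc {
	uint8_t *vaddr;         /* CPU mapping of the LRC buffer */
	uint32_t ggtt_base;     /* GGTT address of the start of the buffer */
	uint32_t pphwsp_offset; /* where the PPHWSP starts inside the buffer */
	size_t size;            /* total buffer size in bytes */
};

static uint32_t mock_ctx_job_timestamp_offset(const struct mock_lrc *lrc)
{
	return lrc->pphwsp_offset + LRC_CTX_JOB_TIMESTAMP_OFFSET;
}

static uint32_t mock_regs_offset(const struct mock_lrc *lrc)
{
	/* The register state follows the PPHWSP page. */
	return lrc->pphwsp_offset + LRC_PPHWSP_SIZE;
}

static uint32_t mock_ctx_timestamp_offset(const struct mock_lrc *lrc)
{
	return mock_regs_offset(lrc) + MOCK_CTX_TIMESTAMP * (uint32_t)sizeof(uint32_t);
}

/* CPU-side read of one dword at a byte offset into the buffer. */
static uint32_t mock_read32(const struct mock_lrc *lrc, uint32_t offset)
{
	uint32_t val;

	assert((size_t)offset + sizeof(val) <= lrc->size);
	memcpy(&val, lrc->vaddr + offset, sizeof(val));
	return val;
}

/* Generic register read, analogous in spirit to xe_lrc_read_ctx_reg(). */
static uint32_t mock_read_ctx_reg(const struct mock_lrc *lrc, uint32_t reg)
{
	return mock_read32(lrc, mock_regs_offset(lrc) + reg * (uint32_t)sizeof(uint32_t));
}

/* Dedicated read helper: same location, reached through the named offset. */
static uint32_t mock_ctx_timestamp(const struct mock_lrc *lrc)
{
	return mock_read32(lrc, mock_ctx_timestamp_offset(lrc));
}

/* Address a GPU command would use to reach the same dword. */
static uint32_t mock_ctx_timestamp_ggtt_addr(const struct mock_lrc *lrc)
{
	return lrc->ggtt_base + mock_ctx_timestamp_offset(lrc);
}
```

The CPU path and the GPU path share one offset computation. The CPU reads
through the mapping, while a GPU-side copy needs the GGTT address, which is
why a read helper alone is not enough.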

>
>fwiw, also xe_lrc_read_ctx_reg is implemented with one of these
>functions from DECL_MAP_ADDR_HELPERS - __xe_lrc_regs_map.
>
>I'd vote, leave patch mainly as is but replace existing
>xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP); with
>xe_lrc_ctx_timestamp(lrc);.

sounds good. While at it, please add a line in xe_lrc_update_timestamp()
doc that it's only intended to be called by places maintaining the
per-client run_ticks. When I merged that patch I wasn't thinking of
having other places reading the CTX_TIMESTAMP. Now that you exposed
other functions, I don't want to have people incorrectly calling
xe_lrc_update_timestamp() when they should be calling the new functions
you are exporting.
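
To illustrate the concern, here is a small userspace model of the difference
between a plain read and a read that also updates the cached value. It is a
sketch under assumptions: struct mock_lrc_ts and the mock_* functions are
hypothetical, and the update function only mimics the "read and cache in the
LRC" behaviour described earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant LRC state. */
struct mock_lrc_ts {
	uint32_t hw_ctx_timestamp; /* models the CTX_TIMESTAMP dword in LRC memory */
	uint32_t ctx_timestamp;    /* cached copy used for run_ticks accounting */
};

/* Plain read with no side effects, like the newly exported read helper. */
static uint32_t mock_peek_ctx_timestamp(const struct mock_lrc_ts *lrc)
{
	return lrc->hw_ctx_timestamp;
}

/* Read-and-cache: returns the new value and hands back the previous cache. */
static uint32_t mock_update_timestamp(struct mock_lrc_ts *lrc, uint32_t *old_ts)
{
	assert(old_ts);
	*old_ts = lrc->ctx_timestamp;
	lrc->ctx_timestamp = mock_peek_ctx_timestamp(lrc);
	return lrc->ctx_timestamp;
}

/* Accounting path: adds the ticks elapsed since the last cached sample. */
static uint64_t mock_account_run_ticks(struct mock_lrc_ts *lrc, uint64_t run_ticks)
{
	uint32_t old_ts, new_ts;

	new_ts = mock_update_timestamp(lrc, &old_ts);
	/* Unsigned 32-bit subtraction keeps the delta correct across wraparound. */
	return run_ticks + (uint32_t)(new_ts - old_ts);
}
```

In this model, any caller other than the accounting path that invokes the
update function moves the cached sample forward, so the ticks between the two
samples never reach run_ticks. A caller that only wants the current value
should use the side-effect-free read.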

>
>If you want to cleanup the file itself and drop these macros, that seems
>to be an orthogonal issue to my series.

not now, at least not from my side... I don't see a pressing need for
that removal.

thanks
Lucas De Marchi

