[PATCH v8 5/7] drm/xe/eustall: Add EU stall sampling support for Xe2
Olson, Matthew
matthew.olson at intel.com
Wed Feb 5 19:03:00 UTC 2025
On Tue, Feb 04, 2025 at 05:57:17PM -0800, Dixit, Ashutosh wrote:
> On Tue, 04 Feb 2025 17:16:00 -0800, Olson, Matthew wrote:
> >
>
> Hi Matt,
>
> > On Wed, Jan 29, 2025 at 08:55:42PM -0800, Dixit, Ashutosh wrote:
> > > On Wed, 15 Jan 2025 12:02:11 -0800, Harish Chegondi wrote:
> > > > diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > index 437782f8433c..d72f80a9dfe4 100644
> > > > --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > @@ -73,6 +73,42 @@ struct xe_eu_stall_data_pvc {
> > > > __u64 unused[6];
> > > > } __packed;
> > > >
> > > > +/**
> > > > + * struct xe_eu_stall_data_xe2 - EU stall data format for LNL, BMG
> > > > + *
> > > > + * Bits Field
> > > > + * 0 to 28 IP (addr)
> > > > + * 29 to 36 Tdr count
> > > > + * 37 to 44 other count
> > > > + * 45 to 52 control count
> > > > + * 53 to 60 pipestall count
> > > > + * 61 to 68 send count
> > > > + * 69 to 76 dist_acc count
> > > > + * 77 to 84 sbid count
> > > > + * 85 to 92 sync count
> > > > + * 93 to 100 inst_fetch count
> > > > + * 101 to 108 Active count
> > > > + * 109 to 111 Exid
> > > > + * 112 EndFlag (is always 1)
> > > > + */
> > > > +struct xe_eu_stall_data_xe2 {
> > > > + __u64 ip_addr:29;
> > > > + __u64 tdr_count:8;
> > > > + __u64 other_count:8;
> > > > + __u64 control_count:8;
> > > > + __u64 pipestall_count:8;
> > > > + __u64 send_count:8;
> > > > + __u64 dist_acc_count:8;
> > > > + __u64 sbid_count:8;
> > > > + __u64 sync_count:8;
> > > > + __u64 inst_fetch_count:8;
> > > > + __u64 active_count:8;
> > > > + __u64 ex_id:3;
> > > > + __u64 end_flag:1;
> > > > + __u64 unused_bits:15;
> > > > + __u64 unused[6];
> > > > +} __packed;
> > >
> > > Same question about whether or not to retain this struct. Retain it if we
> > > want to document this information otherwise drop it and just keep sizeof.
> >
> > I'd prefer to keep them, as I've personally found it convenient to refer
> > to them while writing the userspace reader of these samples. I'm not
> > aware of any other particular place that they can be found, other than
> > maybe some other public repo that uses the i915 version of this interface
> > (IGT, maybe?). I'd venture to guess that others trying to call this
> > code are also going to be searching for these definitions in
> > `drivers/gpu/drm/xe` as well.
>
> Yes, they are present in IGT too:
>
> https://patchwork.freedesktop.org/patch/630656/?series=143030&rev=1
>
> Would that work for you, or do you prefer them in the kernel? Just trying to
> get an idea right now, not deciding one way or another.
I think it'd be more convenient to have them in the Xe driver itself, since
most userspace developers will already be looking there. Otherwise they'd have
to search around to find those definitions in IGT. I understand that removing
them cleans up the code a bit (since we only use their size), but keeping them
makes this code more self-documenting.
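
To illustrate why the layout is handy for a userspace reader, here is a rough
sketch of decoding one Xe2 record from the bit positions in the kernel-doc
comment above. The names xe2_sample, extract_bits() and decode_xe2_record()
are made up for this example and are not part of the driver or any uAPI. It
also assumes each record is a 64-byte blob whose bits are numbered LSB-first
from byte 0, which is what the packed struct works out to on a little-endian
host.

```c
#include <stdint.h>

/* Assumed record size: 128 bits of fields plus __u64 unused[6]. */
#define XE2_RECORD_SIZE 64

/* Hypothetical userspace view of one decoded Xe2 EU stall record. */
struct xe2_sample {
	uint32_t ip_addr;
	uint8_t tdr, other, control, pipestall, send;
	uint8_t dist_acc, sbid, sync, inst_fetch, active;
	uint8_t ex_id;
	uint8_t end_flag;
};

/*
 * Extract 'len' bits (len <= 32) starting at bit 'start', with bit 0
 * being the least significant bit of buf[0]. Reads 5 bytes so that any
 * 32-bit field at any bit offset is covered; the caller must make sure
 * those bytes lie inside the buffer.
 */
static uint32_t extract_bits(const uint8_t *buf, unsigned int start,
			     unsigned int len)
{
	uint64_t v = 0;
	unsigned int byte = start / 8, shift = start % 8, i;

	for (i = 0; i < 5; i++)
		v |= (uint64_t)buf[byte + i] << (8 * i);

	return (uint32_t)((v >> shift) & ((1ULL << len) - 1));
}

/* Decode one record using the offsets documented for LNL/BMG. */
static void decode_xe2_record(const uint8_t *rec, struct xe2_sample *s)
{
	s->ip_addr    = extract_bits(rec, 0, 29);
	s->tdr        = (uint8_t)extract_bits(rec, 29, 8);
	s->other      = (uint8_t)extract_bits(rec, 37, 8);
	s->control    = (uint8_t)extract_bits(rec, 45, 8);
	s->pipestall  = (uint8_t)extract_bits(rec, 53, 8);
	s->send       = (uint8_t)extract_bits(rec, 61, 8);
	s->dist_acc   = (uint8_t)extract_bits(rec, 69, 8);
	s->sbid       = (uint8_t)extract_bits(rec, 77, 8);
	s->sync       = (uint8_t)extract_bits(rec, 85, 8);
	s->inst_fetch = (uint8_t)extract_bits(rec, 93, 8);
	s->active     = (uint8_t)extract_bits(rec, 101, 8);
	s->ex_id      = (uint8_t)extract_bits(rec, 109, 3);
	s->end_flag   = (uint8_t)extract_bits(rec, 112, 1);
}
```

Without the struct (or at least its comment) in the driver, every one of
those offsets would have to be dug out of some other tree.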
>
> Thanks.
> --
> Ashutosh
>
>
>
> >
> > >
> > > > +
> > > > static u64 per_xecore_buf_size = SZ_512K;
> > > >
> > > > static unsigned long
> > > > @@ -83,6 +119,8 @@ xe_eu_stall_data_record_size(struct xe_device *xe)
> > > >
> > > > if (platform == XE_PVC)
> > > > record_size = sizeof(struct xe_eu_stall_data_pvc);
> > > > + else if ((platform == XE_LUNARLAKE) || (platform == XE_BATTLEMAGE))
> > >
> > > 'else if (GRAPHICS_VER(xe) >= 20)' so that we don't have to keep adding
> > > each individual platform.
> > >
> > > > + record_size = sizeof(struct xe_eu_stall_data_xe2);
> > > >
> > > > return record_size;
> > > > }
> > > > @@ -311,10 +349,16 @@ eu_stall_data_buf_check(struct xe_eu_stall_data_stream *stream)
> > > > static void
> > > > clear_dropped_eviction_line_bit(struct xe_gt *gt, u16 group, u16 instance)
> > > > {
> > > > + struct xe_device *xe = gt_to_xe(gt);
> > > > u32 write_ptr_reg;
> > > >
> > > > - /* On PVC, the overflow bit has to be cleared by writing 1 to it. */
> > > > - write_ptr_reg = _MASKED_BIT_ENABLE(XEHPC_EUSTALL_REPORT_OVERFLOW_DROP);
> > > > + /* On PVC, the overflow bit has to be cleared by writing 1 to it.
> > > > + * On other GPUs, the bit has to be cleared by writing 0 to it.
> > > > + */
> > > > + if (GRAPHICS_VER(xe) >= 20)
> > > > + write_ptr_reg = _MASKED_BIT_DISABLE(XEHPC_EUSTALL_REPORT_OVERFLOW_DROP);
> > > > + else
> > > > + write_ptr_reg = _MASKED_BIT_ENABLE(XEHPC_EUSTALL_REPORT_OVERFLOW_DROP);
> > > >
> > > > xe_gt_mcr_unicast_write(gt, XEHPC_EUSTALL_REPORT, write_ptr_reg, group, instance);
> > > > }
> > > > @@ -882,7 +926,9 @@ static const struct file_operations fops_eu_stall = {
> > > >
> > > > static inline bool has_eu_stall_sampling_support(struct xe_device *xe)
> > > > {
> > > > - return ((xe->info.platform == XE_PVC) ? true : false);
> > > > + return ((xe->info.platform == XE_PVC ||
> > > > + xe->info.platform == XE_LUNARLAKE ||
> > > > + xe->info.platform == XE_BATTLEMAGE) ? true : false);
> > >
> > > Same here, use (GRAPHICS_VER(xe) >= 20).
> > >
> > > > }
> > > >
> > > > /**
> > > > --
> > > > 2.47.1
> > > >