[PATCH v8 5/7] drm/xe/eustall: Add EU stall sampling support for Xe2
Dixit, Ashutosh
ashutosh.dixit at intel.com
Wed Feb 5 20:02:33 UTC 2025
On Wed, 05 Feb 2025 11:03:00 -0800, Olson, Matthew wrote:
>
> On Tue, Feb 04, 2025 at 05:57:17PM -0800, Dixit, Ashutosh wrote:
> > On Tue, 04 Feb 2025 17:16:00 -0800, Olson, Matthew wrote:
> > >
> >
> > Hi Matt,
> >
> > > On Wed, Jan 29, 2025 at 08:55:42PM -0800, Dixit, Ashutosh wrote:
> > > > On Wed, 15 Jan 2025 12:02:11 -0800, Harish Chegondi wrote:
> > > > > diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > > index 437782f8433c..d72f80a9dfe4 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> > > > > @@ -73,6 +73,42 @@ struct xe_eu_stall_data_pvc {
> > > > > __u64 unused[6];
> > > > > } __packed;
> > > > >
> > > > > +/**
> > > > > + * struct xe_eu_stall_data_xe2 - EU stall data format for LNL, BMG
> > > > > + *
> > > > > + * Bits Field
> > > > > + * 0 to 28 IP (addr)
> > > > > + * 29 to 36 Tdr count
> > > > > + * 37 to 44 other count
> > > > > + * 45 to 52 control count
> > > > > + * 53 to 60 pipestall count
> > > > > + * 61 to 68 send count
> > > > > + * 69 to 76 dist_acc count
> > > > > + * 77 to 84 sbid count
> > > > > + * 85 to 92 sync count
> > > > > + * 93 to 100 inst_fetch count
> > > > > + * 101 to 108 Active count
> > > > > + * 109 to 111 Exid
> > > > > + * 112 EndFlag (is always 1)
> > > > > + */
> > > > > +struct xe_eu_stall_data_xe2 {
> > > > > + __u64 ip_addr:29;
> > > > > + __u64 tdr_count:8;
> > > > > + __u64 other_count:8;
> > > > > + __u64 control_count:8;
> > > > > + __u64 pipestall_count:8;
> > > > > + __u64 send_count:8;
> > > > > + __u64 dist_acc_count:8;
> > > > > + __u64 sbid_count:8;
> > > > > + __u64 sync_count:8;
> > > > > + __u64 inst_fetch_count:8;
> > > > > + __u64 active_count:8;
> > > > > + __u64 ex_id:3;
> > > > > + __u64 end_flag:1;
> > > > > + __u64 unused_bits:15;
> > > > > + __u64 unused[6];
> > > > > +} __packed;
> > > >
> > > > Same question about whether or not to retain this struct. Retain it if we
> > > > want to document this information otherwise drop it and just keep sizeof.
> > >
> > > I'd prefer to keep them, as I've personally found it convenient to refer
> > > to them while writing the userspace reader of these samples. I'm not
> > > aware of any other particular place that they can be found, other than
> > > maybe some other public repo that uses the i915 version of this interface
> > > (IGT, maybe?). I'd venture to guess that others trying to call this
> > > code are also going to be searching for these definitions in
> > > `drivers/gpu/drm/xe` as well.
> >
> > Yes, they are present in IGT too:
> >
> > https://patchwork.freedesktop.org/patch/630656/?series=143030&rev=1
> >
> > Would that work for you, or would you prefer them in the kernel? Just trying to
> > get an idea right now, not deciding one way or another.
>
> I think it'd be more convenient to have them in the Xe driver itself, since most
> userspace users are going to already be looking there. They'd have to really
> Google around to find those definitions in IGT. I understand that it cleans up
> the code a bit to remove them (since we're only using their size), but keeping
> them makes this code more self-documenting.
All right. Harish, let's leave them in.
Thanks.
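
To illustrate why the layout is useful to a userspace reader, here is a
minimal sketch of decoding one Xe2 record with shifts and masks rather than
bitfields. It assumes a little-endian host and the bit positions from the
kernel-doc comment quoted below; struct xe2_sample, get_bits() and
decode_xe2_record() are hypothetical names, not part of the driver or of IGT.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XE2_RECORD_SIZE 64 /* 128 bits of fields plus unused[6] */

/* Hypothetical unpacked form; field names mirror the kernel struct. */
struct xe2_sample {
	uint32_t ip_addr;
	uint8_t tdr, other, control, pipestall, send;
	uint8_t dist_acc, sbid, sync, inst_fetch, active;
	uint8_t ex_id, end_flag;
};

/* Extract len bits (1..32) starting at bit 'start' of a 128-bit value. */
static uint64_t get_bits(const uint64_t w[2], unsigned start, unsigned len)
{
	unsigned word = start / 64, shift = start % 64;
	uint64_t v;

	assert(len >= 1 && len <= 32 && start + len <= 128);
	v = w[word] >> shift;
	if (shift + len > 64)		/* field straddles the two words */
		v |= w[1] << (64 - shift);
	return v & ((1ULL << len) - 1);
}

/* Decode the first 16 bytes of a record; the trailing 48 bytes are unused. */
static void decode_xe2_record(const unsigned char *rec, struct xe2_sample *s)
{
	uint64_t w[2];

	memcpy(w, rec, sizeof(w));	/* assumes a little-endian host */
	s->ip_addr    = (uint32_t)get_bits(w, 0, 29);
	s->tdr        = (uint8_t)get_bits(w, 29, 8);
	s->other      = (uint8_t)get_bits(w, 37, 8);
	s->control    = (uint8_t)get_bits(w, 45, 8);
	s->pipestall  = (uint8_t)get_bits(w, 53, 8);
	s->send       = (uint8_t)get_bits(w, 61, 8);	/* bits 61 to 68 */
	s->dist_acc   = (uint8_t)get_bits(w, 69, 8);
	s->sbid       = (uint8_t)get_bits(w, 77, 8);
	s->sync       = (uint8_t)get_bits(w, 85, 8);
	s->inst_fetch = (uint8_t)get_bits(w, 93, 8);
	s->active     = (uint8_t)get_bits(w, 101, 8);
	s->ex_id      = (uint8_t)get_bits(w, 109, 3);
	s->end_flag   = (uint8_t)get_bits(w, 112, 1);
}
```

A reader would presumably call decode_xe2_record() once per XE2_RECORD_SIZE
bytes of sample data. Explicit shifts avoid depending on how a given
compiler lays out a packed bitfield that crosses a 64-bit boundary, as
send_count does here.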
--
Ashutosh
> >
> >
> >
> > >
> > > >
> > > > > +
> > > > > static u64 per_xecore_buf_size = SZ_512K;
> > > > >
> > > > > static unsigned long
> > > > > @@ -83,6 +119,8 @@ xe_eu_stall_data_record_size(struct xe_device *xe)
> > > > >
> > > > > if (platform == XE_PVC)
> > > > > record_size = sizeof(struct xe_eu_stall_data_pvc);
> > > > > + else if ((platform == XE_LUNARLAKE) || (platform == XE_BATTLEMAGE))
> > > >
> > > > 'else if (GRAPHICS_VER(xe) >= 20)' so that we don't have to keep adding
> > > > each individual platform.
> > > >
> > > > > + record_size = sizeof(struct xe_eu_stall_data_xe2);
> > > > >
> > > > > return record_size;
> > > > > }
> > > > > @@ -311,10 +349,16 @@ eu_stall_data_buf_check(struct xe_eu_stall_data_stream *stream)
> > > > > static void
> > > > > clear_dropped_eviction_line_bit(struct xe_gt *gt, u16 group, u16 instance)
> > > > > {
> > > > > + struct xe_device *xe = gt_to_xe(gt);
> > > > > u32 write_ptr_reg;
> > > > >
> > > > > - /* On PVC, the overflow bit has to be cleared by writing 1 to it. */
> > > > > - write_ptr_reg = _MASKED_BIT_ENABLE(XEHPC_EUSTALL_REPORT_OVERFLOW_DROP);
> > > > > + /* On PVC, the overflow bit has to be cleared by writing 1 to it.
> > > > > + * On other GPUs, the bit has to be cleared by writing 0 to it.
> > > > > + */
> > > > > + if (GRAPHICS_VER(xe) >= 20)
> > > > > + write_ptr_reg = _MASKED_BIT_DISABLE(XEHPC_EUSTALL_REPORT_OVERFLOW_DROP);
> > > > > + else
> > > > > + write_ptr_reg = _MASKED_BIT_ENABLE(XEHPC_EUSTALL_REPORT_OVERFLOW_DROP);
> > > > >
> > > > > xe_gt_mcr_unicast_write(gt, XEHPC_EUSTALL_REPORT, write_ptr_reg, group, instance);
> > > > > }
> > > > > @@ -882,7 +926,9 @@ static const struct file_operations fops_eu_stall = {
> > > > >
> > > > > static inline bool has_eu_stall_sampling_support(struct xe_device *xe)
> > > > > {
> > > > > - return ((xe->info.platform == XE_PVC) ? true : false);
> > > > > + return ((xe->info.platform == XE_PVC ||
> > > > > + xe->info.platform == XE_LUNARLAKE ||
> > > > > + xe->info.platform == XE_BATTLEMAGE) ? true : false);
> > > >
> > > > Same here, use (GRAPHICS_VER(xe) >= 20).
> > > >
> > > > > }
> > > > >
> > > > > /**
> > > > > --
> > > > > 2.47.1
> > > > >