[PATCH v8 3/7] drm/xe/eustall: Implement EU stall sampling APIs for Xe_HPC

Dixit, Ashutosh ashutosh.dixit at intel.com
Sat Jan 18 02:34:28 UTC 2025


On Wed, 15 Jan 2025 12:02:09 -0800, Harish Chegondi wrote:
>

Hi Harish,

I've just started reviewing this large patch. Just sending out initial
comments for now in case you want to start looking at the coments and work
on it meanwhile. I will pick up reviewing it again next week.

> Implement EU stall sampling APIs introduced in the previous patch for
> Xe_HPC (PVC). Add register definitions and the code that accesses these
> registers to the APIs.
>
> Add initialization and clean up functions and their implementations,
> EU stall enable and disable functions, poll() and read() implementations.
>
> A timer thread periodically polls the EU stall data buffer write pointer
> registers to look for any new data and caches the write pointer. The read
> function compares the cached read and write pointers and copies any new
> data to the user space.
>
> v8: Updated copyright year in xe_eu_stall_regs.h to 2025.
>     Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
>     since it is a local structure.
> v6: Fix buffer wrap around over write bug (Matt Olson)

Once again, overall advice for the patch: think carefully if a static
function needs documentation. And if it needs documentation, does it need
each parameter described, even if that conveys 0 information. My
recommendation is to only include documentation for functions called by
other files (function exported via the .h). At least let's not follow a
rule where every function has a header.

This is towards kernel recommendation of minimizing documentation and
having self-documenting code.

>
> Signed-off-by: Harish Chegondi <harish.chegondi at intel.com>
> ---
>  drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h |  29 +
>  drivers/gpu/drm/xe/xe_eu_stall.c           | 759 ++++++++++++++++++++-
>  drivers/gpu/drm/xe/xe_eu_stall.h           |  43 ++
>  drivers/gpu/drm/xe/xe_gt.c                 |   6 +
>  drivers/gpu/drm/xe/xe_gt_types.h           |   3 +
>  drivers/gpu/drm/xe/xe_trace.h              |  33 +
>  6 files changed, 847 insertions(+), 26 deletions(-)

This patch is also a little bit too big. I think at least it can be split
into two: the read and the initialization parts can be separated
out. Anyway for now I will plod on with this review.

>  create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h
>
> diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> index 48dcc7cb7791..c388d733b857 100644
> --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> @@ -8,15 +8,27 @@
>  #include <linux/poll.h>
>  #include <linux/fs.h>
>
> +#include <drm/drm_drv.h>
>  #include <uapi/drm/xe_drm.h>
>
> +#include "xe_bo.h"
> +#include "xe_pm.h"
> +#include "xe_trace.h"
>  #include "xe_macros.h"
>  #include "xe_device.h"
> +#include "xe_gt_mcr.h"
>  #include "xe_eu_stall.h"
>  #include "xe_gt_printk.h"
> +#include "xe_force_wake.h"
>  #include "xe_gt_topology.h"
>  #include "xe_observation.h"

#includes should be in alphabetical order. See other files.

>
> +#include "regs/xe_gt_regs.h"
> +#include "regs/xe_eu_stall_regs.h"
> +
> +#define POLL_FREQUENCY_HZ 100
> +#define POLL_PERIOD_NS (NSEC_PER_SEC / POLL_FREQUENCY_HZ)
> +
>  /**
>   * struct eu_stall_open_properties - EU stall sampling properties received
>   *				     from user space at open.
> @@ -31,6 +43,50 @@ struct eu_stall_open_properties {
>	struct xe_gt *gt;
>  };
>
> +/**
> + * struct xe_eu_stall_data_pvc - EU stall data format for PVC
> + *
> + * Bits		Field
> + * 0  to 28	IP (addr)
> + * 29 to 36	active count
> + * 37 to 44	other count
> + * 45 to 52	control count
> + * 53 to 60	pipestall count
> + * 61 to 68	send count
> + * 69 to 76	dist_acc count
> + * 77 to 84	sbid count
> + * 85 to 92	sync count
> + * 93 to 100	inst_fetch count
> + */
> +struct xe_eu_stall_data_pvc {
> +	__u64 ip_addr:29;
> +	__u64 active_count:8;
> +	__u64 other_count:8;
> +	__u64 control_count:8;
> +	__u64 pipestall_count:8;
> +	__u64 send_count:8;
> +	__u64 dist_acc_count:8;
> +	__u64 sbid_count:8;
> +	__u64 sync_count:8;
> +	__u64 inst_fetch_count:8;
> +	__u64 unused_bits:27;
> +	__u64 unused[6];
> +} __packed;

What is the purpose of including these data structs in the code? Afaict
they are not used anywhere, except for sizeof()? You might as well as just
store the sizeof in a table?

> +
> +static u64 per_xecore_buf_size = SZ_512K;
> +
> +static unsigned long
> +xe_eu_stall_data_record_size(struct xe_device *xe)
> +{
> +	enum xe_platform platform = xe->info.platform;

Don't use unnecessary variables, just use xe->info.platform.

> +	unsigned long record_size = 0;

Don't initialize I think, add an else with xe_gt_WARN_ON, if that works.

> +
> +	if (platform == XE_PVC)

Use GRAPHICS_VERx100 or GRAPHICS_VER, so that later you can just say 'if
(GRAPHICS_VER(xe) >= 20)' to avoid changing code for each platform.

PVC is GRAPHICS_VERx100(xe) 1260, though just for PVC you can use
xe->info.platform.

> @@ -265,19 +982,9 @@ int xe_eu_stall_stream_open(struct drm_device *dev,
>		return -EINVAL;
>	}
>
> -	if (xe_observation_paranoid && !perfmon_capable()) {
> -		xe_gt_dbg(props.gt, "Insufficient privileges for EU stall monitoring\n");
> -		return -EACCES;
> -	}
> +	mutex_lock(&props.gt->eu_stall->lock);
> +	ret = xe_eu_stall_stream_open_locked(dev, &props, file);
> +	mutex_unlock(&props.gt->eu_stall->lock);
>
> -	if (!has_eu_stall_sampling_support(xe)) {
> -		xe_gt_dbg(props.gt, "EU stall monitoring is not supported on this platform\n");
> -		return -EPERM;
> -	}
> -	stream_fd = anon_inode_getfd("[xe_eu_stall]", &fops_eu_stall,
> -				     NULL, 0);
> -	if (stream_fd < 0)
> -		xe_gt_dbg(props.gt, "EU stall inode get fd failed : %d\n", stream_fd);
> -
> -	return stream_fd;

As I mentioned in the other patch, we need to see why is so much code
moving around? Generally we should not have to delete lines of code in
these patches.

Ashutosh


More information about the Intel-xe mailing list