[PATCH v12 0/8] Add support for EU stall sampling

Dixit, Ashutosh ashutosh.dixit at intel.com
Wed Feb 26 19:43:22 UTC 2025


On Tue, 25 Feb 2025 18:34:04 -0800, Dixit, Ashutosh wrote:
>
> On Tue, 25 Feb 2025 17:47:04 -0800, Harish Chegondi wrote:
> >
> > This patch series adds support for EU stall sampling, a hardware
> > feature first introduced in PVC and supported on Xe2 and later
> > GPU architectures. The feature enables capturing EU stall data,
> > which includes the instruction pointer (IP) address of the stalled
> > instruction and various stall reason counts.
> >
> > Support for this feature is being added into Mesa:
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142
> >
> > New IGT tests for EU stall sampling are being added:
> > https://patchwork.freedesktop.org/series/143030/
> >
> > This patch series has undergone basic testing with the new IGT tests.
> >
> > Issues that need investigation:
> > 1. Blocking reads with small user buffers may block even when EU
> > stall data is available in the kernel buffer. A previous read sets
> > pollin to false even though the kernel buffer still holds data that
> > did not fit in the user buffer.
>
> The series is now completely reviewed. Planning to merge this once CI
> completes on this version:

Done! EU stall series is now merged upstream.

>
> Reviewed-by: Ashutosh Dixit <ashutosh.dixit at intel.com>
>
> >
> > Thank You.
> >
> > v12: a. Move check for EU stall support to a header file
> >      b. Move 'goto exit_drop;' to the next if statement
> > v11: a. Lock optimization
> >      b. Moved around code as per review feedback
> > v10: a. Fixed error rewinding code
> >      b. Used cancel_delayed_work_sync() instead of flush_delayed_work()
> >      c. Replaced per xecore lock with a lock for all the xecore buffers
> >      d. Remove function description for static functions.
> >      e. Use extension number while parsing chain of extensions.
> >      f. Moved code around as per review feedback
> > v9: a. Split the big patch in v8 into two patches
> >     b. Moved all drop data handling code into one patch
> >     c. Several other code improvements as mentioned in the patches
> > v8: a. Used div_u64() instead of / to fix 32-bit build issue.
> >     b. Changed copyright year in new files to 2025.
> >     c. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
> >     d. Renamed struct drm_xe_eu_stall_data_xe2 to struct xe_eu_stall_data_xe2
> >
> > v7: a. Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
> >        to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
> >        OA. Renamed the corresponding internal variables.
> >     b. Fixed some commit messages based on review feedback.
> >     c. Changed sampling_rates from a pointer to flexible array.
> >
> > v6: a. Changed the uAPI input to accept sampling rate in GPU cycles
> >        instead of sampling rate multiplier.
> >     b. Fix buffer wrap-around overwrite bug (Matt Olson).
> >     c. Include EU stall sampling rates information and per XeCore
> >        buffer size in the query information.
> >
> > v5: Addressed review feedback from v4 including
> >     a. Removed DRM_XE_EU_STALL_PROP_POLL_PERIOD from the uAPI (Ashutosh)
> >     b. Separated the patches for Xe_HPC and Xe2 (Matt R)
> >     c. Moved read() returning -EIO into a separate patch
> >     d. Removed spinlocks around set_bit() and clear_bit() (Matt R)
> >     e. Renamed several variables, structures and enums (Ashutosh and
> >        Matt R)
> >     f. Addressed other review feedback.
> > v4: Addressed review feedback from v3 including
> >     a. Split the patch into multiple patches (Matt R)
> >     b. Added a new device query to get EU stall info (Ashutosh)
> >     c. Renamed all Dss to xecore (Matt R)
> >     d. Removed buffer size and disable at open input properties. (Matt R)
> >     e. Removed the "_SHIFT" macros (Matt R)
> >     f. Allocate the EU stall buffer only on system memory.
> >     g. Changed the workarounds to OOB (Matt R)
> >     h. Other review feedback.
> > v3: a. Removed data header and changed read() to return -EIO when
> >        data is dropped by the HW.
> >     b. Added a new DRM_XE_OBSERVATION_IOCTL_INFO to query EU stall data record info
> >     c. Added struct drm_xe_eu_stall_data_pvc and struct drm_xe_eu_stall_data_xe2
> >        to xe_drm.h. These declarations would help user space to parse the
> >        EU stall data
> >     d. Addressed other review comments from v2
> > v2: Rename xe perf layer as xe observation layer (Ashutosh)
> >
> > Test-with: cover.1739901972.git.harish.chegondi at intel.com
> >
> > Reviewed-by: Ben Olson <matthew.olson at intel.com>
> > Acked-by: Felix Degrood <felix.j.degrood at intel.com>
> > Signed-off-by: Harish Chegondi <harish.chegondi at intel.com>
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit at intel.com>
> >
> > Harish Chegondi (8):
> >   drm/xe/topology: Add a function to find the index of the last enabled
> >     DSS in a mask
> >   drm/xe/uapi: Introduce API for EU stall sampling
> >   drm/xe/eustall: Add support to init, enable and disable EU stall
> >     sampling
> >   drm/xe/eustall: Add support to read() and poll() EU stall data
> >   drm/xe/eustall: Add support to handle dropped EU stall data
> >   drm/xe/eustall: Add EU stall sampling support for Xe2
> >   drm/xe/uapi: Add a device query to get EU stall sampling information
> >   drm/xe/eustall: Add workaround 22016596838 which applies to PVC.
> >
> >  drivers/gpu/drm/xe/Makefile                |   1 +
> >  drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h |  29 +
> >  drivers/gpu/drm/xe/xe_eu_stall.c           | 960 +++++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_eu_stall.h           |  24 +
> >  drivers/gpu/drm/xe/xe_gt.c                 |   5 +
> >  drivers/gpu/drm/xe/xe_gt_topology.h        |  13 +
> >  drivers/gpu/drm/xe/xe_gt_types.h           |   3 +
> >  drivers/gpu/drm/xe/xe_observation.c        |  14 +
> >  drivers/gpu/drm/xe/xe_query.c              |  43 +
> >  drivers/gpu/drm/xe/xe_trace.h              |  30 +
> >  drivers/gpu/drm/xe/xe_wa_oob.rules         |   1 +
> >  include/uapi/drm/xe_drm.h                  |  74 ++
> >  12 files changed, 1197 insertions(+)
> >  create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.c
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.h
> >
> > --
> > 2.48.1
> >
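
For anyone following up on the open issue listed in the cover letter
(blocking reads with small user buffers), here is a minimal user-space
sketch of one possible reading of it. It is not the xe driver code:
struct stall_stream, wake_threshold and all three functions are
hypothetical names, and the "recheck" variant is only one way the flag
could be handled, not necessarily what the driver does or will do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a stream's kernel buffer state (not xe code). */
struct stall_stream {
	size_t bytes_available;	/* unread data held in the kernel buffer */
	size_t wake_threshold;	/* assumed minimum data to report readable */
	bool pollin;		/* a blocking read sleeps while this is false */
};

/* Copy out at most user_buf_size bytes; returns the number consumed. */
static size_t consume(struct stall_stream *s, size_t user_buf_size)
{
	size_t n = s->bytes_available < user_buf_size ?
		   s->bytes_available : user_buf_size;

	s->bytes_available -= n;
	return n;
}

/* Behaviour described in the issue: pollin is cleared on every read. */
static size_t read_clear_always(struct stall_stream *s, size_t user_buf_size)
{
	size_t n = consume(s, user_buf_size);

	s->pollin = false;
	return n;
}

/* Assumed alternative: keep pollin set while enough data remains. */
static size_t read_recheck(struct stall_stream *s, size_t user_buf_size)
{
	size_t n = consume(s, user_buf_size);

	s->pollin = s->bytes_available >= s->wake_threshold;
	return n;
}

/* True when the next blocking read would sleep. */
static bool would_block(const struct stall_stream *s)
{
	return !s->pollin;
}
```

With 256 bytes buffered and a 64-byte user buffer, the first variant
leaves 192 bytes unread yet the next blocking read would sleep, while
the second variant lets it proceed immediately.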
