[Intel-xe] [PATCH 13/21] drm/xe/uapi: Multiplex PERF ops through a single PERF ioctl
Dixit, Ashutosh
ashutosh.dixit at intel.com
Thu Oct 5 23:18:03 UTC 2023
On Thu, 05 Oct 2023 11:27:08 -0700, Umesh Nerlige Ramappa wrote:
>
> On Thu, Oct 05, 2023 at 08:22:30AM -0700, Dixit, Ashutosh wrote:
> > On Wed, 04 Oct 2023 22:27:14 -0700, Dixit, Ashutosh wrote:
> >>
> >> On Tue, 03 Oct 2023 19:23:24 -0700, Umesh Nerlige Ramappa wrote:
> >> >
> >> > On Tue, Sep 19, 2023 at 09:10:41AM -0700, Ashutosh Dixit wrote:
> >> > > Since we are already mulitplexing multiple perf counter stream types
> >> > > through the PERF layer, it seems odd to retain separate ioctls for perf
> >> > > op's (add/remove config). In fact it seems logical to also multiplex these
> >> > > ops through a single PERF ioctl. This also affords greater flexibility to
> >> > > add stream specific ops if needed for different perf stream types.
> >> > >
> >> > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit at intel.com>
> >> > > ---
> >> > > drivers/gpu/drm/xe/xe_device.c | 5 +----
> >> > > drivers/gpu/drm/xe/xe_perf.c | 32 ++++++++------------------------
> >> > > drivers/gpu/drm/xe/xe_perf.h | 4 +---
> >> > > include/uapi/drm/xe_drm.h | 16 ++++++++++------
> >> > > 4 files changed, 20 insertions(+), 37 deletions(-)
> >> > >
> >> > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> >> > > index 770b9fe6e65df..24018a0801788 100644
> >> > > --- a/drivers/gpu/drm/xe/xe_device.c
> >> > > +++ b/drivers/gpu/drm/xe/xe_device.c
> >> > > @@ -115,10 +115,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
> >> > > DRM_RENDER_ALLOW),
> >> > > DRM_IOCTL_DEF_DRV(XE_VM_MADVISE, xe_vm_madvise_ioctl, DRM_RENDER_ALLOW),
> >> > >
> >> > > - DRM_IOCTL_DEF_DRV(XE_PERF_OPEN, xe_perf_open_ioctl, DRM_RENDER_ALLOW),
> >> > > - DRM_IOCTL_DEF_DRV(XE_PERF_ADD_CONFIG, xe_perf_add_config_ioctl, DRM_RENDER_ALLOW),
> >> > > - DRM_IOCTL_DEF_DRV(XE_PERF_REMOVE_CONFIG, xe_perf_remove_config_ioctl, DRM_RENDER_ALLOW),
> >> > > -
> >> > > + DRM_IOCTL_DEF_DRV(XE_PERF, xe_perf_ioctl, DRM_RENDER_ALLOW),
> >> > > };
> >> > >
> >> > > static const struct file_operations xe_driver_fops = {
> >> > > diff --git a/drivers/gpu/drm/xe/xe_perf.c b/drivers/gpu/drm/xe/xe_perf.c
> >> > > index 0f747af59f245..f8d7eae8fffe0 100644
> >> > > --- a/drivers/gpu/drm/xe/xe_perf.c
> >> > > +++ b/drivers/gpu/drm/xe/xe_perf.c
> >> > > @@ -6,37 +6,21 @@
> >> > > #include "xe_oa.h"
> >> > > #include "xe_perf.h"
> >> > >
> >> > > -int xe_perf_open_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> >> > > +int xe_oa_ioctl(struct drm_device *dev, struct drm_xe_perf_param *arg, struct drm_file *file)
> >> > > {
> >> > > - struct drm_xe_perf_param *arg = data;
> >> > > -
> >> > > - if (arg->extensions)
> >> > > - return -EINVAL;
> >> > > -
> >> > > - switch (arg->perf_type) {
> >> > > - case XE_PERF_TYPE_OA:
> >> > > + switch (arg->perf_op) {
> >> > > + case XE_PERF_STREAM_OPEN:
> >> > > return xe_oa_stream_open_ioctl(dev, (void *)arg->param, file);
> >> >
> >> > It's a nice idea to reduce the ioctls, but if your struct drm_xe_perf_param
> >> > *arg is overloaded based on the PERF_OP passed, then I would recommend
> >> > validating that the right arg is passed for the corresponding OP.
> >>
> >> I am not following what you mean here: which right arg for which OP?
> >>
> >> The PERF layer only demultiplexes based on perf_type (say OA/XYZ etc.). The
> >> perf_op belongs to the perf_type layer (say OA), not the PERF layer. It is
> >> the job of the perf_type layer (OA) to validate the perf_op, not the job of
> >> the PERF layer. It is just convenient to include the perf_op as part of
> >> 'struct drm_xe_perf_param' (rather than inventing yet another layer there).
> >> See the function xe_perf_ioctl() in the patch.
> >>
> >> The xe_oa_ioctl function above could possibly be moved into xe_oa.c. I just
> >> left it in xe_perf.c since it didn't seem to matter much. But I am open to
> >> doing that.
> >
> > OK, I think I figured out the right way to visualize this. It's as
> > follows. Let's say we have a an OA stream inside the PERF layer. So what we
> > have is:
> >
> > struct drm_xe_perf_param {
> > perf_type;
> >
> > struct oa {
> > oa_op;
> >
> > struct oa_op_params {
> > ...
> > }
> > }
> > }
> >
> > So basically I have eliminated 'struct oa' and merged into 'struct
> > drm_xe_perf_param'. But oa_op still belongs to the OA layer, not the PERF
> > layer. So the oa layer handles the oa_op not the PERF layer.
Umesh brought up an excellent point. When UMD and KMD versions are not the
same and struct sizes change, UMD and KMD have different values for 'struct
oa_op_params' above. To handle this situation, we will add a 'size' field
to 'struct oa' above (in reality to 'struct drm_xe_perf_param'). UMD can
fill in this field to indicate its notion of the 'struct oa_op_params'
size.
This should enable the kernel to take handle the discrepancy in the param
size between userspace and kernel (if needed, possibly in the future). It
may also be possible to move the additional copy_from_user into the perf
layer, similar to what drm_ioctl() does. But we will start by adding 'size'
to the header.
> >> > Ideally I wouldn't go that route since that would require some sort of
> >> > signature in the arg which would identify it as the correct
> >> > param. Instead I would be okay with retaining separate ioctls for the 3
> >> > operations.
> >>
> >> If we were not doing this multiplexing based on perf_type (as in i915) we
> >> could have separate ioctl's for each operation. But since here we have
> >> anyway introduced a multiplxing layer, to me it makes no sense to have
> >> separate operation ioctl's (only disadvantags and no advantages). (Note
> >> that the multiplexing layer implies a (non-obvious) additional
> >> copy_from_user per operation visible in the previous "drm/xe/uapi: "Perf"
> >> layer to support multiple perf counter stream types" patch).
> >
> > The drm layer does a copy_from_user for the first layer but any second
> > layer structs need to be copy_from_user'd by the driver.
> >
> >>
> >> Also we cannot assume that a future stream type will only have 3 operations
> >> as i915 OA did. The OPEN/ADD_CONFIG/CLOSE are really OA specific
> >> operations. But it appears other potential perf_type's will also be able to
> >> use them, at least initially that is why they are left defined as PERF_OP's
> >> (rather than OA_OP's) in xe_drm.h. New stream types are free to introduce
> >> new ops in this design.
> >>
> >> So retaining the ops inside a single PERF ioctl eliminates the need for
> >> introducing a new ioctl each time a stream type introduces a new OP.
>
> I think I misunderstood. This is fine as long as the underlying layer is
> able to validate the arguments.
More information about the Intel-xe
mailing list