[Mesa-dev] [RFC 0/6] i965: INTEL_performance_query re-work

Samuel Pitoiset samuel.pitoiset at gmail.com
Wed May 6 01:36:15 PDT 2015

On 05/06/2015 02:53 AM, Robert Bragg wrote:
> As we've learned more about the observability capabilities of Gen
> graphics we've found that it's not enough to only try and configure the
> OA unit from userspace without any dedicated support from the kernel.

Hi Robert,

Yeah, this is the same idea for performance counters on Nouveau.

We also need dedicated kernel support for configuring and sampling
hardware performance counters. We can then expose the list of
available counters through a set of ioctls, and Mesa configures a
hardware event by sending its "configuration" to the kernel.

> As it is currently the i965 backends for both AMD_performance_monitor
> and INTEL_performance_query aren't able to report normalized metrics
> useful to application developers due to the limitations of configuring
> the OA unit from userspace via LRIs.
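
To illustrate what "normalized" means here, a minimal sketch: a raw
counter only becomes useful once its delta is divided by the delta of
a reference clock counter over the same interval. The helper names and
the 32-bit counter width are assumptions for illustration and are not
taken from the patches.

```c
#include <stdint.h>

/* Difference between two snapshots of a free-running counter.
 * ASSUMPTION: counters are 32 bits wide. Unsigned arithmetic wraps
 * modulo 2^32, so a single overflow between the two snapshots is
 * handled without any special case. */
static uint32_t counter_delta(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Express an activity counter as a percentage of the clock ticks that
 * elapsed over the same interval (hypothetical helper). */
static double normalize_percent(uint32_t busy_start, uint32_t busy_end,
                                uint32_t clk_start, uint32_t clk_end)
{
    uint32_t clocks = counter_delta(clk_start, clk_end);

    if (clocks == 0)
        return 0.0; /* no time elapsed, avoid dividing by zero */

    return 100.0 * (double)counter_delta(busy_start, busy_end) /
           (double)clocks;
}
```

Both snapshots have to come from the same report pair, otherwise the
ratio mixes two different time intervals.
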
> More recently we've developed a perf PMU (performance monitoring unit)
> driver within the drm i915 driver ("i915_oa") that lets userspace
> configure and open an event fd via the perf_event_open syscall which
> provides us a more complete interface for configuring the Gen graphics
> OA unit.
> With help from the kernel we can support periodic sampling (where the
> hardware writes reports into a gpu mapped circular buffer that we can
> forward as perf samples), we can deal with the clock gating + PM
> limitations imposed by the observability hw and also manage + maintain
> the selection of performance counters.
> The perf_event_open(2) man page is a good starting point for anyone
> wanting to learn about the Linux perf interface. Something to beware of
> is that there's currently no precedent upstream for exposing device
> metrics via a perf PMU and although early feedback was sought for this
> work, some of this may be subject to change based on feedback from the
> core perf maintainers as well as the i915 drm driver maintainers.
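
To make the perf side concrete, here is a minimal sketch of how
userspace usually opens an event on a dynamically registered PMU: look
up the PMU's type id in sysfs, then pass it to perf_event_open. The
helper names, the choice of PERF_SAMPLE_RAW, the sample period and the
meaning of the config value are assumptions for illustration; the real
i915_oa attributes are defined by the kernel branch, not by this
snippet.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the dynamic PMU type id that the kernel publishes in sysfs.
 * Returns -1 if no PMU with that name is registered. */
static int read_pmu_type(const char *pmu_name)
{
    char path[256];
    int type = -1;
    FILE *f;

    assert(pmu_name != NULL);
    snprintf(path, sizeof(path),
             "/sys/bus/event_source/devices/%s/type", pmu_name);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &type) != 1)
        type = -1;
    fclose(f);
    return type;
}

/* glibc has no wrapper for perf_event_open, so go through syscall(2). */
static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
                           int cpu, int group_fd, unsigned long flags)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd,
                        flags);
}

/* Open a system-wide event on CPU 0 for the named PMU (hypothetical
 * helper). ASSUMPTION: 'config' is an opaque, device-specific value.
 * Returns an event fd, or -1 on failure. */
static int open_pmu_event(const char *pmu_name, unsigned long long config)
{
    struct perf_event_attr attr;
    int type = read_pmu_type(pmu_name);

    if (type < 0)
        return -1;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = (unsigned int)type;
    attr.config = config;
    attr.sample_type = PERF_SAMPLE_RAW; /* ask for raw sample payloads */
    attr.sample_period = 1;
    attr.disabled = 1;                  /* enable later with an ioctl */

    /* pid == -1 and cpu == 0: all processes, measured on CPU 0. */
    return perf_event_open(&attr, -1, 0, -1, 0);
}
```

Something like open_pmu_event("i915_oa", 0) would be the obvious call,
assuming the PMU registers under that name in sysfs. It returns -1 if
the PMU is missing or the kernel rejects the attributes.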

Performance counters on Nouveau won't be exposed through perf (at
least not in the near future), since they need to be tied to the
GPU's command stream, and perf only works with ioctl calls.

> This PRM is a good starting point for anyone wanting to learn about the
> Gen graphics Observability hardware. Some important information is
> currently missing and this should be updated soon, but that's more
> directly related to the i915_oa perf driver. Notably though the report
> formats described here need to be understood by Mesa, since the perf
> samples simply forward the raw reports from the OA hardware.
> https://01.org/sites/default/files/documentation/observability_performance_counters_haswell.pdf
> This series re-works the i965 driver's support for exposing performance
> counters, taking advantage of this i915_oa perf event interface.
> A corresponding kernel branch with an initial i915_oa driver for Haswell
> can be found here:
> https://github.com/rib/linux  wip/rib/oa-hsw-4.0.0
> A corresponding libdrm branch can be found here:
> https://github.com/rib/drm  wip/rib/oa-hsw-4.0.0
> In case it's helpful to see another example using the i915_oa perf
> interface I've also been developing a 'gputop' tool that both lets me
> test the INTEL_performance_query interface to collect per-context
> metrics from Mesa and can also visualize system wide metrics (i.e.
> across all gpu contexts) using perf directly:
> https://github.com/rib/gputop

This is pretty good for testing OA counters without Mesa.

> Although I haven't updated the branches in a while, I could share some
> initial code adding support for Broadwell if anyone's interested to get
> a sense of what's involved in supporting later hardware generations.
> I still anticipate some (hopefully relatively minor) tweaking of
> implementation details based on review feedback for the i915_oa driver,
> but I hope that this is a good point to ask for some feedback on the
> Mesa changes.
> If it's more convenient, these patches can also be fetched from here:
> https://github.com/rib/mesa  wip/rib/oa-hsw-4.0.0

Great work, Robert. :-)

I'll try to give you my feedback in the next few days.

> Regards,
> - Robert
>
> Robert Bragg (6):
>    i965: Remove perf monitor/query backend
>    Separate INTEL_performance_query frontend
>    Model INTEL perf query backend after query object BE
>    i965: Implement INTEL_performance_query extension
>    i965: Expose OA counters via INTEL_performance_query
>    i965: Adds further support for "3D" OA counters
>
>   src/mapi/glapi/gen/gl_genexec.py                   |    1 +
>   src/mesa/Makefile.sources                          |    2 +
>   src/mesa/drivers/dri/i965/Makefile.sources         |    2 +-
>   src/mesa/drivers/dri/i965/brw_context.c            |    5 +-
>   src/mesa/drivers/dri/i965/brw_context.h            |  101 +-
>   .../drivers/dri/i965/brw_performance_monitor.c     | 1472 ------------
>   src/mesa/drivers/dri/i965/brw_performance_query.c  | 2356 ++++++++++++++++++++
>   src/mesa/drivers/dri/i965/intel_batchbuffer.c      |   10 +-
>   src/mesa/drivers/dri/i965/intel_extensions.c       |   69 +-
>   src/mesa/main/context.c                            |    3 +
>   src/mesa/main/dd.h                                 |   39 +
>   src/mesa/main/mtypes.h                             |   28 +
>   src/mesa/main/performance_monitor.c                |  579 -----
>   src/mesa/main/performance_monitor.h                |   39 -
>   src/mesa/main/performance_query.c                  |  608 +++++
>   src/mesa/main/performance_query.h                  |   80 +
>   16 files changed, 3197 insertions(+), 2197 deletions(-)
>   delete mode 100644 src/mesa/drivers/dri/i965/brw_performance_monitor.c
>   create mode 100644 src/mesa/drivers/dri/i965/brw_performance_query.c
>   create mode 100644 src/mesa/main/performance_query.c
>   create mode 100644 src/mesa/main/performance_query.h
