[Mesa-dev] [RFC 0/6] i965: INTEL_performance_query re-work

Robert Bragg robert at sixbynine.org
Tue May 12 09:18:20 PDT 2015


Petri Latval wrote some piglit tests for the core
INTEL_performance_query behaviour (i.e. not assuming anything about the
semantics of the counters themselves) and sent them to the list some
time last year, but unfortunately they fell through the cracks and
never landed.

The other week I revived those tests to check my code before sending it
out. I just need to fix up the commit messages to use Petri's comments
and keep his authorship, and then I can resend them to the list.

Regards,
- Robert

On Mon, May 11, 2015 at 4:09 PM, Samuel Pitoiset
<samuel.pitoiset at gmail.com> wrote:
> Did you write piglit tests according to what the spec says, btw?
>
> On 05/06/2015 02:53 AM, Robert Bragg wrote:
>>
>> As we've learned more about the observability capabilities of Gen
>> graphics we've found that it's not enough to only try and configure the
>> OA unit from userspace without any dedicated support from the kernel.
>>
>> As things currently stand, the i965 backends for both
>> AMD_performance_monitor and INTEL_performance_query aren't able to
>> report normalized metrics useful to application developers, due to the
>> limitations of configuring the OA unit from userspace via LRIs.
>>
>> More recently we've developed a perf PMU (performance monitoring unit)
>> driver within the drm i915 driver ("i915_oa") that lets userspace
>> configure and open an event fd via the perf_event_open syscall which
>> provides us a more complete interface for configuring the Gen graphics
>> OA unit.
>>
>> With help from the kernel we can support periodic sampling (where the
>> hardware writes reports into a gpu mapped circular buffer that we can
>> forward as perf samples), we can deal with the clock gating + PM
>> limitations imposed by the observability hw and also manage + maintain
>> the selection of performance counters.
>>
>> The perf_event_open(2) man page is a good starting point for anyone
>> wanting to learn about the Linux perf interface. Something to beware of
>> is that there's currently no precedent upstream for exposing device
>> metrics via a perf PMU and although early feedback was sought for this
>> work, some of this may be subject to change based on feedback from the
>> core perf maintainers as well as the i915 drm driver maintainers.
>>
>> This PRM is a good starting point for anyone wanting to learn about the
>> Gen graphics Observability hardware. Some important information is
>> currently missing and this should be updated soon, but that's more
>> directly related to the i915_oa perf driver. Notably though the report
>> formats described here need to be understood by Mesa, since the perf
>> samples simply forward the raw reports from the OA hardware.
>>
>> https://01.org/sites/default/files/documentation/
>> observability_performance_counters_haswell.pdf
>>
>> This series re-works the i965 driver's support for exposing performance
>> counters, taking advantage of this i915_oa perf event interface.
>>
>> A corresponding kernel branch with an initial i915_oa driver for Haswell
>> can be found here:
>>
>> https://github.com/rib/linux  wip/rib/oa-hsw-4.0.0
>>
>> A corresponding libdrm branch can be found here:
>>
>> https://github.com/rib/drm  wip/rib/oa-hsw-4.0.0
>>
>> In case it's helpful to see another example using the i915_oa perf
>> interface I've also been developing a 'gputop' tool that both lets me
>> test the INTEL_performance_query interface to collect per-context
>> metrics from Mesa and can also visualize system wide metrics (i.e.
>> across all gpu contexts) using perf directly:
>>
>> https://github.com/rib/gputop
>>
>> Although I haven't updated the branches in a while, I could share some
>> initial code adding support for Broadwell if anyone's interested to get
>> a sense of what's involved in supporting later hardware generations.
>>
>> I still anticipate some (hopefully relatively minor) tweaking of
>> implementation details based on review feedback for the i915_oa driver,
>> but I hope that this is a good point to ask for some feedback on the
>> Mesa changes.
>>
>> If it's more convenient, these patches can also be fetched from here:
>>
>> https://github.com/rib/mesa  wip/rib/oa-hsw-4.0.0
>>
>> Regards,
>> - Robert
>>
>> Robert Bragg (6):
>>    i965: Remove perf monitor/query backend
>>    Separate INTEL_performance_query frontend
>>    Model INTEL perf query backend after query object BE
>>    i965: Implement INTEL_performance_query extension
>>    i965: Expose OA counters via INTEL_performance_query
>>    i965: Adds further support for "3D" OA counters
>>
>>   src/mapi/glapi/gen/gl_genexec.py                   |    1 +
>>   src/mesa/Makefile.sources                          |    2 +
>>   src/mesa/drivers/dri/i965/Makefile.sources         |    2 +-
>>   src/mesa/drivers/dri/i965/brw_context.c            |    5 +-
>>   src/mesa/drivers/dri/i965/brw_context.h            |  101 +-
>>   .../drivers/dri/i965/brw_performance_monitor.c     | 1472 ------------
>>   src/mesa/drivers/dri/i965/brw_performance_query.c  | 2356 ++++++++++++++++++++
>>   src/mesa/drivers/dri/i965/intel_batchbuffer.c      |   10 +-
>>   src/mesa/drivers/dri/i965/intel_extensions.c       |   69 +-
>>   src/mesa/main/context.c                            |    3 +
>>   src/mesa/main/dd.h                                 |   39 +
>>   src/mesa/main/mtypes.h                             |   28 +
>>   src/mesa/main/performance_monitor.c                |  579 -----
>>   src/mesa/main/performance_monitor.h                |   39 -
>>   src/mesa/main/performance_query.c                  |  608 +++++
>>   src/mesa/main/performance_query.h                  |   80 +
>>   16 files changed, 3197 insertions(+), 2197 deletions(-)
>>   delete mode 100644 src/mesa/drivers/dri/i965/brw_performance_monitor.c
>>   create mode 100644 src/mesa/drivers/dri/i965/brw_performance_query.c
>>   create mode 100644 src/mesa/main/performance_query.c
>>   create mode 100644 src/mesa/main/performance_query.h
>>
>
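
For anyone unfamiliar with how userspace opens an event fd for a
dynamically registered perf PMU, as the quoted cover letter describes for
"i915_oa", here is a minimal sketch. It relies only on generic perf
behaviour: a dynamic PMU publishes its type id in a sysfs "type" file, and
perf_event_open has no glibc wrapper so it is invoked via syscall(2). The
helper names (read_pmu_type, perf_event_open_raw, open_oa_event), the use of
PERF_SAMPLE_RAW, the system-wide pid/cpu choice and the zero attr.config are
all assumptions for illustration; the real attribute encoding is defined by
the i915_oa driver and is not shown here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the dynamically assigned PMU type id from a sysfs "type" file,
 * e.g. /sys/bus/event_source/devices/<pmu-name>/type.
 * Returns 0 on success, -1 on failure. */
static int read_pmu_type(const char *path, uint32_t *type_out)
{
    assert(path != NULL && type_out != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    unsigned int type;
    int n = fscanf(f, "%u", &type);
    fclose(f);
    if (n != 1)
        return -1;

    *type_out = type;
    return 0;
}

/* glibc provides no wrapper for perf_event_open, so go through syscall(2). */
static int perf_event_open_raw(struct perf_event_attr *attr, pid_t pid,
                               int cpu, int group_fd, unsigned long flags)
{
    return (int)syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical helper: open a periodically sampled event on the PMU whose
 * type id is stored at type_path. Returns an event fd, or -1 on failure. */
static int open_oa_event(const char *type_path, uint64_t sample_period)
{
    uint32_t type;
    if (read_pmu_type(type_path, &type) != 0)
        return -1;

    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.sample_type = PERF_SAMPLE_RAW;  /* assumption: raw reports forwarded */
    attr.sample_period = sample_period;
    attr.config = 0;                     /* placeholder: driver-defined encoding */

    /* pid = -1, cpu = 0: system-wide on CPU 0 (usually needs privileges). */
    return perf_event_open_raw(&attr, -1, 0, -1, PERF_FLAG_FD_CLOEXEC);
}
```

A caller would pass the sysfs path of the PMU's "type" file to
open_oa_event() and, on success, mmap the returned fd to read samples from
the perf ring buffer; a -1 return means the PMU was not found or the kernel
rejected the attributes.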