[igt-dev] [PATCH 4/5] tools/i915-perf: Add mmapped OA buffer support to i915-perf-recorder
Umesh Nerlige Ramappa
umesh.nerlige.ramappa at intel.com
Tue Aug 24 19:14:06 UTC 2021
On Mon, Aug 23, 2021 at 06:05:05PM -0700, Dixit, Ashutosh wrote:
>On Tue, 03 Aug 2021 13:07:36 -0700, Umesh Nerlige Ramappa wrote:
>>
>> Currently, reports from the OA buffer are read from the perf_fd. The kernel
>> patches enable mmapping the OA buffer into user space to allow for faster
>> report queries across different platforms and engines.
>>
>> Enable the OA buffer to be mmapped by the recorder tool when the -M
>> command line option is given.
>
>Not completely reviewed yet but some changes are needed, please see below.
>
>> +static int gem_set_domain(int fd, uint32_t handle, uint32_t read, uint32_t write)
>> +{
>> + struct drm_i915_gem_set_domain set_domain = {
>> + .handle = handle,
>> + .read_domains = read,
>> + .write_domain = write,
>> + };
>> + int ret = 0;
>> +
>> + if (perf_ioctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &set_domain))
>
>set_domain is not available for discrete, see IGT gem_set_domain().
>
>> +static void *gem_mmap_cpu(int fd, uint32_t handle, uint64_t offset, uint64_t size,
>> + unsigned int prot)
>> +{
>> + struct drm_i915_gem_mmap arg = {
>> + .handle = handle,
>> + .offset = offset,
>> + .size = size,
>> + .addr_ptr = 0,
>> + .flags = 0,
>> + };
>> +
>> + if (perf_ioctl(fd, DRM_IOCTL_I915_GEM_MMAP, &arg))
>
>This needs to be changed to mmap_offset, DRM_IOCTL_I915_GEM_MMAP has been
>discontinued for future products.
>
>> +static void
>> +bb_emit_srm(struct bb_context *bb, uint32_t reg, uint32_t devid)
>> +{
>> + bool gen8_plus = devid >= 8;
>> +
>> + assert(bb->reloc_idx < ARRAY_SIZE(bb->reloc));
>> + assert(bb->offset < BATCH_SIZE);
>> +
>> + bb->batch[bb->offset++] = gen8_plus ? MI_STORE_REGISTER_MEM_GEN8 :
>> + MI_STORE_REGISTER_MEM;
>> + bb->batch[bb->offset++] = reg;
>> +
>> + bb->reloc[bb->reloc_idx].target_handle = bb->obj[0].handle;
>> + bb->reloc[bb->reloc_idx].presumed_offset = bb->obj[0].offset;
>> + bb->reloc[bb->reloc_idx].offset = bb->offset * sizeof(uint32_t);
>> + bb->reloc[bb->reloc_idx].delta = bb->reloc_idx * sizeof(uint32_t);
>> + bb->reloc[bb->reloc_idx].read_domains = I915_GEM_DOMAIN_RENDER;
>> + bb->reloc[bb->reloc_idx].write_domain = I915_GEM_DOMAIN_RENDER;
>> +
>> + bb->batch[bb->offset++] = bb->reloc[bb->reloc_idx].delta;
>> + if (gen8_plus)
>> + bb->batch[bb->offset++] = 0;
>
>Relocations are also not available for future products. Let's use softpin,
>it is simple to do and several examples for this are already merged.
Thanks for sharing the new way to do these things, I will look into it.
>
>> @@ -1015,16 +1450,40 @@ main(int argc, char *argv[])
>> corr_period_ns = corr_period * 1000000000ul;
>> poll_time_ns = corr_period_ns;
>>
>> + if (mmap_buffer) {
>> + ctx.zero_fd = open("/dev/zero", O_RDWR | O_CLOEXEC);
>
>Don't we need /dev/null rather than /dev/zero? Anyway looks unnecessarily
>complicated, just malloc a buffer and read repeatedly into it?
For this case, /dev/null should work too. I did not use malloc because I
wanted to avoid both the allocation and reading OA data into actual memory.
With the zero/null device there is no backing memory, so I thought it would
be faster to drain.
Thanks,
Umesh
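
For reference, the alternative suggested in the review above (read
repeatedly into an ordinary buffer and discard the data) could look roughly
like the sketch below. This is not code from the patch: drain_fd() and
SCRATCH_SIZE are hypothetical names, and the sketch assumes the fd is opened
non-blocking and that the scratch buffer is large enough to hold at least
one record returned by a single read().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical scratch size; assumed large enough for at least one record. */
#define SCRATCH_SIZE 4096

/*
 * Hypothetical helper: read and discard everything currently available on a
 * non-blocking fd. Returns the number of bytes discarded, or -1 on an
 * unexpected read() error (errno is left set by read()).
 */
static ssize_t drain_fd(int fd)
{
	static char scratch[SCRATCH_SIZE];
	ssize_t total = 0;

	assert(fd >= 0);

	for (;;) {
		ssize_t n = read(fd, scratch, sizeof(scratch));

		if (n > 0) {
			/* Data is overwritten on the next pass, never kept. */
			total += n;
			continue;
		}
		if (n == 0)
			break;		/* end of file */
		if (errno == EINTR)
			continue;	/* interrupted, retry */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			break;		/* nothing left to drain right now */
		return -1;
	}

	return total;
}
```

The buffer is static, so there is no allocation per call, but each read()
still copies the data into real memory, which is the cost the /dev/zero
approach was trying to avoid. A caller would set O_NONBLOCK on the fd (for
example with fcntl(fd, F_SETFL, O_NONBLOCK)) before calling drain_fd(fd).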