[igt-dev] [PATCH 4/5] tools/i915-perf: Add mmapped OA buffer support to i915-perf-recorder

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Tue Aug 24 19:14:06 UTC 2021


On Mon, Aug 23, 2021 at 06:05:05PM -0700, Dixit, Ashutosh wrote:
>On Tue, 03 Aug 2021 13:07:36 -0700, Umesh Nerlige Ramappa wrote:
>>
>> Currently, reports from the OA buffer are read from the perf_fd. The kernel
>> patches enable mmapping the OA buffer into user space to allow for faster
>> report queries across different platforms and engines.
>>
>> Enable the recorder tool to mmap the OA buffer when the -M command line
>> option is given.
>
>Not completely reviewed yet but some changes are needed, please see below.
>
>> +static int gem_set_domain(int fd, uint32_t handle, uint32_t read, uint32_t write)
>> +{
>> +	struct drm_i915_gem_set_domain set_domain = {
>> +		.handle = handle,
>> +		.read_domains = read,
>> +		.write_domain = write,
>> +	};
>> +	int ret = 0;
>> +
>> +	if (perf_ioctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &set_domain))
>
>set_domain is not available for discrete, see IGT gem_set_domain().
>
>> +static void *gem_mmap_cpu(int fd, uint32_t handle, uint64_t offset, uint64_t size,
>> +			  unsigned int prot)
>> +{
>> +	struct drm_i915_gem_mmap arg = {
>> +		.handle = handle,
>> +		.offset = offset,
>> +		.size = size,
>> +		.addr_ptr = 0,
>> +		.flags = 0,
>> +	};
>> +
>> +	if (perf_ioctl(fd, DRM_IOCTL_I915_GEM_MMAP, &arg))
>
>This needs to be changed to mmap_offset, DRM_IOCTL_I915_GEM_MMAP has been
>discontinued for future products.
>
>> +static void
>> +bb_emit_srm(struct bb_context *bb, uint32_t reg, uint32_t devid)
>> +{
>> +	bool gen8_plus = devid >= 8;
>> +
>> +	assert(bb->reloc_idx < ARRAY_SIZE(bb->reloc));
>> +	assert(bb->offset < BATCH_SIZE);
>> +
>> +	bb->batch[bb->offset++] = gen8_plus ? MI_STORE_REGISTER_MEM_GEN8 :
>> +					      MI_STORE_REGISTER_MEM;
>> +	bb->batch[bb->offset++] = reg;
>> +
>> +	bb->reloc[bb->reloc_idx].target_handle = bb->obj[0].handle;
>> +	bb->reloc[bb->reloc_idx].presumed_offset = bb->obj[0].offset;
>> +	bb->reloc[bb->reloc_idx].offset = bb->offset * sizeof(uint32_t);
>> +	bb->reloc[bb->reloc_idx].delta = bb->reloc_idx * sizeof(uint32_t);
>> +	bb->reloc[bb->reloc_idx].read_domains = I915_GEM_DOMAIN_RENDER;
>> +	bb->reloc[bb->reloc_idx].write_domain = I915_GEM_DOMAIN_RENDER;
>> +
>> +	bb->batch[bb->offset++] = bb->reloc[bb->reloc_idx].delta;
>> +	if (gen8_plus)
>> +		bb->batch[bb->offset++] = 0;
>
>Relocations are also not available for future products. Let's use softpin,
>it is simple to do and several examples for this are already merged.

Thanks for sharing the new way to do these things, I will look into it.
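
For reference, a minimal sketch of what the mmap_offset path mentioned in the review might look like. The helper name gem_mmap_offset_sketch and its flags parameter are hypothetical and not part of the patch. The sketch calls ioctl() directly instead of the patch's perf_ioctl() wrapper, and it assumes the caller passes a mapping type the platform accepts (for example I915_MMAP_OFFSET_WB on integrated parts).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/i915_drm.h>

/*
 * Hypothetical helper, not part of the patch: map a GEM object through the
 * mmap_offset interface. Returns NULL on any failure.
 */
static void *gem_mmap_offset_sketch(int fd, uint32_t handle, uint64_t size,
				    unsigned int prot, uint64_t flags)
{
	struct drm_i915_gem_mmap_offset arg = {
		.handle = handle,
		.flags = flags,	/* mapping type chosen by the caller */
	};
	void *ptr;

	assert(size > 0);

	/* Ask the kernel for the fake offset that represents this object. */
	if (ioctl(fd, DRM_IOCTL_I915_GEM_MMAP_OFFSET, &arg))
		return NULL;

	/* Map the object through the DRM fd at the returned offset. */
	ptr = mmap(NULL, size, prot, MAP_SHARED, fd, arg.offset);

	return ptr == MAP_FAILED ? NULL : ptr;
}
```

Unlike DRM_IOCTL_I915_GEM_MMAP, the ioctl here only returns an offset; the mapping itself is created by a regular mmap() on the DRM fd.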

>
>> @@ -1015,16 +1450,40 @@ main(int argc, char *argv[])
>>	corr_period_ns = corr_period * 1000000000ul;
>>	poll_time_ns = corr_period_ns;
>>
>> +	if (mmap_buffer) {
>> +		ctx.zero_fd = open("/dev/zero", O_RDWR | O_CLOEXEC);
>
>Don't we need /dev/null rather than /dev/zero? Anyway looks unnecessarily
>complicated, just malloc a buffer and read repeatedly into it?

For this case, /dev/null should work too. I did not use malloc because I
wanted to avoid both the allocation and copying the OA data into actual
memory. The zero/null devices have no backing memory, so I thought
draining into them would be faster.
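
For comparison, a minimal sketch of the malloc-based drain suggested in the review. The name drain_fd and the scratch_size parameter are hypothetical and not part of the patch. The sketch assumes the fd is either non-blocking or eventually reports EOF; on a blocking fd with no writer closing it, the loop would wait for more data.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: read and discard everything
 * available on fd, reusing a single scratch buffer. Returns the number of
 * bytes discarded, or -1 on allocation or read failure.
 */
static int64_t drain_fd(int fd, size_t scratch_size)
{
	uint8_t *scratch;
	int64_t total = 0;
	ssize_t n;

	assert(scratch_size > 0);

	scratch = malloc(scratch_size);
	if (!scratch)
		return -1;

	for (;;) {
		n = read(fd, scratch, scratch_size);
		if (n > 0) {
			total += n;	/* data is dropped, buffer is reused */
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;	/* interrupted, retry the read */
		break;
	}

	/* n == 0 is EOF; EAGAIN means a non-blocking fd has been emptied. */
	if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
		total = -1;

	free(scratch);
	return total;
}
```

The single allocation is made once per call and reused for every read, so the cost compared to a zero/null device target is mainly the copy into the scratch buffer.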

Thanks,
Umesh

