[Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query
Umesh Nerlige Ramappa
umesh.nerlige.ramappa at intel.com
Fri Jul 24 16:29:56 UTC 2020
On Fri, Jul 24, 2020 at 01:42:33PM +0100, Chris Wilson wrote:
>Quoting Umesh Nerlige Ramappa (2020-07-24 01:19:01)
>> From: Piotr Maciejewski <piotr.maciejewski at intel.com>
>>
>> i915 used to support time based sampling mode which is good for overall
>> system monitoring, but is not enough for query mode used to measure a
>> single draw call or dispatch. Gen9-Gen11 are using current i915 perf
>> implementation for query, but Gen12+ requires a new approach for query
>> based on triggered reports within oa buffer.
>>
>> Triggering reports into the OA buffer is achieved by writing into a
>> a trigger register. Optionally an unused counter/register is set with a
>> marker value such that a triggered report can be identified in the OA
>> buffer. Reports are usually triggered at the start and end of work that
>> is measured.
>>
>> Since OA buffer is large and queries can be frequent, an efficient way
>> to look for triggered reports is required. By knowing the current head
>> and tail offsets into the OA buffer, it is easier to determine the
>> locality of the reports of interest.
>>
>> Current perf OA interface does not expose head/tail information to the
>> user and it filters out invalid reports before sending data to user.
>> Also considering limited size of user buffer used during a query,
>> creating a 1:1 copy of the OA buffer at the user space added undesired
>> complexity.
>>
>> The solution was to map the OA buffer to user space provided
>>
>> (1) that it is accessed from a privileged user.
>> (2) OA report filtering is not used.
>>
>> These 2 conditions would satisfy the safety criteria that the current
>> perf interface addresses.
>>
>> To enable the query:
>> - Add an ioctl to expose head and tail to the user
>> - Add an ioctl to return size and offset of the OA buffer
>> - Map the OA buffer to the user space
>>
>> v2:
>> - Improve commit message (Chris)
>> - Do not mmap based on gem object filp. Instead, use perf_fd and support
>> mmap syscall (Chris)
>> - Pass non-zero offset in mmap to enforce the right object is
>> mapped (Chris)
>> - Do not expose gpu_address (Chris)
>> - Verify start and length of vma for page alignment (Lionel)
>> - Move SQNTL config out (Lionel)
>>
>> v3: (Chris)
>> - Omit redundant checks
>> - Return VM_FAULT_SIGBUS is old stream is closed
>> - Maintain reference counts to stream in vm_open and vm_close
>> - Use switch to identify object to be mapped
>>
>> v4: Call kref_put on closing perf fd (Chris)
>> v5:
>> - Strip access to OA buffer from unprivileged child of a privileged
>> parent. Use VM_DONTCOPY
>> - Enforce MAP_PRIVATE by checking for VM_MAYSHARE
>>
>> Signed-off-by: Piotr Maciejewski <piotr.maciejewski at intel.com>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>> ---
>> @@ -3314,12 +3427,113 @@ static int i915_perf_release(struct inode *inode, struct file *file)
>> i915_perf_destroy_locked(stream);
>> mutex_unlock(&perf->lock);
>>
>> + unmap_mapping_range(file->f_mapping, 0, OA_BUFFER_SIZE, 1);
>
>You can just used unmap_mapping_range(file->f_mapping, 0, -1, 1);
>It scales with the number of vma present, so no worries, be conservative.
>(Otherwise, you need s/0/OA_BUFFER_OFFSET/.)
>
>> +
>> /* Release the reference the perf stream kept on the driver. */
>> drm_dev_put(&perf->i915->drm);
>>
>> return 0;
>> }
>>
>> +static void vm_open_oa(struct vm_area_struct *vma)
>> +{
>> + struct i915_perf_stream *stream = vma->vm_private_data;
>> +
>> + GEM_BUG_ON(!stream);
>> + perf_stream_get(stream);
>> +}
>> +
>> +static void vm_close_oa(struct vm_area_struct *vma)
>> +{
>> + struct i915_perf_stream *stream = vma->vm_private_data;
>> +
>> + GEM_BUG_ON(!stream);
>> + perf_stream_put(stream);
>> +}
>> +
>> +static vm_fault_t vm_fault_oa(struct vm_fault *vmf)
>> +{
>> + struct vm_area_struct *vma = vmf->vma;
>> + struct i915_perf_stream *stream = vma->vm_private_data;
>> + struct i915_perf *perf = stream->perf;
>> + struct drm_i915_gem_object *obj = stream->oa_buffer.vma->obj;
>> + int err;
>> + bool closed;
>
>So vm_area_struct has a reference to the stream, that looks good now.
>But there's no reference held to the vma itself.
How do I get a reference to the vma.
>
>> + mutex_lock(&perf->lock);
>> + closed = READ_ONCE(stream->closed);
>> + mutex_unlock(&perf->lock);
>
>We do WRITE_ONCE(stream->closed, true) then invalidate all the mappings,
>so that part looks good. The invalidate is serialised with the
>vm_fault_oa, so we can just use a plain READ_ONCE(stream->closed) here
>and not worry about the perf->lock.
will do
>
>However... I think it should close&invalidate before releasing
>stream->oa_buffer.
will do
>
>And the read here of stream->oa_buffer should be after checking
>stream->closed.
I don't understand. I am checking for closed before remap_io_sg.
Thanks,
Umesh
>
>> +
>> + if (closed)
>> + return VM_FAULT_SIGBUS;
>> +
>> + err = i915_gem_object_pin_pages(obj);
>> + if (err)
>> + goto out;
>> +
>> + err = remap_io_sg(vma,
>> + vma->vm_start, vma->vm_end - vma->vm_start,
>> + obj->mm.pages->sgl, -1);
>> +
>> + i915_gem_object_unpin_pages(obj);
>
>I'd be really tempted here not to use stream->oa_buffer.vma->obj at all.
>We know the oa_buffer is pinned while it is open (and we know we will be
>serialised against close/destroy), so this can be reduced to just
>
> err = remap_io_sg(vma,
> vma->vm_start, vma->vm_end - vma->vm_start,
> stream->oa_buffer.vma->pages, -1);
>
>> +
>> +out:
>> + return i915_error_to_vmf_fault(err);
>> +}
>> +
>> +static const struct vm_operations_struct vm_ops_oa = {
>> + .open = vm_open_oa,
>> + .close = vm_close_oa,
>> + .fault = vm_fault_oa,
>> +};
>> +
>> +int i915_perf_mmap(struct file *file, struct vm_area_struct *vma)
>> +{
>> + struct i915_perf_stream *stream = file->private_data;
>> +
>> + /* mmap-ing OA buffer to user space MUST absolutely be privileged */
>> + if (i915_perf_stream_paranoid && !perfmon_capable()) {
>> + DRM_DEBUG("Insufficient privileges to map OA buffer\n");
>> + return -EACCES;
>> + }
>> +
>> + switch (vma->vm_pgoff) {
>> + /* A non-zero offset ensures that we are mapping the right object. Also
>> + * leaves room for future objects added to this implementation.
>> + */
>
>/*
> * Kernel block comment style.
> */
>
>(There's a few subsystems, i.e net/ that use the other style, but we've
>switch over to the more common open style and try to stick to it.)
>
>> + case I915_PERF_OA_BUFFER_MMAP_OFFSET:
>> + if (vma->vm_end - vma->vm_start > OA_BUFFER_SIZE)
>> + return -EINVAL;
>> +
>> + /* Only support VM_READ. Enforce MAP_PRIVATE by checking for
>> + * VM_MAYSHARE.
>> + */
>> + if (vma->vm_flags & (VM_WRITE | VM_EXEC |
>> + VM_SHARED | VM_MAYSHARE))
>> + return -EINVAL;
>> +
>> + vma->vm_flags &= ~(VM_MAYWRITE | VM_MAYEXEC);
>> +
>> + /* If the privileged parent forks and child drops root
>> + * privilege, we do not want the child to retain access to the
>> + * mapped OA buffer. Explicitly set VM_DONTCOPY to avoid such
>> + * cases.
>> + */
>
>The explanations are extremely valuable :)
>
>> + vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND |
>> + VM_DONTDUMP | VM_DONTCOPY;
>> + break;
>> +
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>> + vma->vm_private_data = stream;
>> + vma->vm_ops = &vm_ops_oa;
>> + vm_open_oa(vma);
>> +
>> + return 0;
>> +}
More information about the Intel-gfx
mailing list