[Intel-gfx] [PATCH 7/7] drm/i915/perf: add flushing ioctl
Lionel Landwerlin
lionel.g.landwerlin at intel.com
Wed Mar 11 16:35:54 UTC 2020
On 11/03/2020 05:05, Dixit, Ashutosh wrote:
> On Tue, 10 Mar 2020 13:44:30 -0700, Lionel Landwerlin wrote:
>> On 09/03/2020 21:51, Umesh Nerlige Ramappa wrote:
>>> On Wed, Mar 04, 2020 at 09:56:28PM -0800, Dixit, Ashutosh wrote:
>>>> On Wed, 04 Mar 2020 00:52:34 -0800, Lionel Landwerlin wrote:
>>>>> On 04/03/2020 07:48, Dixit, Ashutosh wrote:
>>>>>> On Tue, 03 Mar 2020 14:19:05 -0800, Umesh Nerlige Ramappa wrote:
>>>>>>> From: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>>>>>>
>>>>>>> With the currently available parameters for the i915-perf stream,
>>>>>>> there are still situations that are not well covered :
>>>>>>>
>>>>>>> If an application opens the stream with polling disable or at very
>>>>> low
>>>>>>> frequency and OA interrupt enabled, no data will be available even
>>>>>>> though somewhere between nothing and half of the OA buffer worth of
>>>>>>> data might have landed in memory.
>>>>>>>
>>>>>>> To solve this issue we have a new flush ioctl on the perf stream
>>>>> that
>>>>>>> forces the i915-perf driver to look at the state of the buffer when
>>>>>>> called and makes any data available through both poll() & read()
>>>>> type
>>>>>>> syscalls.
>>>>>>>
>>>>>>> v2: Version the ioctl (Joonas)
>>>>>>> v3: Rebase (Umesh)
>>>>>>>
>>>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>>>>>> Signed-off-by: Umesh Nerlige Ramappa
>>>>> <umesh.nerlige.ramappa at intel.com>
>>>>>> [snip]
>>>>>>
>>>>>>> +/**
>>>>>>> + * i915_perf_flush_data - handle `I915_PERF_IOCTL_FLUSH_DATA` ioctl
>>>>>>> + * @stream: An enabled i915 perf stream
>>>>>>> + *
>>>>>>> + * The intention is to flush all the data available for reading
>>>>> from the OA
>>>>>>> + * buffer
>>>>>>> + */
>>>>>>> +static void i915_perf_flush_data(struct i915_perf_stream *stream)
>>>>>>> +{
>>>>>>> + stream->pollin = oa_buffer_check(stream, true);
>>>>>>> +}
>>>>>> Since this function doesn't actually wake up any thread (which anyway
>>>>> can
>>>>>> be done by sending a signal to the blocked thread), is the only
>>>>> purpose of
>>>>>> this function to update OA buffer head/tail? But in that it is not
>>>>> clear
>>>>>> why a separate ioctl should be created for this, can't the read()
>>>>> call
>>>>>> itself call oa_buffer_check() to update the OA buffer head/tail?
>>>>>>
>>>>>> Again just trying to minimize uapi changes if possible.
>>>>> Most applications will call read() after being notified by
>>>>> poll()/select()
>>>>> that some data is available.
>>>> Correct this is the standard non blocking read behavior.
>>>>
>>>>> Changing that behavior will break some of the existing perf tests .
>>>> I am not suggesting changing that (that standard non blocking read
>>>> behavior).
>>>>
>>>>> If any data is available, this new ioctl will wake up existing waiters
>>>>> on
>>>>> poll()/select().
>>>> The issue is we are not calling wake_up() in the above function to wake
>>>> up
>>>> any blocked waiters. The ioctl will just update the OA buffer head/tail
>>>> so
>>>> that (a) a subsequent blocking read will not block, or (b) a subsequent
>>>> non
>>>> blocking read will return valid data (not -EAGAIN), or (c) a poll/select
>>>> will not block but return immediately saying data is available.
>>>>
>>>> That is why it seems to me the ioctl is not required, updating the OA
>>>> buffer head/tail can be done as part of the read() (and the poll/select)
>>>> calls themselves.
>>>>
>>>> We will investigate if this can be done and update the patches in the
>>>> next
>>>> revision accordingly. Thanks!
>>> In this case, where we are trying to determine if there is any data in
>>> the oa buffer before the next interrupt has fired, user could call poll
>>> with a reasonable timeout to determine if data is available or not. That
>>> would eliminate the need for the flush ioctl. Thoughts?
>>>
>>> Thanks,
>>> Umesh
>>
>> I almost forgot why this would cause problem.
>>
>> Checking the state of the buffer every time you call poll() will pretty
>> much guarantee you have at least one report to read every time.
>>
>> So that would lead to lot more wakeups :(
>>
>> The whole system has to stay "unidirectional" with either interrupts or
>> timeout driving the wakeups.
>>
>> This additional ioctl is the only solution I could find to add one more
>> input to the wakeup mechanism.
> Well, aren't we asking the app to sleep for time T and then call flush
> (followed by read)? Then we might as well ask them to sleep for time T and
> call poll? Or we can ask them set the hrtimer to T, skip the sleep and call
> poll (followed by read)? Aren't these 3 mechanisms equivalent? To me the
> last option seems to be the cleanest. Thanks!
I guess one thing that could work is to call oa_buffer_check() at the
top of i915_oa_read().
Maybe you proposed this in a previous email.
That way poll() doesn't trigger too many wake ups and right before
closing the FD will pull the remaining data.
One side effect of this is that read() might almost always return
something because the HW is faster to write the OA buffer than the app
can call read().
Still I think it still works out.
What do you think?
-Lionel
More information about the Intel-gfx
mailing list