[Intel-gfx] [PATCH 1/3] drm/i915/perf: Reduce cpu overhead for blocking perf OA reads
Lionel Landwerlin
lionel.g.landwerlin at intel.com
Wed Apr 15 19:16:59 UTC 2020
On 15/04/2020 21:55, Umesh Nerlige Ramappa wrote:
> On Wed, Apr 15, 2020 at 01:00:30PM +0300, Lionel Landwerlin wrote:
>> On 13/04/2020 18:48, Umesh Nerlige Ramappa wrote:
>>> A condition in wait_event_interruptible seems to be checked twice
>>> before
>>> waiting on the event to occur. These checks are redundant when hrtimer
>>> events will call oa_buffer_check_unlocked to update the oa_buffer tail
>>> pointers. The redundant checks add cpu overhead. Simplify the check
>>> to reduce cpu overhead when using blocking io to read oa buffer
>>> reports.
>>>
>>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>
>>
>> I cherry picked this patch alone and it seems to break the
>> disabled-read-error test.
>
> Strange. I don't see it fail on my CFL. I am apply this on the latest
> drm-tip from yesterday.
>
> The patch still checks if reports are available before blocking. The
> behavior should still be the same w.r.t this test.
>
> What machine did you run it on? I will try on the same. Any chance you
> have the debug output from the test?
>
> Thanks,
> Umesh
>
I got that on SKL GT4 : http://paste.debian.net/1140604/
Thanks,
-Lionel
>>
>>
>>> ---
>>> drivers/gpu/drm/i915/i915_perf.c | 28 +++++++++++++++++++++++++++-
>>> 1 file changed, 27 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c
>>> b/drivers/gpu/drm/i915/i915_perf.c
>>> index 5cde3e4e7be6..e28a3ab83fde 100644
>>> --- a/drivers/gpu/drm/i915/i915_perf.c
>>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>>> @@ -541,6 +541,32 @@ static bool oa_buffer_check_unlocked(struct
>>> i915_perf_stream *stream)
>>> return pollin;
>>> }
>>> +/**
>>> + * oa_buffer_check_reports - quick check if reports are available
>>> + * @stream: i915 stream instance
>>> + *
>>> + * The return from this function is used as a condition for
>>> + * wait_event_interruptible in blocking read. This is used to detect
>>> + * available reports.
>>> + *
>>> + * A condition in wait_event_interruptible seems to be checked
>>> twice before
>>> + * waiting on an event to occur. These checks are redundant when
>>> hrtimer events
>>> + * will call oa_buffer_check_unlocked to update the oa_buffer tail
>>> pointers. The
>>> + * redundant checks add cpu overhead. We simplify the check to
>>> reduce cpu
>>> + * overhead.
>>> + */
>>> +static bool oa_buffer_check_reports(struct i915_perf_stream *stream)
>>> +{
>>> + unsigned long flags;
>>> + bool available;
>>> +
>>> + spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>>> + available = stream->oa_buffer.tail != stream->oa_buffer.head;
>>> + spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>>> +
>>> + return available;
>>> +}
>>> +
>>> /**
>>> * append_oa_status - Appends a status record to a userspace read()
>>> buffer.
>>> * @stream: An i915-perf stream opened for OA metrics
>>> @@ -1150,7 +1176,7 @@ static int i915_oa_wait_unlocked(struct
>>> i915_perf_stream *stream)
>>> return -EIO;
>>> return wait_event_interruptible(stream->poll_wq,
>>> - oa_buffer_check_unlocked(stream));
>>> + oa_buffer_check_reports(stream));
>>> }
>>> /**
>>
>>
More information about the Intel-gfx
mailing list