[Intel-gfx] [PATCH 1/3] drm/i915/perf: Reduce cpu overhead for blocking perf OA reads

Lionel Landwerlin lionel.g.landwerlin at intel.com
Wed Apr 15 19:16:59 UTC 2020


On 15/04/2020 21:55, Umesh Nerlige Ramappa wrote:
> On Wed, Apr 15, 2020 at 01:00:30PM +0300, Lionel Landwerlin wrote:
>> On 13/04/2020 18:48, Umesh Nerlige Ramappa wrote:
>>> A condition in wait_event_interruptible seems to be checked twice 
>>> before
>>> waiting on the event to occur. These checks are redundant when hrtimer
>>> events will call oa_buffer_check_unlocked to update the oa_buffer tail
>>> pointers. The redundant checks add cpu overhead. Simplify the check
>>> to reduce cpu overhead when using blocking io to read oa buffer 
>>> reports.
>>>
>>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>
>>
>> I cherry picked this patch alone and it seems to break the 
>> disabled-read-error test.
>
> Strange. I don't see it fail on my CFL. I am apply this on the latest 
> drm-tip from yesterday.
>
> The patch still checks if reports are available before blocking. The 
> behavior should still be the same w.r.t this test.
>
> What machine did you run it on? I will try on the same. Any chance you 
> have the debug output from the test?
>
> Thanks,
> Umesh
>

I got that on SKL GT4 : http://paste.debian.net/1140604/


Thanks,


-Lionel


>>
>>
>>> ---
>>>  drivers/gpu/drm/i915/i915_perf.c | 28 +++++++++++++++++++++++++++-
>>>  1 file changed, 27 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c 
>>> b/drivers/gpu/drm/i915/i915_perf.c
>>> index 5cde3e4e7be6..e28a3ab83fde 100644
>>> --- a/drivers/gpu/drm/i915/i915_perf.c
>>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>>> @@ -541,6 +541,32 @@ static bool oa_buffer_check_unlocked(struct 
>>> i915_perf_stream *stream)
>>>      return pollin;
>>>  }
>>> +/**
>>> + * oa_buffer_check_reports - quick check if reports are available
>>> + * @stream: i915 stream instance
>>> + *
>>> + * The return from this function is used as a condition for
>>> + * wait_event_interruptible in blocking read. This is used to detect
>>> + * available reports.
>>> + *
>>> + * A condition in wait_event_interruptible seems to be checked 
>>> twice before
>>> + * waiting on an event to occur. These checks are redundant when 
>>> hrtimer events
>>> + * will call oa_buffer_check_unlocked to update the oa_buffer tail 
>>> pointers. The
>>> + * redundant checks add cpu overhead. We simplify the check to 
>>> reduce cpu
>>> + * overhead.
>>> + */
>>> +static bool oa_buffer_check_reports(struct i915_perf_stream *stream)
>>> +{
>>> +    unsigned long flags;
>>> +    bool available;
>>> +
>>> +    spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>>> +    available = stream->oa_buffer.tail != stream->oa_buffer.head;
>>> + spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>>> +
>>> +    return available;
>>> +}
>>> +
>>>  /**
>>>   * append_oa_status - Appends a status record to a userspace read() 
>>> buffer.
>>>   * @stream: An i915-perf stream opened for OA metrics
>>> @@ -1150,7 +1176,7 @@ static int i915_oa_wait_unlocked(struct 
>>> i915_perf_stream *stream)
>>>          return -EIO;
>>>      return wait_event_interruptible(stream->poll_wq,
>>> -                    oa_buffer_check_unlocked(stream));
>>> +                    oa_buffer_check_reports(stream));
>>>  }
>>>  /**
>>
>>



More information about the Intel-gfx mailing list