[Intel-gfx] [PATCH 1/3] drm/i915/perf: Reduce cpu overhead for blocking perf OA reads

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Wed Apr 15 18:55:02 UTC 2020


On Wed, Apr 15, 2020 at 01:00:30PM +0300, Lionel Landwerlin wrote:
>On 13/04/2020 18:48, Umesh Nerlige Ramappa wrote:
>>A condition in wait_event_interruptible seems to be checked twice before
>>waiting on the event to occur. These checks are redundant when hrtimer
>>events will call oa_buffer_check_unlocked to update the oa_buffer tail
>>pointers. The redundant checks add cpu overhead. Simplify the check
>>to reduce cpu overhead when using blocking io to read oa buffer reports.
>>
>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>
>
>I cherry picked this patch alone and it seems to break the 
>disabled-read-error test.

Strange. I don't see it fail on my CFL. I am apply this on the latest 
drm-tip from yesterday.

The patch still checks if reports are available before blocking.  The 
behavior should still be the same w.r.t this test.

What machine did you run it on? I will try on the same. Any chance you 
have the debug output from the test?

Thanks,
Umesh

>
>
>>---
>>  drivers/gpu/drm/i915/i915_perf.c | 28 +++++++++++++++++++++++++++-
>>  1 file changed, 27 insertions(+), 1 deletion(-)
>>
>>diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>>index 5cde3e4e7be6..e28a3ab83fde 100644
>>--- a/drivers/gpu/drm/i915/i915_perf.c
>>+++ b/drivers/gpu/drm/i915/i915_perf.c
>>@@ -541,6 +541,32 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>  	return pollin;
>>  }
>>+/**
>>+ * oa_buffer_check_reports - quick check if reports are available
>>+ * @stream: i915 stream instance
>>+ *
>>+ * The return from this function is used as a condition for
>>+ * wait_event_interruptible in blocking read. This is used to detect
>>+ * available reports.
>>+ *
>>+ * A condition in wait_event_interruptible seems to be checked twice before
>>+ * waiting on an event to occur. These checks are redundant when hrtimer events
>>+ * will call oa_buffer_check_unlocked to update the oa_buffer tail pointers. The
>>+ * redundant checks add cpu overhead. We simplify the check to reduce cpu
>>+ * overhead.
>>+ */
>>+static bool oa_buffer_check_reports(struct i915_perf_stream *stream)
>>+{
>>+	unsigned long flags;
>>+	bool available;
>>+
>>+	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>>+	available = stream->oa_buffer.tail != stream->oa_buffer.head;
>>+	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
>>+
>>+	return available;
>>+}
>>+
>>  /**
>>   * append_oa_status - Appends a status record to a userspace read() buffer.
>>   * @stream: An i915-perf stream opened for OA metrics
>>@@ -1150,7 +1176,7 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>>  		return -EIO;
>>  	return wait_event_interruptible(stream->poll_wq,
>>-					oa_buffer_check_unlocked(stream));
>>+					oa_buffer_check_reports(stream));
>>  }
>>  /**
>
>


More information about the Intel-gfx mailing list