[Intel-gfx] [PATCH 06/11] drm/i915: Framework for capturing command stream based OA reports

sourab gupta sourab.gupta at intel.com
Fri Feb 19 06:51:13 UTC 2016


On Wed, 2016-02-17 at 23:00 +0530, Robert Bragg wrote:
> Hi Sourab,
> 
> 
> As Sergio Martinez has started experimenting with this in gputop and
> reported seeing lots of ENOSPC errors being reported when reading I
> had a look into this and saw a few issues with how we check that
> there's data available to read in command stream mode, and a I think
> there's a possibility of incorrectly sorting the samples sometimes...

Hi Robert,
Thanks for spotting this anomaly. I'll have this fixed in the next
version of patch set.
> 
> On Tue, Feb 16, 2016 at 5:27 AM, <sourab.gupta at intel.com> wrote:
>         From: Sourab Gupta <sourab.gupta at intel.com>
>         
>         
>         -static bool i915_oa_can_read(struct i915_perf_stream *stream)
>         +static bool append_oa_rcs_sample(struct i915_perf_stream
>         *stream,
>         +                                struct i915_perf_read_state
>         *read_state,
>         +                                struct i915_perf_cs_data_node
>         *node)
>         +{
>         +       struct drm_i915_private *dev_priv = stream->dev_priv;
>         +       struct oa_sample_data data = { 0 };
>         +       const u8 *report =
>         dev_priv->perf.command_stream_buf.addr +
>         +                               node->offset;
>         +       u32 sample_flags = stream->sample_flags;
>         +       u32 report_ts;
>         +
>         +       /*
>         +        * Forward the periodic OA samples which have the
>         timestamp lower
>         +        * than timestamp of this sample, before forwarding
>         this sample.
>         +        * This ensures samples read by user are order acc. to
>         their timestamps
>         +        */
>         +       report_ts = *(u32 *)(report + 4);
>         +       dev_priv->perf.oa.ops.read(stream, read_state,
>         report_ts);
>         +
>         +       if (sample_flags & SAMPLE_OA_SOURCE_INFO)
>         +               data.source = I915_PERF_OA_EVENT_SOURCE_RCS;
>         +
>         +       if (sample_flags & SAMPLE_CTX_ID)
>         +               data.ctx_id = node->ctx_id;
>         +
>         +       if (sample_flags & SAMPLE_OA_REPORT)
>         +               data.report = report;
>         +
>         +       append_oa_sample(stream, read_state, &data);
>         +
>         +       return true;
>         +}
>         +
>         +static void oa_rcs_append_reports(struct i915_perf_stream
>         *stream,
>         +                                 struct i915_perf_read_state
>         *read_state)
>         +{
>         +       struct drm_i915_private *dev_priv = stream->dev_priv;
>         +       struct i915_perf_cs_data_node *entry, *next;
>         +
>         +       list_for_each_entry_safe(entry, next,
>         +                                &dev_priv->perf.node_list,
>         link) {
>         +               if (!
>         i915_gem_request_completed(entry->request, true))
>         +                       break;
>         +
>         +               if (!append_oa_rcs_sample(stream, read_state,
>         entry))
>         +                       break;
>         +
>         +               spin_lock(&dev_priv->perf.node_list_lock);
>         +               list_del(&entry->link);
>         +               spin_unlock(&dev_priv->perf.node_list_lock);
>         +
>         +
>          i915_gem_request_unreference__unlocked(entry->request);
>         +               kfree(entry);
>         +       }
>         +
>         +       /* Flush any remaining periodic reports */
>         +       dev_priv->perf.oa.ops.read(stream, read_state,
>         U32_MAX);
>  
> I don't think we can flush all remaining periodic reports here - at
> least not if we have any in-flight MI_RPC commands - in case the next
> request to complete might have reports with earlier timestamps than
> some of these periodic reports.
> 
> 
> Even if we have periodic reports available I think we need to throttle
> forwarding them based on the command stream requests completing.
> 
> 
> This is something that userspace should understand when it explicitly
> decides to use command stream mode in conjunction with periodic
> sampling.
> 
I agree, there shouldn't be any flushing of remaining periodic reports
here, instead any periodic reports remaining here should be taken care
of during the next processing of command stream samples.
>  
>         +}
>         +
>         +static bool command_stream_buf_is_empty(struct
>         i915_perf_stream *stream)
>          {
>                 struct drm_i915_private *dev_priv = stream->dev_priv;
>         
>         -       return !
>         dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv);
>         +       if (stream->cs_mode)
>         +               return list_empty(&dev_priv->perf.node_list);
>         +       else
>         +               return true;
>          }
> 
> 
> I think this list_empty() check needs a lock around it, as it's called
> from stream_have_data__unlocked().
> 
> 
> Currently only checking list_empty() can lead to some false positives
> that may confuse the current i915_perf_read() (the false positives
> come from not checking that at least one entry->request has also
> completed).
> 
> False positives here will affect the error reporting in
> i915_perf_read_locked(). The first thing i915_perf_read_locked() does
> is check for available data so it can return -EAGAIN for non-blocking
> file descriptors or block, waiting for data, for blocking fds. Once we
> believe there is data then if stream->read() returns zero then that's
> translated to a -ENOSPC error because we 'know' there's data but
> weren't able to copy anything to the user's buffer because it wasn't
> large enough to hold a complete record.
> 
Thanks for pointing this out. This function should check that at least
one request has completed when the list is not empty. I'll aim to make
this change.
>  
>         
>         -static int i915_oa_wait_unlocked(struct i915_perf_stream
>         *stream)
>         +static bool stream_have_data__unlocked(struct
>         i915_perf_stream *stream)
>          {
>                 struct drm_i915_private *dev_priv = stream->dev_priv;
>         
>         @@ -397,8 +739,29 @@ static int i915_oa_wait_unlocked(struct
>         i915_perf_stream *stream)
>                  * can't be destroyed until completion (such as a
>         read()) that ensures
>                  * the device + OA buffer can't disappear
>                  */
>         +
>          return !(dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv)
>         &&
>         +                command_stream_buf_is_empty(stream));
>         +}
> 
> 
> In command stream mode, due to how we need to merge sort reports I
> think we need a more involved set of checks. We need to check that at
> least one entry->request in the node_list has completed, or else if
> the node_list is empty then we can check if !oa_buffer_is_empty().
> 
> 
> There's going to be a thorny race to consider though: if this code
> ends up reporting that there is data to read because the node_list is
> empty and there are periodic samples, but by the time we call into
> stream->read() there's a new node_list entry that's not completed that
> will end up returning 0 which could again result in an incorrect
> ENOSPC error.
> 
> We should consider changing stream->read() so it can return -ENOSPC
> itself if appropriate and when it returns 0 we would instead either
> return -EAGAIN or loop back to waiting for blocking fds.

Yeah, this race needs to be handled, ideally best way would be through
the changes to stream->read() itself as you suggested. I'll aim to make
this change (while rebasing on your recent copy_to_user changes).

> B.t.w for reference the stream->read() interface is a little different
> in the series I sent out recently for Haswell vs the gen8+ oa-next
> branch (which I haven't rebased yet) because I had to make some last
> minute changes to consider -EFAULT errors if there was a
> copy_to_user() error while reading. If you look at this issue, it
> could be good to also compare with these related changes I made
> recently.
>  
>         +
>         +static bool i915_oa_can_read(struct i915_perf_stream *stream)
>         +{
>         +
>         +       return stream_have_data__unlocked(stream);
>         +}
>         
> 
> 
> Kind regards,
> 
> - Robert
> 




More information about the Intel-gfx mailing list