[Intel-gfx] [PATCH 5/7] drm/i915: handle interrupts from the OA unit
Lionel Landwerlin
lionel.g.landwerlin at intel.com
Thu Jan 17 11:43:31 UTC 2019
On 16/01/2019 18:04, Lionel Landwerlin wrote:
> On 16/01/2019 16:31, Chris Wilson wrote:
>> Quoting Lionel Landwerlin (2019-01-16 16:25:26)
>>> On 16/01/2019 16:05, Chris Wilson wrote:
>>>> Quoting Lionel Landwerlin (2019-01-16 15:58:00)
>>>>> On 16/01/2019 15:52, Chris Wilson wrote:
>>>>>> Quoting Lionel Landwerlin (2019-01-16 15:36:20)
>>>>>>> @@ -1877,6 +1883,21 @@ struct drm_i915_private {
>>>>>>> wait_queue_head_t poll_wq;
>>>>>>> bool pollin;
>>>>>>> + /**
>>>>>>> + * Atomic counter incremented by the
>>>>>>> interrupt
>>>>>>> + * handling code for each OA half full
>>>>>>> interrupt
>>>>>>> + * received.
>>>>>>> + */
>>>>>>> + atomic64_t half_full_count;
>>>>>>> +
>>>>>>> + /**
>>>>>>> + * Copy of the atomic half_full_count
>>>>>>> that was last
>>>>>>> + * processed in the i915-perf driver. If
>>>>>>> both counters
>>>>>>> + * differ, there is data available to
>>>>>>> read in the OA
>>>>>>> + * buffer.
>>>>>>> + */
>>>>>>> + u64 half_full_count_last;
>>>>>> Eh? But why a relatively expensive atomic64. You only need one
>>>>>> bit, and
>>>>>> reading the tail pointer from WB memory should just be cheap. You
>>>>>> should
>>>>>> be able to sample the perf ringbuffer pointers very cheaply...
>>>>>> What am I
>>>>>> missing?
>>>>>> -Chris
>>>>>>
>>>>> Fair comment.
>>>>>
>>>>> The thing is with this series there are 2 mechanism that notify
>>>>> the poll_wq.
>>>>>
>>>>> One is the hrtimer that kicks in at regular interval and polls the
>>>>> register with the workaround.
>>>>>
>>>>> The other is the interrupt which doesn't read the registers and
>>>>> workaround.
>>>> What's the complication with the workaround?
>>>
>>> It's a bit more than just looking at registers, we actually have to
>>> look
>>> at the content of the buffer to figure out what landed in memory.
>>>
>>> The register values are not synchronized with the memory writes...
>> I don't want to look at registers at all for polling, and you shouldn't
>> need to since communication is via a ringbuf.
>>> There is a comment in the code (i915_perf_poll_locked) about not
>>> checking the register after each wakeup because that may be a very
>>> hot path.
>>>
>>> The atomic64 sounded like a lesser evil.
>> I'm clearly not understanding something here...
>>
>> Does the hardware not do:
>> update ringbuf data;
>> wmb() (or post to global observation point in their parlance)
>> update ringbuf tail
>
>
> As far as I understand, the OA unit :
>
> sends its memory write requests to the memory controller
>
> immediately updates the ringbuf tail, without waiting for the
> previous requests to complete
>
By experimentation, I've haven't seen a delta between what is available
in memory and what the OA tail register indicate larger than 768 bytes,
which is about 3 OA reports at the largest size.
There is probably a maximum number of write requests the OA unit can
queue before blocking.
So again maybe you would prefer a 2 stage mechanism :
OA interrupt -----> head/tail pointer worker -----> wake up userspace
hrtimer head/tail pointer --|
>
>
>>
>> Then we just need to sample the ringbuf tail and compare against how far
>> we read last time?
>> -Chris
>>
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20190117/06b456b1/attachment.html>
More information about the Intel-gfx
mailing list