[Intel-gfx] [PATCH 11/15] drm/i915: Implementation of GuC client

Dave Gordon david.s.gordon at intel.com
Fri Jun 19 10:55:09 PDT 2015


On 15/06/15 22:55, Chris Wilson wrote:
> On Mon, Jun 15, 2015 at 07:36:29PM +0100, Dave Gordon wrote:
>> +/* Get valid workqueue item and return it back to offset */
>> +static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>> +{
>> +	struct guc_process_desc *desc;
>> +	void *base;
>> +	u32 size = sizeof(struct guc_wq_item);
>> +	int ret = 0, timeout_counter = 200;
>> +	unsigned long flags;
>> +
>> +	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>> +	desc = base + gc->proc_desc_offset;
>> +
>> +	while (timeout_counter-- > 0) {
>> +		spin_lock_irqsave(&gc->wq_lock, flags);
>> +
>> +		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
>> +				gc->wq_size) >= size, 1);
> 
> What is the point of this loop? Drop the spinlock 200 times? You already
> have a timeout, the loop extends that by a factor of 200. You merely
> allow gazumping, however I haven't looked at the locking to see what
> you intend to lock (since it is not described at all).
> -Chris

Hmmm .. I didn't write this code, so I'm guessing somewhat; but ...

A 'wq_lock' must lock a 'wq', which from the name of the function is a
workqueue: a circular buffer shared between the host and the GuC in
which, as with the main ringbuffers, the host (producer) advances the
tail (gc->wq_tail) and the other party (the consumer, in this case the
GuC) advances the head (desc->head).
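
For reference, CIRC_SPACE() comes from <linux/circ_buf.h>; with the
producer index passed as 'head' and the consumer index as 'tail', the
test in the patch reduces to something like this (my paraphrase, and it
assumes wq_size is a power of two, which CIRC_SPACE requires):

    /*
     * Paraphrase of CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size):
     * bytes the host (producer) may still write before catching up
     * with the GuC's (consumer's) head.  One slot is always left
     * free, and wq_size must be a power of two.
     */
    static u32 wq_free_space(u32 wq_tail, u32 wq_head, u32 wq_size)
    {
            return (wq_head - wq_tail - 1) & (wq_size - 1);
    }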

Presumably the GuC could take many (up to 200) ms to get round to making
space available, in a worst-case scenario where it's busy servicing
interrupts and doing other things.

Now we certainly don't want to spin for up to 200ms with interrupts
disabled, so

    spin_lock_irqsave(&gc->wq_lock, flags);
    ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
                                     gc->wq_size) >= size, *200*);
    spin_unlock_irqrestore(&gc->wq_lock, flags);

would be a bad idea. OTOH I don't think there's any other lock held by
anyone higher up in the callchain, so we /probably do/ need the spinlock
to protect the updating of wq_tail when the wait_for_atomic succeeds.

So yes, I think up to 200 iterations of polling for free space, for up
to 1ms each time, is not unreasonable, given that apparently we have to
poll, at least for now (once the scheduler lands, we will always be able
to predict how much space is available and avoid trying to launch
batches when there isn't enough).
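
For the avoidance of doubt, the shape being defended is roughly this
(a sketch only; the success path that claims the space by advancing
gc->wq_tail is my guess at the part of the patch elided above):

    while (timeout_counter-- > 0) {
            spin_lock_irqsave(&gc->wq_lock, flags);

            /* poll for space for at most 1ms with the lock held */
            ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
                            gc->wq_size) >= size, 1);
            if (ret == 0) {
                    /* space found: claim it under wq_lock so that a
                     * concurrent submitter can't take the same slot */
                    *offset = gc->wq_tail;
                    gc->wq_tail += size;
                    gc->wq_tail &= gc->wq_size - 1;
            }

            spin_unlock_irqrestore(&gc->wq_lock, flags);

            if (ret == 0)
                    break;  /* got our workqueue item */

            /* no space yet; lock dropped, GuC gets another chance */
    }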

.Dave.

