[Intel-xe] [RFC 03/25] drm/xe/eudebug: Introduce exec_queue events

Andi Shyti andi.shyti at linux.intel.com
Thu Nov 23 13:21:07 UTC 2023


Hi Dominik,

> > > +		if (h < 0)
> > > +			return h;
> > > +
> > > +		h_lrc[i] = h;
> > > +	}
> > > +
> > > +	/* No need to cleanup for added handles on error as if we fail
> > > +	 * we disconnect
> > > +	 */
> > 
> > this comment is misplaced and not properly formatted.
> > 
> > Anyway, for -EEXIST and -ENOTCONN, you don't really fail, I think
> > those need to be checked and treated separately. If -EEXIST, I
> > believe the correct behavior should be 'continue', rather than
> > 'return'.
> 
> Isn't that already handled in xe_eudebug_exec_queue_create? In fact, 
> if resource exists we do continue, we just do not send an event twice.

no, you return. if (h == -EEXIST) return -EEXIST;

-EEXIST is treated at the same way as -ENOTCONN, which most
probably means that the queue is full.

> You actually made me think if we shouldn't be more radical and disconnect if sth 
> already exists in our resource tracking. 

nah... I don't think so. If things go wrong let them fail on
their own, don't kill by speculation.

> > > +	return send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_CREATE,
> > > +				     h_c, h_vm, h_queue, (u16)q->class,
> > > +				     q->width, h_lrc);
> > > +}
> > > +
> > > +static int exec_queue_destroy_event(struct xe_eudebug *d,
> > > +				    struct xe_file *xef, struct xe_exec_queue *q)
> > > +{
> > > +	int h_c, h_vm, h_queue;
> > > +	u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE];
> > > +	int i;
> > > +
> > > +	h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef);
> > > +	if (h_c < 0)
> > > +		return h_c;
> > > +
> > > +	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm);
> > > +	if (h_vm < 0)
> > > +		return h_vm;
> > > +
> > > +	h_queue = xe_eudebug_remove_handle(d,
> > > +					   XE_EUDEBUG_RES_TYPE_EXEC_QUEUE,
> > > +					   q);
> > > +	if (h_queue < 0)
> > > +		return h_queue;
> > > +
> > > +	for (i = 0; i < q->width; i++) {
> > > +		int h = xe_eudebug_remove_handle(d,
> > > +						 XE_EUDEBUG_RES_TYPE_LRC,
> > > +						 &q->lrc[i]);
> > > +
> > > +		if (h < 0)
> > > +			return h;
> > 
> > why not continue?
> > 
> > Andi
> Because we can't send valid destroy event, so we disconnect. If we end up in situation where there
> is exec_queue with lrc missing, it means we are already in an invalid state. 

So that if one handle failes to be removed we disconnect the
debugger and clear everything. Right?

Andi


More information about the Intel-xe mailing list