[Intel-xe] [RFC 03/25] drm/xe/eudebug: Introduce exec_queue events

Grzegorzek, Dominik dominik.grzegorzek at intel.com
Wed Nov 22 17:28:14 UTC 2023


Hi Andi!

On Wed, 2023-11-22 at 16:58 +0100, Andi Shyti wrote:
> Hi Dominik and Mika,
> 
> On Mon, Nov 06, 2023 at 01:18:23PM +0200, Mika Kuoppala wrote:
> > From: Dominik Grzegorzek <dominik.grzegorzek at intel.com>
> > 
> > Introduce exec_queue events.
> 
> I love exec_queues, I love them more when they are properly
> introduced.

Fair.
> 
> > Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek at intel.com>
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> 
> which doesn't prevent you from reviewing this patch.
> 
> ...
> 
> > +static int exec_queue_create_event(struct xe_eudebug *d,
> > +				   struct xe_file *xef, struct xe_exec_queue *q)
> > +{
> > +	int h_c, h_vm, h_queue;
> > +	u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE];
> > +	int i;
> > +
> > +	h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef);
> > +	if (h_c < 0)
> > +		return h_c;
> > +
> > +	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm);
> > +	if (h_vm < 0)
> > +		return h_vm;
> > +
> > +	h_queue = xe_eudebug_add_handle(d, XE_EUDEBUG_RES_TYPE_EXEC_QUEUE, q);
> > +	if (h_queue < 0)
> > +		return h_queue;
> > +
> > +	for (i = 0; i < q->width; i++) {
> > +		int h = xe_eudebug_add_handle(d,
> > +					      XE_EUDEBUG_RES_TYPE_LRC,
> > +					      &q->lrc[i]);
> 
> please, important assignments need to be separate from the
> declaration.
> 
> > +		if (h < 0)
> > +			return h;
> > +
> > +		h_lrc[i] = h;
> > +	}
> > +
> > +	/* No need to cleanup for added handles on error as if we fail
> > +	 * we disconnect
> > +	 */
> 
> this comment is misplaced and not properly formatted.
> 
> Anyway, for -EEXIST and -ENOTCONN, you don't really fail, I think
> those need to be checked and treated separately. If -EEXIST, I
> believe the correct behavior should be 'continue', rather than
> 'return'.

Isn't that already handled in xe_eudebug_exec_queue_create? In fact, 
if resource exists we do continue, we just do not send an event twice.

You actually made me think if we shouldn't be more radical and disconnect if sth 
already exists in our resource tracking. 
> 
> > +	return send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_CREATE,
> > +				     h_c, h_vm, h_queue, (u16)q->class,
> > +				     q->width, h_lrc);
> > +}
> > +
> > +static int exec_queue_destroy_event(struct xe_eudebug *d,
> > +				    struct xe_file *xef, struct xe_exec_queue *q)
> > +{
> > +	int h_c, h_vm, h_queue;
> > +	u64 h_lrc[XE_HW_ENGINE_MAX_INSTANCE];
> > +	int i;
> > +
> > +	h_c = find_handle(d->res, XE_EUDEBUG_RES_TYPE_CLIENT, xef);
> > +	if (h_c < 0)
> > +		return h_c;
> > +
> > +	h_vm = find_handle(d->res, XE_EUDEBUG_RES_TYPE_VM, q->vm);
> > +	if (h_vm < 0)
> > +		return h_vm;
> > +
> > +	h_queue = xe_eudebug_remove_handle(d,
> > +					   XE_EUDEBUG_RES_TYPE_EXEC_QUEUE,
> > +					   q);
> > +	if (h_queue < 0)
> > +		return h_queue;
> > +
> > +	for (i = 0; i < q->width; i++) {
> > +		int h = xe_eudebug_remove_handle(d,
> > +						 XE_EUDEBUG_RES_TYPE_LRC,
> > +						 &q->lrc[i]);
> > +
> > +		if (h < 0)
> > +			return h;
> 
> why not continue?
> 
> Andi
Because we can't send valid destroy event, so we disconnect. If we end up in situation where there
is exec_queue with lrc missing, it means we are already in an invalid state. 

Many thanks,
Dominik
> 
> > +		h_lrc[i] = h;
> > +	}
> > +
> > +	return send_exec_queue_event(d, DRM_XE_EUDEBUG_EVENT_DESTROY,
> > +				     h_c, h_vm, h_queue, (u16)q->class,
> > +				     q->width, h_lrc);
> > +}



More information about the Intel-xe mailing list