[PATCH] drm/xe: Decrement client count immediately on file close

Vivekanandan, Balasubramani balasubramani.vivekanandan at intel.com
Wed Sep 18 10:03:07 UTC 2024


On 18.09.2024 15:09, Upadhyay, Tejas wrote:
> 
> 
> > -----Original Message-----
> > From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of
> > Balasubramani Vivekanandan
> > Sent: Wednesday, September 18, 2024 1:42 PM
> > To: intel-xe at lists.freedesktop.org
> > Cc: Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>;
> > Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>; De
> > Marchi, Lucas <lucas.demarchi at intel.com>; Vivekanandan, Balasubramani
> > <balasubramani.vivekanandan at intel.com>
> > Subject: [PATCH] drm/xe: Decrement client count immediately on file close
> > 
> > Decrement the client count immediately on file close. It is not required to be
> > deferred to the resource cleanup function. Otherwise there will be a small
> > time window, where there will be a non-zero client count even after closing all
> > open file handles.
> > This affects ccs_mode(xe_compute) igt tests as these tests try to change the
> > ccs_mode immediately after closing all file handles, but the driver rejects the
> > ccs_mode change request as it sees a non-zero client count.
> > 
> > Fixes: ce8c161cbad4 ("drm/xe: Add ref counting for xe_file")
> > Signed-off-by: Balasubramani Vivekanandan
> > <balasubramani.vivekanandan at intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_device.c | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 4d3c794f134c..3bccea6212ff 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -107,17 +107,12 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
> >  static void xe_file_destroy(struct kref *ref)
> >  {
> >  	struct xe_file *xef = container_of(ref, struct xe_file, refcount);
> > -	struct xe_device *xe = xef->xe;
> > 
> >  	xa_destroy(&xef->exec_queue.xa);
> >  	mutex_destroy(&xef->exec_queue.lock);
> >  	xa_destroy(&xef->vm.xa);
> >  	mutex_destroy(&xef->vm.lock);
> > 
> > -	spin_lock(&xe->clients.lock);
> > -	xe->clients.count--;
> > -	spin_unlock(&xe->clients.lock);
> > -
> >  	xe_drm_client_put(xef->client);
> >  	kfree(xef->process_name);
> >  	kfree(xef);
> > @@ -178,6 +173,10 @@ static void xe_file_close(struct drm_device *dev, struct drm_file *file)
> > 
> >  	xe_file_put(xef);
> > 
> > +	spin_lock(&xe->clients.lock);
> > +	xe->clients.count--;
> > +	spin_unlock(&xe->clients.lock);
> 
> The file_close here is a synchronous, serialized call with respect to userspace. Any settings done through sysfs after file_close should not require this change, as far as I know. Could you please explain the scenario better?

In the current code, the client count is decremented in
xe_file_destroy(), which is not invoked synchronously from
xe_file_close(). It is called only when the last reference to the
xe_file is dropped.
References to the xe_file are taken during creation of VMs and
exec_queues, so something might still be holding a reference to the
xe_file when xe_file_close() is called. In that case the invocation of
xe_file_destroy() is deferred.
As a result, the driver might see a non-zero client count even after all
file handles are actually closed, which is incorrect. We can defer only
the freeing of resources to xe_file_destroy(); the client count can be
adjusted immediately in xe_file_close().

Regards,
Bala
> 
> Tejas
> > +
> >  	xe_pm_runtime_put(xe);
> >  }
> > 
> > --
> > 2.34.1
> 

