[PATCH] drm/xe: Decrement client count immediately on file close
Vivekanandan, Balasubramani
balasubramani.vivekanandan at intel.com
Wed Sep 18 10:03:07 UTC 2024
On 18.09.2024 15:09, Upadhyay, Tejas wrote:
>
>
> > -----Original Message-----
> > From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of
> > Balasubramani Vivekanandan
> > Sent: Wednesday, September 18, 2024 1:42 PM
> > To: intel-xe at lists.freedesktop.org
> > Cc: Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>;
> > Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>; De
> > Marchi, Lucas <lucas.demarchi at intel.com>; Vivekanandan, Balasubramani
> > <balasubramani.vivekanandan at intel.com>
> > Subject: [PATCH] drm/xe: Decrement client count immediately on file close
> >
> > Decrement the client count immediately on file close. It is not required to be
> > deferred to the resource cleanup function. Otherwise there will be a small
> > time window, where there will be a non-zero client count even after closing all
> > open file handles.
> > This affects ccs_mode(xe_compute) igt tests as these tests try to change the
> > ccs_mode immediately after closing all file handles, but the driver rejects the
> > ccs_mode change request as it sees a non-zero client count.
> >
> > Fixes: ce8c161cbad4 ("drm/xe: Add ref counting for xe_file")
> > Signed-off-by: Balasubramani Vivekanandan
> > <balasubramani.vivekanandan at intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_device.c | 9 ++++-----
> > 1 file changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 4d3c794f134c..3bccea6212ff 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -107,17 +107,12 @@ static int xe_file_open(struct drm_device *dev,
> > struct drm_file *file) static void xe_file_destroy(struct kref *ref) {
> > struct xe_file *xef = container_of(ref, struct xe_file, refcount);
> > - struct xe_device *xe = xef->xe;
> >
> > xa_destroy(&xef->exec_queue.xa);
> > mutex_destroy(&xef->exec_queue.lock);
> > xa_destroy(&xef->vm.xa);
> > mutex_destroy(&xef->vm.lock);
> >
> > - spin_lock(&xe->clients.lock);
> > - xe->clients.count--;
> > - spin_unlock(&xe->clients.lock);
> > -
> > xe_drm_client_put(xef->client);
> > kfree(xef->process_name);
> > kfree(xef);
> > @@ -178,6 +173,10 @@ static void xe_file_close(struct drm_device *dev,
> > struct drm_file *file)
> >
> > xe_file_put(xef);
> >
> > + spin_lock(&xe->clients.lock);
> > + xe->clients.count--;
> > + spin_unlock(&xe->clients.lock);
>
> The file_close here is a synchronous and serialized call with respect to userspace. Any settings done through sysfs after file_close should not require this change, as far as I know. Could you please explain the scenario in more detail?
In the current code, the client count is decremented in xe_file_destroy,
which is not invoked synchronously from xe_file_close; it is called only
when the last reference to the xe_file is dropped.
References to xe_file are taken when VMs and exec_queues are created, so
somebody might still be holding a reference to the xe_file when
xe_file_close is called, in which case the invocation of xe_file_destroy
is deferred.
As a result, the driver might see a non-zero client count even after all
file handles are actually closed, which is incorrect. Only the freeing of
resources needs to be deferred to xe_file_destroy; the client count can
be decremented immediately in xe_file_close.
Regards,
Bala
>
> Tejas
> > +
> > xe_pm_runtime_put(xe);
> > }
> >
> > --
> > 2.34.1
>