[PATCH v2 2/2] drm/xe: Use the filelist from drm for ccs_mode change

Matthew Brost matthew.brost at intel.com
Sat Oct 12 04:43:16 UTC 2024


On Thu, Oct 10, 2024 at 11:02:56AM -0500, Lucas De Marchi wrote:
> On Thu, Oct 10, 2024 at 04:33:08PM +0530, Vivekanandan, Balasubramani wrote:
> > On 08.10.2024 18:00, Matthew Brost wrote:
> > > On Tue, Oct 08, 2024 at 05:24:32PM +0000, Matthew Brost wrote:
> > > > On Tue, Oct 08, 2024 at 11:55:03AM -0500, Lucas De Marchi wrote:
> > > > > On Tue, Oct 08, 2024 at 03:42:09PM +0000, Matthew Brost wrote:
> > > > > > On Tue, Oct 08, 2024 at 10:20:15AM -0500, Lucas De Marchi wrote:
> > > > > > > On Tue, Oct 08, 2024 at 02:54:19PM +0000, Matthew Brost wrote:
> > > > > > > > On Tue, Oct 08, 2024 at 01:06:28PM +0530, Balasubramani Vivekanandan wrote:
> > > > > > > > > Drop the exclusive client count tracking and use the filelist from the
> > > > > > > > > drm to track the active clients. This also ensures the clients created
> > > > > > > > > internally by the driver won't block changing the ccs mode.
> > > > > > > > >
> > > > > > > > > Fixes: ce8c161cbad4 ("drm/xe: Add ref counting for xe_file")
> > > > > > > >
> > > > > > > > Is this really fixing anything. As far as I can tell nothing upstream
> > > > > > > > opens a file internally (i.e. xe_file_open) is never called directly.
> > > > > > >
> > > > > > > should fix this case:
> > > > > > >
> > > > > > > 	open()
> > > > > > > 	close()
> > > > > > > 			<---- race here
> > > > > > > 	change_ccs_mode()
> > > > > > >
> > > > > > > because the close is not completely sync - the cleanup where the
> > > > > > > previous number of clients is decremented is executed too late and
> > > > > > > subject to a race fixed here.
> > > > > > >
> > > > > >
> > > > > > Ah, ok. But then IMO just move the clients.count decrement to
> > > > > > xe_file_close. I try to preach solid locking / layering and this seems
> > > > > > to go against this idea.
> > > > >
> > > > > why do you want to track the exact same thing in 2 ways, one in drm and
> > > > > the other in xe? "Are there clients connected?" is something drm_file
> > > > > answer and doesn't need to be duplicated in the individual drivers.
> > > > >
> > > >
> > > > Well layering, making assumptions about the file list means, and not
> > > > randomly deciding what we can and can't do with locks we don't own.
> > > >
> > > 
> > > I looked at this a bit more, if we want to do it this way then I think
> > > at a minimum we need a drm helper for this.
> > > 
> > > e.g.,
> > > 
> > > bool drm_device_user_clients_open(struct drm_device *dev)
> > > {
> > > 	lockdep_assert(&dev->filelist_mutex);
> > > 
> > > 	return !!list_empty(&dev->filelist);
> > > }
> > 
> > Still the driver would be required to use the filelist_mutex outside the
> > drm layer. So will not solve the layering problem completely.
> > Otherwise we can define two helper functions in drm, the first one would
> > check there are no open files and returns true holding the mutex.
> > The second function can be used to release the mutex.
> > Does it sound too invasive?

I'd say helper to acquire / release the file lock indicating it is ok
for other layers to use the lock. Then add helpers with annotaions (like
my example) for internal access to that layer. Maybe a bit much, I find
myself writing code in this style these days.

> 
> yes, and I'd rather prefer not adding these helpers.  If someone wants
> to add helpers later and convert the current use cases, that's a
> cleanup/refactor that can be done on top. For me, these helpers are not
> really helping.
> 
> Rodrigo / Thomas, any thought?

Another opinion would be good. I know this seems a bit pedantic but
locking is about the most important thing a driver and certainly the
most difficult to unwind is we do it incorrectly.

Matt

> 
> Lucas De Marchi
> 
> > 
> > Regards,
> > Bala
> > 
> > > 
> > > Matt
> > > 
> > > > > At least it's also used for similar reasons in amdgpu and vmwgfx:
> > > > >
> > > > > $ git grep -l -e "->filelist" -- drivers/gpu/drm/
> > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > > > > drivers/gpu/drm/drm_client.c
> > > > > drivers/gpu/drm/drm_debugfs.c
> > > > > drivers/gpu/drm/drm_drv.c
> > > > > drivers/gpu/drm/drm_file.c
> > > > > drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
> > > > >
> > > >
> > > > The AMD / VMWGFX usage certainly looks wrong to me. If the file list was
> > > > meant to be traversed by drivers there should be an exported iterator
> > > > IMO.
> > > >
> > > > Matt
> > > >
> > > > >
> > > > > Lucas De Marchi


More information about the Intel-xe mailing list