[PATCH v2 2/2] drm/xe: Use the filelist from drm for ccs_mode change
Lucas De Marchi
lucas.demarchi at intel.com
Fri Nov 1 13:11:05 UTC 2024
On Sat, Oct 12, 2024 at 06:45:57PM -0500, Lucas De Marchi wrote:
>It looks like I forgot to Cc, so let's try again.
>
>Lucas De Marchi
>
>On Sat, Oct 12, 2024 at 04:43:16AM +0000, Matthew Brost wrote:
>>On Thu, Oct 10, 2024 at 11:02:56AM -0500, Lucas De Marchi wrote:
>>>On Thu, Oct 10, 2024 at 04:33:08PM +0530, Vivekanandan, Balasubramani wrote:
>>>> On 08.10.2024 18:00, Matthew Brost wrote:
>>>> > On Tue, Oct 08, 2024 at 05:24:32PM +0000, Matthew Brost wrote:
>>>> > > On Tue, Oct 08, 2024 at 11:55:03AM -0500, Lucas De Marchi wrote:
>>>> > > > On Tue, Oct 08, 2024 at 03:42:09PM +0000, Matthew Brost wrote:
>>>> > > > > On Tue, Oct 08, 2024 at 10:20:15AM -0500, Lucas De Marchi wrote:
>>>> > > > > > On Tue, Oct 08, 2024 at 02:54:19PM +0000, Matthew Brost wrote:
>>>> > > > > > > On Tue, Oct 08, 2024 at 01:06:28PM +0530, Balasubramani Vivekanandan wrote:
>>>> > > > > > > > Drop the exclusive client count tracking and use the filelist from the
>>>> > > > > > > > drm to track the active clients. This also ensures the clients created
>>>> > > > > > > > internally by the driver won't block changing the ccs mode.
>>>> > > > > > > >
>>>> > > > > > > > Fixes: ce8c161cbad4 ("drm/xe: Add ref counting for xe_file")
>>>> > > > > > >
>>>> > > > > > > Is this really fixing anything. As far as I can tell nothing upstream
>>>> > > > > > > opens a file internally (i.e. xe_file_open) is never called directly.
>>>> > > > > >
>>>> > > > > > should fix this case:
>>>> > > > > >
>>>> > > > > > open()
>>>> > > > > > close()
>>>> > > > > > <---- race here
>>>> > > > > > change_ccs_mode()
>>>> > > > > >
>>>> > > > > > because the close is not completely sync - the cleanup where the
>>>> > > > > > previous number of clients is decremented is executed too late and
>>>> > > > > > subject to a race fixed here.
>>>> > > > > >
>>>> > > > >
>>>> > > > > Ah, ok. But then IMO just move the clients.count decrement to
>>>> > > > > xe_file_close. I try to preach solid locking / layering and this seems
>>>> > > > > to go against this idea.
>>>> > > >
>>>> > > > why do you want to track the exact same thing in 2 ways, one in drm and
>>>> > > > the other in xe? "Are there clients connected?" is something drm_file
>>>> > > > answer and doesn't need to be duplicated in the individual drivers.
>>>> > > >
>>>> > >
>>>> > > Well layering, making assumptions about the file list means, and not
>>>> > > randomly deciding what we can and can't do with locks we don't own.
>>>> > >
>>>> >
>>>> > I looked at this a bit more, if we want to do it this way then I think
>>>> > at a minimum we need a drm helper for this.
>>>> >
>>>> > e.g.,
>>>> >
>>>> > bool drm_device_user_clients_open(struct drm_device *dev)
>>>> > {
>>>> > lockdep_assert(&dev->filelist_mutex);
>>>> >
>>>> > return !!list_empty(&dev->filelist);
>>>> > }
>>>>
>>>> Still the driver would be required to use the filelist_mutex outside the
>>>> drm layer. So will not solve the layering problem completely.
>>>> Otherwise we can define two helper functions in drm, the first one would
>>>> check there are no open files and returns true holding the mutex.
>>>> The second function can be used to release the mutex.
>>>> Does it sound too invasive?
>>
>>I'd say helper to acquire / release the file lock indicating it is ok
>>for other layers to use the lock. Then add helpers with annotaions (like
>>my example) for internal access to that layer. Maybe a bit much, I find
>>myself writing code in this style these days.
>>
>>>
>>>yes, and I'd rather prefer not adding these helpers. If someone wants
>>>to add helpers later and convert the current use cases, that's a
>>>cleanup/refactor that can be done on top. For me, these helpers are not
>>>really helping.
>>>
>>>Rodrigo / Thomas, any thought?
>>
>>Another opinion would be good. I know this seems a bit pedantic but
>>locking is about the most important thing a driver and certainly the
>>most difficult to unwind is we do it incorrectly.
I chatted with Thomas today regarding this. He is also of the opinion
that it'd be better to have wrappers in the drm layer. However since
there are already users, it's ok to add this as is and prep a conversion
on top.
From a quick look, we'd need to add:
drm_device_user_clients_lock()
drm_device_user_clients_lock_interruptible()
drm_device_user_clients_unlock()
drm_device_user_clients_empty_locked()
drm_device_user_clients_empty()
and discuss it in dri-devel to convert the current users. Maybe we will
also need some renames or doc updates since drm_device has:
filelist, protected by filelist_mutex
filelist_internal, protected by filelist_mutex
clientlist, clientlist_mutex
which would point to actually add the helpers as
drm_device_filelist_* rather than user_clients
For now, applied both patches to drm-xe-next.
Lucas De Marchi
>>
>>Matt
>>
>>>
>>>Lucas De Marchi
>>>
>>>>
>>>> Regards,
>>>> Bala
>>>>
>>>> >
>>>> > Matt
>>>> >
>>>> > > > At least it's also used for similar reasons in amdgpu and vmwgfx:
>>>> > > >
>>>> > > > $ git grep -l -e "->filelist" -- drivers/gpu/drm/
>>>> > > > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>>>> > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> > > > drivers/gpu/drm/drm_client.c
>>>> > > > drivers/gpu/drm/drm_debugfs.c
>>>> > > > drivers/gpu/drm/drm_drv.c
>>>> > > > drivers/gpu/drm/drm_file.c
>>>> > > > drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
>>>> > > >
>>>> > >
>>>> > > The AMD / VMWGFX usage certainly looks wrong to me. If the file list was
>>>> > > meant to be traversed by drivers there should be an exported iterator
>>>> > > IMO.
>>>> > >
>>>> > > Matt
>>>> > >
>>>> > > >
>>>> > > > Lucas De Marchi
More information about the Intel-xe
mailing list