[PATCH v2 2/2] drm/xe: Use the filelist from drm for ccs_mode change

Lucas De Marchi <lucas.demarchi@intel.com>
Thu Oct 10 16:02:56 UTC 2024


On Thu, Oct 10, 2024 at 04:33:08PM +0530, Vivekanandan, Balasubramani wrote:
>On 08.10.2024 18:00, Matthew Brost wrote:
>> On Tue, Oct 08, 2024 at 05:24:32PM +0000, Matthew Brost wrote:
>> > On Tue, Oct 08, 2024 at 11:55:03AM -0500, Lucas De Marchi wrote:
>> > > On Tue, Oct 08, 2024 at 03:42:09PM +0000, Matthew Brost wrote:
>> > > > On Tue, Oct 08, 2024 at 10:20:15AM -0500, Lucas De Marchi wrote:
>> > > > > On Tue, Oct 08, 2024 at 02:54:19PM +0000, Matthew Brost wrote:
>> > > > > > On Tue, Oct 08, 2024 at 01:06:28PM +0530, Balasubramani Vivekanandan wrote:
>> > > > > > > Drop the exclusive client count tracking and use the filelist from the
>> > > > > > > drm to track the active clients. This also ensures the clients created
>> > > > > > > internally by the driver won't block changing the ccs mode.
>> > > > > > >
>> > > > > > > Fixes: ce8c161cbad4 ("drm/xe: Add ref counting for xe_file")
>> > > > > >
>> > > > > > Is this really fixing anything? As far as I can tell, nothing upstream
>> > > > > > opens a file internally (i.e. xe_file_open is never called directly).
>> > > > >
>> > > > > should fix this case:
>> > > > >
>> > > > > 	open()
>> > > > > 	close()
>> > > > > 			<---- race here
>> > > > > 	change_ccs_mode()
>> > > > >
>> > > > > because the close is not completely synchronous - the cleanup where
>> > > > > the number of open clients is decremented runs too late and is
>> > > > > subject to the race fixed here.
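>> > > > >
>> > > > > Roughly, the idea is to ask drm instead of keeping our own counter.
>> > > > > A sketch only - the exact code in the patch may differ:
>> > > > >
>> > > > > 	mutex_lock(&xe->drm.filelist_mutex);
>> > > > > 	if (!list_empty(&xe->drm.filelist)) {
>> > > > > 		/* still open clients, refuse the ccs_mode change */
>> > > > > 		mutex_unlock(&xe->drm.filelist_mutex);
>> > > > > 		return -EBUSY;
>> > > > > 	}
>> > > > > 	/* ... apply the new ccs_mode ... */
>> > > > > 	mutex_unlock(&xe->drm.filelist_mutex);
>> > > > >
>> > > > > AFAICS drm_release() drops the file from dev->filelist under
>> > > > > filelist_mutex before the driver's postclose runs, so there is no
>> > > > > deferred cleanup left to race with.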
>> > > > >
>> > > >
>> > > > Ah, ok. But then IMO just move the clients.count decrement to
>> > > > xe_file_close. I try to preach solid locking / layering and this
>> > > > seems to go against that idea.
>> > >
>> > > why do you want to track the exact same thing in 2 ways, one in drm and
>> > > the other in xe? "Are there clients connected?" is something drm_file
>> > > answers and doesn't need to be duplicated in the individual drivers.
>> > >
>> >
>> > Well, layering, not making assumptions about what the file list means,
>> > and not randomly deciding what we can and can't do with locks we don't
>> > own.
>> >
>>
>> I looked at this a bit more, if we want to do it this way then I think
>> at a minimum we need a drm helper for this.
>>
>> e.g.,
>>
>> bool drm_device_user_clients_open(struct drm_device *dev)
>> {
>> 	lockdep_assert_held(&dev->filelist_mutex);
>>
>> 	return !list_empty(&dev->filelist);
>> }
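>>
>> and in xe something like this (sketch only, the call site is made up):
>>
>> 	mutex_lock(&xe->drm.filelist_mutex);
>> 	if (drm_device_user_clients_open(&xe->drm)) {
>> 		mutex_unlock(&xe->drm.filelist_mutex);
>> 		return -EBUSY;
>> 	}
>> 	/* ... change ccs_mode ... */
>> 	mutex_unlock(&xe->drm.filelist_mutex);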
>
>Still the driver would be required to take the filelist_mutex outside
>the drm layer, so it would not completely solve the layering problem.
>Otherwise we could define two helper functions in drm: the first would
>check that there are no open files and return true while holding the
>mutex; the second would release the mutex.
>Does that sound too invasive?
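>
>Roughly something like this (sketch only, helper names invented):
>
>	/* returns true with dev->filelist_mutex held if no files are open */
>	bool drm_device_begin_no_clients(struct drm_device *dev);
>
>	/* releases dev->filelist_mutex taken by the helper above */
>	void drm_device_end_no_clients(struct drm_device *dev);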

yes, and I'd prefer not to add these helpers. If someone wants to add
helpers later and convert the current use cases, that's a
cleanup/refactor that can be done on top. For me, these helpers are not
really helping.

Rodrigo / Thomas, any thoughts?

Lucas De Marchi

>
>Regards,
>Bala
>
>>
>> Matt
>>
>> > > At least it's also used for similar reasons in amdgpu and vmwgfx:
>> > >
>> > > $ git grep -l -e "->filelist" -- drivers/gpu/drm/
>> > > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> > > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> > > drivers/gpu/drm/drm_client.c
>> > > drivers/gpu/drm/drm_debugfs.c
>> > > drivers/gpu/drm/drm_drv.c
>> > > drivers/gpu/drm/drm_file.c
>> > > drivers/gpu/drm/vmwgfx/vmwgfx_gem.c
>> > >
>> >
>> > The AMD / VMWGFX usage certainly looks wrong to me. If the file list were
>> > meant to be traversed by drivers, there should be an exported iterator
>> > IMO.
>> >
>> > Matt
>> >
>> > >
>> > > Lucas De Marchi

