[PATCH v2 02/37] drm/nouveau: handle pci/tegra drm_dev_{alloc, register} from common code

Ben Skeggs bskeggs at nvidia.com
Fri Jul 26 13:07:19 UTC 2024


On 27/7/24 01:41, Danilo Krummrich wrote:

> On Fri, Jul 26, 2024 at 02:27:53PM +1000, Ben Skeggs wrote:
>>>>>> +
>>>>>> +static struct nouveau_drm *
>>>>>> +nouveau_drm_device_new(const struct drm_driver *drm_driver, struct device *parent,
>>>>>> +		       struct nvkm_device *device)
>>>>>> +{
>>>>>> +	struct nouveau_drm *drm;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	drm = kzalloc(sizeof(*drm), GFP_KERNEL);
>>>>>> +	if (!drm)
>>>>>> +		return ERR_PTR(-ENOMEM);
>>>>>> +
>>>>>> +	drm->dev = drm_dev_alloc(drm_driver, parent);
>>>>> Since you're reworking this anyways, can we switch to devm_drm_dev_alloc()?
>>>>>
>>>>> This also gets us rid of nouveau_drm_device_del().
>>>> No, we can't.  I originally had this change as a cleanup patch in the series
>>>> I posted implementing aux bus support.  However it turns out that in order
>>>> to avoid breaking udev etc, we can't use the aux device as parent of the drm
>>> Can you please expand a bit on what was breaking?
>> Sorry, I meant to say PRIME, not udev.  The device selection logic ties the
>> DRM device back to its sysfs node, and doesn't understand the auxiliary
>> bus.  So, if nouveau were to use its auxiliary device as parent of the DRM
>> device, PRIME breaks.
> The Vulkan device selector stuff looks like it should mostly work.
>
> However, I guess you refer to the loader stuff in Mesa that uses
> drmGetDevices2() from libdrm? This stuff indeed whitelists busses it accepts to
> report DRM device from:
>
> 	{ "/pci", DRM_BUS_PCI },
> 	{ "/usb", DRM_BUS_USB },
> 	{ "/platform", DRM_BUS_PLATFORM },
> 	{ "/spi", DRM_BUS_PLATFORM },
> 	{ "/host1x", DRM_BUS_HOST1X },
> 	{ "/virtio", DRM_BUS_VIRTIO },
>
> Not a big deal to just add it for a new driver, but obviously we can't just do
> this for an existing one.
>
>> Fortunately it didn't turn out to be necessary, and we
>> can happily probe() against the auxiliary device and still use the PCI
>> device as the DRM device's parent.
> At a first glance, I guess this should work. But, before we introduce
> workarounds like this one and add even more complexity, I wonder what's the
> benefit of doing this for Nouveau in the first place? I think we agreed to this
> split for Nova, for the reasons discussed in [1].

Because, as I already mentioned in the cover letter for series I posted 
implementing the auxiliary bus support, this brings immediate benefits 
to users, such as eliminating the long pauses on systems using prime 
whilst the entire GPU is woken up for some PCI query by userspace.

It also (finally) integrates Tegra in a reasonably clean fashion, and 
would allow the DRM-level suspend/resume code to be shared there too if 
someone were to implement the platform-level code for it.  That was not 
possible before.

>
> [1] https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/
>
>>>> device and instead have to continue passing the pci/platform device as we do
>>>> now.
>>>>
>>>> Using devm_drm_dev_alloc() with the pci device as parent would tie the
>>>> lifetime of the drm device to the pci device, which is owned by nvkm (after
>>> How does this tie the lifetime of the drm device to the pci device? It's the
>>> other way around, the drm device takes a reference of its parent (i.e. the pci
>>> device).
>>>
>>> I don't think there's anything wrong with that.
>> My understanding is that devres will cleanup allocations when the driver
>> detaches from the device.
> Right, I think I took that too literally.
>
> The lifetime of the DRM device (or more precisely one of its references) is
> bound to the binding between the parent device and its corresponding driver.
>
> But the lifetime of the parent device itself is bound to the DRM device.
>
> So, yes this doesn't work, and proves the point that initializing the DRM device
> with the parent's parent is just a workaround.

You're greatly overstating the "complexity" that's added here. It's a 
minor inconvenience that doesn't require much code at all to implement, 
and is essentially irrelevant outside of module load/unload.

I agree it's not ideal, and userspace should gain auxiliary bus support 
before a new driver implements a similar architecture, but it's really 
not that big a deal.

>
>> With the auxdev changes, it's *NVKM* that's
>> attached to the PCI device, not the DRM driver (which is attached to an
>> auxiliary device instead).
>>
>> This means that the devm_drm_dev_init_release() won't be called when the DRM
>> driver detaches from its auxiliary device as it should, but when NVKM
>> detaches from the PCI device, which isn't the most obvious and could lead to
>> confusion.
>>
>> It also entirely blows up in the split module case as nouveau.ko is unloaded
>> already by the time NVKM detaches and drm_dev_put() gets called.
>>
>>>> the auxdev series).  We could look at changing devm_drm_dev_alloc() of
>>>> course, but I think that's best left until later.
>>> I don't think that this is necessary.


More information about the Nouveau mailing list