[PATCHv2] drm/xe/display: check for error on drmm_mutex_init

Lucas De Marchi lucas.demarchi at intel.com
Wed Apr 3 15:32:50 UTC 2024


On Thu, Mar 28, 2024 at 12:33:09PM +0200, Jani Nikula wrote:
>On Thu, 28 Mar 2024, Andi Shyti <andi.shyti at linux.intel.com> wrote:
>> Hi Arun,
>>
>> ...
>>
>>> -	drmm_mutex_init(&xe->drm, &xe->sb_lock);
>>> -	drmm_mutex_init(&xe->drm, &xe->display.backlight.lock);
>>> -	drmm_mutex_init(&xe->drm, &xe->display.audio.mutex);
>>> -	drmm_mutex_init(&xe->drm, &xe->display.wm.wm_mutex);
>>> -	drmm_mutex_init(&xe->drm, &xe->display.pps.mutex);
>>> -	drmm_mutex_init(&xe->drm, &xe->display.hdcp.hdcp_mutex);
>>> +	if (drmm_mutex_init(&xe->drm, &xe->sb_lock) ||
>>> +	    drmm_mutex_init(&xe->drm, &xe->display.backlight.lock) ||
>>> +	    drmm_mutex_init(&xe->drm, &xe->display.audio.mutex) ||
>>> +	    drmm_mutex_init(&xe->drm, &xe->display.wm.wm_mutex) ||
>>> +	    drmm_mutex_init(&xe->drm, &xe->display.pps.mutex) ||
>>> +	    drmm_mutex_init(&xe->drm, &xe->display.hdcp.hdcp_mutex))
>>> +		return -ENOMEM;

My suggestion from v1 was to assign and check the return value, not to
hardcode the return value as done here. Now we have a v3 going back to
v1, and we never got what was suggested. Why? Let me be explicit and
type it out:

	if ((err = drmm_mutex_init(&xe->drm, &xe->sb_lock)) ||
	    (err = drmm_mutex_init(&xe->drm, &xe->display.backlight.lock)) ||
	    (err = drmm_mutex_init(&xe->drm, &xe->display.audio.mutex)) ||
	    (err = drmm_mutex_init(&xe->drm, &xe->display.wm.wm_mutex)) ||
	    (err = drmm_mutex_init(&xe->drm, &xe->display.pps.mutex)) ||
	    (err = drmm_mutex_init(&xe->drm, &xe->display.hdcp.hdcp_mutex)))
		return err;

I also said I usually don't like assign + check in the same statement,
but all the alternatives I've seen here are worse.
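
For readers who want to see why the assign + check chain returns the
first error, here is a minimal userspace sketch. It is not xe code:
fake_init() is a hypothetical stand-in for drmm_mutex_init() that just
returns the value it is given, and init_all() is a made-up wrapper. The
point is only that || short-circuits, so err holds the first non-zero
return and the remaining initializers are never called.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for drmm_mutex_init(): counts how often it is
 * called and returns the caller-supplied result (0 or a negative errno).
 */
int fake_init(int ret, int *calls)
{
	assert(calls != NULL);
	(*calls)++;
	return ret;
}

/*
 * Assign + check chain. Evaluation stops at the first non-zero result,
 * which is left in err and propagated unchanged.
 */
int init_all(int r0, int r1, int r2, int *calls)
{
	int err;

	if ((err = fake_init(r0, calls)) ||
	    (err = fake_init(r1, calls)) ||
	    (err = fake_init(r2, calls)))
		return err;

	return 0;
}
```

Calling init_all(0, -EINVAL, -ENOMEM, &calls) returns -EINVAL after two
calls, where the hardcoded variant would have reported -ENOMEM.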

However it turns out all of these display mutex initializations are
actually wrong after commit 3fef3e6ff86a ("drm/i915: move display mutex
inits to display code"), which predates xe in the tree.

	drivers/gpu/drm/i915/i915_driver.c:     intel_display_driver_early_probe(dev_priv);
	drivers/gpu/drm/xe/display/xe_display.c:        intel_display_driver_early_probe(xe);

So intel_display_driver_early_probe() is actually called from xe, which
does the mutex_init() (and misses the mutex_destroy()). Am I missing
anything?

>> why not extract the value from drmm_mutex_init()? it would make
>> the code a bit more complex, but better than forcing a -ENOMEM
>> return.
>>
>> 	err = drmm_mutex_init(...)
>> 	if (err)
>> 		return err;
>>
>> 	err = drmm_mutex_init(...)
>> 	if (err)
>> 		return err;
>>
>> 	err = drmm_mutex_init(...)
>> 	if (err)
>> 		return err;
>> 	
>> 	...
>>
>> On the other hand drmm_mutex_init(), as of now returns only
>> -ENOMEM, but it's a bad practice to assume it will always do. I'd
>> rather prefer not to check the error value at all.
>
>And round and round we go. This is exactly what v1 was [1], but it's not
>clear because the patch doesn't have a changelog.
>
>This is all utterly ridiculous compared to *why* we even have or use
>drmm_mutex_init(). Managed initialization causes more trouble here than
>it gains us. Gah.

I think managed initialization makes sense to keep the teardown/unwind
part sane (which is often not tested). However, drmm_mutex_init() may
indeed be overkill. We started using it because people often forget the
mutex_destroy(), and drm/ as a whole started using it. Compare:

	git grep mutex_init -- drivers/gpu/drm/i915/
	git grep mutex_destroy -- drivers/gpu/drm/i915/

This is only an issue when mutex_init does more than init, which is the
case with CONFIG_PREEMPT_RT + CONFIG_DEBUG_MUTEXES. Most people don't
have those set, so they don't see it and CI doesn't see it, but it
causes problems for people who do. Maybe what we could have is
a drmm_mutex_vinit(mutex, ...) so we can do:

	err = drmm_mutex_vinit(&xe->drm,
			       &xe->sb_lock,
			       &xe->display.backlight.lock,
			       ...,
			       NULL);
	if (err)
		return err;

or... just stop using drmm_mutex_init and add the destroy.  No need for
unwind as mutex_init() can't fail. We still need to keep the destroy
explicit, but I think that would be fine (and doesn't cause 1 allocation
per mutex).
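
A rough userspace sketch of how a NULL-terminated variadic initializer
like the proposed drmm_mutex_vinit() could behave. No such helper exists
in the tree as far as this thread shows; struct fake_mutex,
fake_managed_init() and fake_mutex_vinit() are all hypothetical, and the
"budget" counter only simulates the per-mutex allocation failing with
-ENOMEM.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mutex. */
struct fake_mutex {
	int initialized;
};

/*
 * Hypothetical stand-in for drmm_mutex_init(): each successful call
 * consumes one unit of budget, mimicking one allocation per mutex.
 */
int fake_managed_init(int *budget, struct fake_mutex *m)
{
	if (*budget <= 0)
		return -ENOMEM;

	(*budget)--;
	m->initialized = 1;
	return 0;
}

/*
 * Walk the NULL-terminated list of mutexes, stop at the first failure
 * and return that error. Mutexes after the failing one are untouched.
 */
int fake_mutex_vinit(int *budget, ...)
{
	struct fake_mutex *m;
	va_list ap;
	int err = 0;

	assert(budget != NULL);

	va_start(ap, budget);
	while ((m = va_arg(ap, struct fake_mutex *)) != NULL) {
		err = fake_managed_init(budget, m);
		if (err)
			break;
	}
	va_end(ap);

	return err;
}
```

In portable userspace C the terminator should be passed as
(struct fake_mutex *)NULL, since a bare NULL is not guaranteed to have
pointer type when passed through "...". The sketch also leaves already
initialized mutexes alone on failure, which matches the managed case
where cleanup is deferred to device release.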

Lucas De Marchi

>
>BR,
>Jani.
>
>
>[1] https://lore.kernel.org/r/ki4ynsl4nmhavf63vzdlt2xkedjo7p7iouzvcksvki3okgz6ak@twlznnlo3g22
>
>
>>
>> Andi
>>
>>>  	xe->enabled_irq_mask = ~0;
>>>
>>>  	err = drmm_add_action_or_reset(&xe->drm, display_destroy, NULL);
>>> --
>>> 2.25.1
>
>-- 
>Jani Nikula, Intel

