[PATCH] drm/edid/firmware: stop using throwaway platform device

Matthieu CHARETTE matthieu.charette at gmail.com
Sun Nov 13 19:26:59 UTC 2022


Hi,

I've tested the patch and I can confirm that it fixed the issue.
Tested on Fedora 36 with kernel 6.0.8.

Thanks,
Matthieu

On Tue, Nov 8 2022 at 04:40:52 PM +0100, Matthieu CHARETTE 
<matthieu.charette at gmail.com> wrote:
> I haven't tested the patch yet, but I will. Even without testing, I
> can tell you that it will work (it will not crash).
> Currently, when the crash occurs, all screens remain black after
> resume. I can't log in over ssh either, and the logs end before
> the suspend. So the crash seems to be some kind of kernel panic.
> 
> Matthieu
> 
> On Tue, Nov 8 2022 at 01:27:33 PM +0200, Jani Nikula 
> <jani.nikula at intel.com> wrote:
>> On Sun, 06 Nov 2022, Matthieu CHARETTE <matthieu.charette at gmail.com> 
>> wrote:
>>>  Hi,
>>> 
>>>  Can you tell me what we are waiting for? Maybe I can help.
>> 
>> Have you tried the patch? Is it an improvement over the status quo?
>> 
>> The "crash" is still ambiguous to me. Do you observe it with the
>> patch? Do you have logs? Etc.
>> 
>> BR,
>> Jani.
>> 
>> 
>>> 
>>>  Thanks.
>>> 
>>>  Matthieu
>>> 
>>>  On Wed, Oct 12 2022 at 07:16:29 PM +0200, Matthieu CHARETTE
>>>  <matthieu.charette at gmail.com> wrote:
>>>>  By crash, I mean that an error is returned here:
>>>>  https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git/+/refs/heads/master/drivers/gpu/drm/drm_edid_load.c#195
>>>>  I don't really know what happens next, but on my machine both the
>>>>  built-in screen and the external one remain dark. The kernel also
>>>>  seems to freeze. I suspect a kernel panic, but I'm not sure. Anyway,
>>>>  the error is definitely not handled well, and a fix would be great.
>>>>  Also, request_firmware() will crash if called for the first time on
>>>>  the resume path, because the file system isn't reachable during
>>>>  resume and no cache is available for this firmware. So I guess that
>>>>  in this case request_firmware() returns an error.
>>>>  The suspend-plug-resume case is not my priority either, as long as
>>>>  it doesn't make the system crash (which is currently the case).
>>>> 
>>>>  On Wed, Oct 12 2022 at 11:25:59 AM +0300, Jani Nikula
>>>>  <jani.nikula at intel.com> wrote:
>>>>>  On Tue, 11 Oct 2022, Matthieu CHARETTE
>>>>>  <matthieu.charette at gmail.com> wrote:
>>>>>>   Currently the EDID is requested during resume. But since it is
>>>>>>   requested too early, that is, before the filesystem is mounted,
>>>>>>   the firmware request fails. This makes the DRM driver crash when
>>>>>>   resuming.
>>>>>>   This kind of issue should be prevented by the firmware caching
>>>>>>   process, which caches every requested firmware for the next
>>>>>>   resume. But since we are using a temporary device, the firmware
>>>>>>   isn't cached on suspend, because the device no longer exists.
>>>>>>   When using a non-temporary device to get the EDID, the firmware
>>>>>>   will be cached on suspend for the next resume, so requesting the
>>>>>>   firmware during resume will succeed.
>>>>>>   But if the firmware has never been requested since boot, meaning
>>>>>>   that the monitor hasn't been plugged in since boot, the kernel
>>>>>>   will not have cached the EDID. So if we plug in the monitor while
>>>>>>   the machine is suspended, the resume will fail to load the
>>>>>>   firmware, and the DRM driver will crash.
>>>>>>   So basically, your fix should solve the issue except for the case
>>>>>>   where the monitor hasn't been plugged in since boot and is
>>>>>>   plugged in while the machine is suspended.
>>>>>>   I hope I was clear. Tell me if I wasn't. I'm not really good at
>>>>>>   explaining.
>>>>> 
>>>>>  That was a pretty good explanation. The only thing I'm missing is
>>>>>  what the failure mode is exactly when you claim the driver will
>>>>>  crash. Why would request_firmware() "crash" if called for the first
>>>>>  time on the resume path?
>>>>> 
>>>>>  I'm not sure I care much about not being able to load the firmware
>>>>>  EDID in the suspend-plug-resume case (as this can be remedied with a
>>>>>  subsequent modeset), but obviously any errors need to be handled
>>>>>  gracefully, without crashing.
>>>>> 
>>>>>  BR,
>>>>>  Jani.
>>>>> 
>>>>> 
>>>>>  --
>>>>>  Jani Nikula, Intel Open Source Graphics Center
>>>> 
>>>> 
>>> 
>>> 
>> 
>> --
>> Jani Nikula, Intel Open Source Graphics Center
> 
> 



