[PATCH] drm/edid/firmware: stop using throwaway platform device
Matthieu CHARETTE
matthieu.charette at gmail.com
Sun Nov 13 19:26:59 UTC 2022
Hi,
I've tested the patch and I can confirm that it fixed the issue.
Tested on Fedora 36 with kernel 6.0.8.
Thanks,
Matthieu
On Tue, Nov 8 2022 at 04:40:52 PM +0100, Matthieu CHARETTE
<matthieu.charette at gmail.com> wrote:
> I didn't test the patch yet. I will do. But even without testing I
> can tell you that it will work (It will not crash).
> Currently when the crash occurs, all screens remain black after
> resume. I'm not able to login with ssh neither. And logs end before
> the suspend. So the crash seems to be some kind of kernel panic.
>
> Matthieu
>
> On Tue, Nov 8 2022 at 01:27:33 PM +0200, Jani Nikula
> <jani.nikula at intel.com> wrote:
>> On Sun, 06 Nov 2022, Matthieu CHARETTE <matthieu.charette at gmail.com>
>> wrote:
>>> Hi,
>>>
>>> Can you tell me what are we waiting for? Maybe I can help.
>>
>> Have you tried the patch? Is it an improvement over the status quo?
>>
>> The "crash" is still ambiguous to me. Do you observe it with the
>> patch?
>> Do you have logs? Etc.
>>
>> BR,
>> Jani.
>>
>>
>>>
>>> Thanks.
>>>
>>> Matthieu
>>>
>>> On Wed, Oct 12 2022 at 07:16:29 PM +0200, Matthieu CHARETTE
>>> <matthieu.charette at gmail.com> wrote:
>>>> By crash, I mean that an error is returned here:
>>>>
>>>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git/+/refs/heads/master/drivers/gpu/drm/drm_edid_load.c#195
>>>> I don't really know what happens next, but on my machine the
>>>> built-in
>>>> screen and the external remains dark. Also the kernel seems to
>>>> freeze. I suspect a kernel panic, but I'm not sure. Anyway, the
>>>> error
>>>> is definitely not well handled, and a fix would be great.
>>>> Also, request_firmware() will crash if called for the first time
>>>> on
>>>> the resume path because the file system isn't reachable on the
>>>> resume
>>>> process. And no cache is available for this firmware. So I guess
>>>> that
>>>> in this case, request_firmware() returns an error.
>>>> Suspend-plug-resume case is not my priority nether as long as it
>>>> doesn't make the system crash (Which is currently the case).
>>>>
>>>> On Wed, Oct 12 2022 at 11:25:59 AM +0300, Jani Nikula
>>>> <jani.nikula at intel.com> wrote:
>>>>> On Tue, 11 Oct 2022, Matthieu CHARETTE
>>>>> <matthieu.charette at gmail.com>
>>>>> wrote:
>>>>>> Currently the EDID is requested during the resume. But since
>>>>>> it's
>>>>>> requested too early, this means before the filesystem is
>>>>>> mounted,
>>>>>> the
>>>>>> firmware request fails. This make the DRM driver crash when
>>>>>> resuming.
>>>>>> This kind of issue should be prevented by the firmware caching
>>>>>> process
>>>>>> which cache every firmware requested for the next resume. But
>>>>>> since we
>>>>>> are using a temporary device, the firmware isn't cached on
>>>>>> suspend
>>>>>> since the device doesn't work anymore.
>>>>>> When using a non temporary device to get the EDID, the firmware
>>>>>> will
>>>>>> be cached on suspend for the next resume. So requesting the
>>>>>> firmware
>>>>>> during resume will succeed.
>>>>>> But if the firmware has never been requested since the boot,
>>>>>> this
>>>>>> means that the monitor isn't plugged since the boot. The kernel
>>>>>> will
>>>>>> not be caching the EDID. So if we plug the monitor while the
>>>>>> machine
>>>>>> is suspended. The resume will fail to load the firmware. And
>>>>>> the
>>>>>> DRM
>>>>>> driver will crash.
>>>>>> So basically, your fix should solve the issue except for the
>>>>>> case
>>>>>> where the monitor hasn't been plugged since boot and is plugged
>>>>>> while
>>>>>> the machine is suspended.
>>>>>> I hope I was clear. Tell me if I wasn't. I'm not really good at
>>>>>> explaining.
>>>>>
>>>>> That was a pretty good explanation. The only thing I'm missing is
>>>>> what
>>>>> the failure mode is exactly when you claim the driver will
>>>>> crash. Why
>>>>> would request_firmware() "crash" if called for the first time on
>>>>> the
>>>>> resume path?
>>>>>
>>>>> I'm not sure I care much about not being able to load the
>>>>> firmware
>>>>> EDID
>>>>> in the suspend-plug-resume case (as this can be remedied with a
>>>>> subsequent modeset), but obviously any errors need to be handled
>>>>> gracefully, without crashing.
>>>>>
>>>>> BR,
>>>>> Jani.
>>>>>
>>>>>
>>>>> --
>>>>> Jani Nikula, Intel Open Source Graphics Center
>>>>
>>>>
>>>
>>>
>>
>> --
>> Jani Nikula, Intel Open Source Graphics Center
>
>
More information about the dri-devel
mailing list