[PATCH] drm/amdgpu: Fix S3 resume failre.

Andrey Grodzovsky Andrey.Grodzovsky at amd.com
Thu Jul 19 17:07:25 UTC 2018



On 07/19/2018 12:59 PM, Michel Dänzer wrote:
> On 2018-07-19 06:53 PM, Andrey Grodzovsky wrote:
>>
>> On 07/19/2018 12:47 PM, Michel Dänzer wrote:
>>> On 2018-07-19 06:33 PM, Andrey Grodzovsky wrote:
>>>> On 07/19/2018 11:39 AM, Ville Syrjälä wrote:
>>>>> On Thu, Jul 19, 2018 at 11:19:56AM -0400, Andrey Grodzovsky wrote:
>>>>>> Problem:
>>>>>> FB is still not unpinned during the first run of amdgpu_bo_evict_vram
>>>>>> and so it's left for the second run, but during second run the SDMA
>>>>>> for
>>>>>> moving buffer around already disabled and you have to do
>>>>>> it with CPU, but FB is not in visible VRAM and hence the eviction
>>>>>> failure
>>>>>> leading later to resume failure.
>>>>>>
>>>>>> Fix:
>>>>>> When DAL in use get a pointer to FB from crtc->primary->state rather
>>>>>> then from crtc->primary which is not set for DAL since it supports
>>>>>> atomic KMS.
>>>>>>
>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
>>>>>> Fixes e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic
>>>>>> drivers
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index 709e4a3..dd9ebf7 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -2642,7 +2642,8 @@ int amdgpu_device_suspend(struct drm_device
>>>>>> *dev, bool suspend, bool fbcon)
>>>>>>         /* unpin the front buffers and cursors */
>>>>>>         list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>>>>>             struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
>>>>>> -        struct drm_framebuffer *fb = crtc->primary->fb;
>>>>>> +         struct drm_framebuffer *fb =
>>>>>> amdgpu_device_has_dc_support(adev) ?
>>>>>> +                 crtc->primary->state->fb : crtc->primary->fb;
>>>>> So apparently you haven't yet turned off the planes here. If I'm
>>>>> reading things right amdgpu_device_ip_suspend() should end up doing
>>>>> that through drm_atomic_helper_suspend(). So it looks like like now
>>>>> you'll end up unpinning the same bos twice. Doesn't that mess up
>>>>> some kind of refcount or something?
>>>> amdgpu_bo_unpin has a guard against that, amdgpu_bo_unreserve is less
>>>> clear.
>>> BO reservation shouldn't an issue here, BOs are only reserved for a
>>> short time around (un)pinning them.
>>>
>>>
>>>>> To me it would seem better to susped the display before trying
>>>>> to evict the bos.
>>>> Yea, i was aware of that and indeed DAL shouldn't rely on the code in
>>>> amdgpu_device_suspend to unpin
>>>> front buffer and cursor since the atomic helper should do it. Problem is
>>>> that during amdgpu_device_ip_suspend
>>>> the SDMA engine gets suspended too, so you have to embed another
>>>> eviction in between, after display is suspended but before
>>>> SDMA and this forces ordering between them which kind of already in
>>>> place (amd_ip_block_type) but still it's an extra constrain.
>>> Ville's point (which I basically agree with) is that the display
>>> hardware should be turned off before evicting VRAM the first time, in
>>> which case no second eviction should be necessary (for this purpose).
>> Display HW is turned off as part of all IPs in a loop inside
>> amdgpu_device_ip_suspend.
>> Are you suggesting to extract the  display HW turn off from inside
>> amdgpu_device_ip_suspend and place it
>> before the first call to amdgpu_bo_evict_vram ?
> In a nutshell, yes.
>
> Or maybe it would be easier to move the amdgpu_bo_evict_vram call down
> to somewhere called from amdgpu_device_ip_suspend?

I can move the BEFORE and AFTER calls to amdgpu_bo_evict_vram inside 
amdgpu_device_ip_suspend
such that the first one is called AFTER display is shut off, while the 
second in the very end of the function.
I am just not sure what's gonna be the side effect of evicting after 
bunch of blocks (such as GMC) are already disabled.

Andrey

>
>



More information about the amd-gfx mailing list