[PATCH] drm/amd/display: Clear dm_state for fast updates
Kazlauskas, Nicholas
nicholas.kazlauskas at amd.com
Mon Jul 27 13:26:54 UTC 2020
On 2020-07-27 1:40 a.m., Mazin Rezk wrote:
> This patch fixes a race condition that causes a use-after-free during
> amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits
> are requested and the second one finishes before the first. Essentially,
> this bug occurs when the following sequence of events happens:
>
> 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is
> deferred to the workqueue.
>
> 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is
> deferred to the workqueue.
>
> 3. Commit #2 starts before commit #1, dm_state #1 is used in the
> commit_tail and commit #2 completes, freeing dm_state #1.
>
> 4. Commit #1 starts after commit #2 completes, uses the freed dm_state
> 1 and dereferences a freelist pointer while setting the context.
>
> Since this bug has only been spotted with fast commits, this patch fixes
> the bug by clearing the dm_state instead of using the old dc_state for
> fast updates. In addition, since dm_state is only used for its dc_state
> and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found,
> removing the dm_state should not have any consequences in fast updates.
>
> This use-after-free bug has existed for a while now, but only caused a
> noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate
> freelist pointer to middle of object") moving the freelist pointer from
> dm_state->base (which was unused) to dm_state->context (which is
> dereferenced).
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
> Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
> Reported-by: Duncan <1i5t5.duncan at cox.net>
> Signed-off-by: Mazin Rezk <mnrzk at protonmail.com>
> ---
> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++++++++++++++-----
> 1 file changed, 27 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 86ffa0c2880f..710edc70e37e 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -8717,20 +8717,38 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
> * the same resource. If we have a new DC context as part of
> * the DM atomic state from validation we need to free it and
> * retain the existing one instead.
> + *
> + * Furthermore, since the DM atomic state only contains the DC
> + * context and can safely be annulled, we can free the state
> + * and clear the associated private object now to free
> + * some memory and avoid a possible use-after-free later.
> */
> - struct dm_atomic_state *new_dm_state, *old_dm_state;
>
> - new_dm_state = dm_atomic_get_new_state(state);
> - old_dm_state = dm_atomic_get_old_state(state);
> + for (i = 0; i < state->num_private_objs; i++) {
> + struct drm_private_obj *obj = state->private_objs[i].ptr;
>
> - if (new_dm_state && old_dm_state) {
> - if (new_dm_state->context)
> - dc_release_state(new_dm_state->context);
> + if (obj->funcs == adev->dm.atomic_obj.funcs) {
> + int j = state->num_private_objs-1;
>
> - new_dm_state->context = old_dm_state->context;
> + dm_atomic_destroy_state(obj,
> + state->private_objs[i].state);
> +
> + /* If i is not at the end of the array then the
> + * last element needs to be moved to where i was
> + * before the array can safely be truncated.
> + */
> + if (i != j)
> + state->private_objs[i] =
> + state->private_objs[j];
>
> - if (old_dm_state->context)
> - dc_retain_state(old_dm_state->context);
> + state->private_objs[j].ptr = NULL;
> + state->private_objs[j].state = NULL;
> + state->private_objs[j].old_state = NULL;
> + state->private_objs[j].new_state = NULL;
> +
> + state->num_private_objs = j;
> + break;
> + }
In the bug report itself I mentioned that I don't really like hacking
around the DRM core for resolving this patch but to go into more
specifics, it's really two issues of code maintenance:
1. It's iterating over internal structures and layout of private objects
in the state and modifying the state. The core doesn't really guarantee
how these things are going to be laid out and it may change in the future.
2. It's freeing an allocation we don't own from DM. DRM doesn't track
this state elsewhere for purposes of freeing, but nothing is really
stopping the core from doing this later down the line.
The implementation itself is correct from a technical perspective, but
I'd rather it reside in DRM as a helper for code maintenance purposes.
Regards,
Nicholas Kazlauskas
> }
> }
>
> --
> 2.27.0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
More information about the amd-gfx
mailing list