Re: [PATCH] drm/amdgpu: revert "disable bulk moves for now"

Pierre-Loup A. Griffais pgriffais at valvesoftware.com
Tue Sep 17 21:56:44 UTC 2019


Hello,

Applying this locally, the issue we were seeing with very high submit 
times in high-end workloads seems largely gone. My methodology is to 
measure the total time spent in DRM_IOCTL_AMDGPU_CS with `strace -T` for 
the whole first scene of the Shadow of the Tomb Raider benchmark, and 
divide by the frame count in that scene to get an idea of how much CPU 
time is spent in submissions per frame. More details below.

On a Vega20 system with a 3900X, at High settings (~6 gigs of VRAM usage 
according to UMR, no contention):

  - 5.2.14: 1.1ms per frame in CS

  - 5.2.14 + LRU bulk moves: 0.6ms per frame in CS

On a Polaris10 system with an i7-7820X, at Very High settings (7.7G/8G 
VRAM used, no contention):

  - 5.2.15: 12.03ms per frame in CS (!)

  - 5.2.15 + LRU bulk moves:  1.35ms per frame in CS

The issue is largely addressed. 1.35ms is still higher than I'd expect, 
but it's pretty reasonable. Note that in many of our use cases, 
submission happens in a separate thread and doesn't typically impact 
overall frame time/latency if you have extra CPU cores to work with. 
However, it very negatively affects performance as soon as the CPU gets 
saturated, and burns a ton of power.

Thanks!

  - Pierre-Loup

Methodology details:

# Mesa patched to kill() itself with SIGCONT in vkQueuePresent to act as 
a frame marker in-band with the strace data.
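For reference, here is a minimal sketch of what that marker could look 
like; the helper name and the exact hook point inside Mesa's 
vkQueuePresentKHR path are assumptions on my part, the only requirement 
being that one kill() call per present shows up in the strace output:

    /* Hypothetical frame marker: sending SIGCONT to ourselves is harmless,
     * but the kill() syscall still appears in the strace log in-band with
     * the DRM_IOCTL_AMDGPU_CS calls, one marker per presented frame. */
    #include <signal.h>
    #include <unistd.h>

    static void frame_marker(void)
    {
       kill(getpid(), SIGCONT);
    }

Each kill( line in the resulting trace then delimits one frame, which is 
what the frame count below relies on.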

# strace collection:

strace -f -p 13113 -e ioctl,kill -o sottr_first_scene_vanilla -T

# frame count:

cat sottr_first_scene_vanilla | grep kill\( | wc -l
616

# total time spent in _CS:

cat sottr_first_scene_vanilla | grep AMDGPU_CS | grep -v unfinished | tr -s ' ' | cut -d ' ' -f7 | tr -d \< | tr -d \> | xargs | tr ' ' '+' | bc
7.411782

# seconds to milliseconds, then divide by frame count

(gdb) p 7.41 * 1000.0 / 616.0
$1 = 12.029220779220779

On 9/12/19 8:18 AM, Zhou, David(ChunMing) wrote:
> I don't know the DKMS status; anyway, we should submit this one as 
> early as possible.
>
> -------- Original Message --------
> Subject: Re: [PATCH] drm/amdgpu: revert "disable bulk moves for now"
> From: Christian König
> To: "Zhou, David(ChunMing)", amd-gfx at lists.freedesktop.org
> Cc:
>
> Just to double check: We have had that enabled in the DKMS package for a
> while and haven't encountered any more problems with it, correct?
>
> Thanks,
> Christian.
>
> On 12.09.19 at 16:02, Chunming Zhou wrote:
> > RB on it to go ahead.
> >
> > -David
> >
> > On 2019/9/12 18:15, Christian König wrote:
> >> This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.
> >>
> >> The changes to fix this should have landed in 5.1.
> >>
> >> Signed-off-by: Christian König <christian.koenig at amd.com>
> >> ---
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
> >>    1 file changed, 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >> index 48349e4f0701..fd3fbaa73fa3 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >> @@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
> >>       struct ttm_bo_global *glob = adev->mman.bdev.glob;
> >>       struct amdgpu_vm_bo_base *bo_base;
> >>
> >> -#if 0
> >>       if (vm->bulk_moveable) {
> >>               spin_lock(&glob->lru_lock);
> >>               ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
> >>               spin_unlock(&glob->lru_lock);
> >>               return;
> >>       }
> >> -#endif
> >>
> >>       memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
> >>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx