[PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)

Zhu, Jiadong Jiadong.Zhu at amd.com
Thu Nov 3 02:58:52 UTC 2022


[AMD Official Use Only - General]

>The bad news is that this series still makes some things very slow. The most extreme examples so far are glxgears (runs at ~400 fps now, ~7000 fps before, i.e. almost 20x slowdown) and hexchat (scrolling one page now takes ~1 second, I can see it drawing line by line; before it was almost instantaneous). I suspect this series makes the overhead of running a single GPU job much bigger. On the bright side, I'm not noticing any significant intermittent freezes anymore.

Hi Michel,

Thanks for trying this out.
Are there any high-priority jobs running while glxgears executes? When I run glxgears while submitting high-priority IBs with amdgpu_test, the frame rate stays between 6000 and 8000 fps.

Continuous preemption and resubmission may be what causes the low fps. Could you check how fast the trailing fence sequence number increases? On my side, "Last signaled trailing fence" advances by fewer than 10 per second.


cat /sys/kernel/debug/dri/0/amdgpu_fence_info
--- ring 0 (gfx) ---
Last signaled fence          0x00000001
Last emitted                 0x00000001
Last signaled trailing fence 0x0000013c
Last emitted                 0x0000013c
Last preempted               0x00000000
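
To put a number on that rate, here is a minimal sketch in Python (not part of the patch series). It assumes the debugfs output keeps the layout shown above, and that the sequence number is a 32-bit counter that may wrap. The helper names trailing_fence_seqnos and trailing_fence_rate are made up for illustration.

```python
import re

# Assumed line formats, taken from the dump above:
#   "--- ring 0 (gfx) ---"
#   "Last signaled trailing fence 0x0000013c"
RING_RE = re.compile(r"^--- ring \d+ \((?P<name>[^)]+)\) ---$")
TRAILING_RE = re.compile(r"^Last signaled trailing fence\s+(0x[0-9a-fA-F]+)$")


def trailing_fence_seqnos(text):
    """Return {ring name: last signaled trailing fence seqno} for one dump."""
    seqnos = {}
    ring = None
    for raw in text.splitlines():
        line = raw.strip()
        m = RING_RE.match(line)
        if m:
            ring = m.group("name")
            continue
        m = TRAILING_RE.match(line)
        if m and ring is not None:
            seqnos[ring] = int(m.group(1), 16)
    return seqnos


def trailing_fence_rate(before, after, seconds):
    """Trailing fence increments per second between two dumps.

    `before` and `after` are the file contents read `seconds` apart.
    The mask handles wraparound, assuming a 32-bit sequence number.
    """
    old = trailing_fence_seqnos(before)
    new = trailing_fence_seqnos(after)
    return {
        ring: ((new[ring] - old[ring]) & 0xFFFFFFFF) / seconds
        for ring in new
        if ring in old
    }
```

To use it, read /sys/kernel/debug/dri/0/amdgpu_fence_info twice (as root), for example one second apart, and pass both strings plus the interval to trailing_fence_rate. A result well above 10 for the gfx ring would point to continuous preemption and resubmission.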

Thanks,
Jiadong

-----Original Message-----
From: Michel Dänzer <michel at daenzer.net>
Sent: Wednesday, November 2, 2022 7:26 PM
To: Zhu, Jiadong <Jiadong.Zhu at amd.com>
Cc: Tuikov, Luben <Luben.Tuikov at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)


[ Dropping Andrey's no longer working address from Cc ]

On 2022-11-01 11:09, Michel Dänzer wrote:
> On 2022-11-01 10:58, Zhu, Jiadong wrote:
>>
>>> Patch 3 assigns preempt_ib in gfx_v9_0_sw_ring_funcs_gfx, but not in gfx_v9_0_ring_funcs_gfx. mux->real_ring in amdgpu_mcbp_trigger_preempt presumably uses the latter, which would explain why amdgpu_ring_preempt_ib ends up dereferencing a NULL pointer.
>>
>> That's odd; the assignment should be in gfx_v9_0_ring_funcs_gfx, not in gfx_v9_0_sw_ring_funcs_gfx.
>>
>> [PATCH 3/5] drm/amdgpu: Modify unmap_queue format for gfx9 (v4):
>> @@ -6925,6 +7047,7 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_gfx = {
>>         .emit_cntxcntl = gfx_v9_ring_emit_cntxcntl,
>>         .init_cond_exec = gfx_v9_0_ring_emit_init_cond_exec,
>>         .patch_cond_exec = gfx_v9_0_ring_emit_patch_cond_exec,
>> +       .preempt_ib = gfx_v9_0_ring_preempt_ib,
>>         .emit_frame_cntl = gfx_v9_0_ring_emit_frame_cntl,
>>         .emit_wreg = gfx_v9_0_ring_emit_wreg,
>>         .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15d.h b/drivers/gpu/drm/amd/amdgpu/soc15d.h
>
> Ah! Looks like stg applied patch 3 incorrectly for me. :(
>
> I'll try and test with this fixed this week, and report back.

I'm now running with patch 3 applied correctly, and with patch 5 as well.


The good news is that I'm now seeing a positive effect with GpuTest benchmarks which are GPU-limited at low frame rates. In particular, with the pixmark piano benchmark, the GNOME Wayland session now actually stays more responsive on this machine than it does on my work laptop with an Intel iGPU. However, with the plot3d benchmark (with /plot3d_vertex_density=1750 on the command line to increase GPU load), it still doesn't quite manage to keep the desktop running at full frame rate, in contrast to the Intel iGPU.

The bad news is that this series still makes some things very slow. The most extreme examples so far are glxgears (runs at ~400 fps now, ~7000 fps before, i.e. almost 20x slowdown) and hexchat (scrolling one page now takes ~1 second, I can see it drawing line by line; before it was almost instantaneous). I suspect this series makes the overhead of running a single GPU job much bigger. On the bright side, I'm not noticing any significant intermittent freezes anymore.


In summary, while the benefits are promising, the downsides are unacceptable for enabling this by default.


--
Earthling Michel Dänzer            |                  https://redhat.com/
Libre software enthusiast          |         Mesa and Xwayland developer


