[PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)
Michel Dänzer
michel at daenzer.net
Wed Nov 2 11:26:11 UTC 2022
[ Dropping Andrey's no longer working address from Cc ]
On 2022-11-01 11:09, Michel Dänzer wrote:
> On 2022-11-01 10:58, Zhu, Jiadong wrote:
>>
>>> Patch 3 assigns preempt_ib in gfx_v9_0_sw_ring_funcs_gfx, but not in gfx_v9_0_ring_funcs_gfx. mux->real_ring in amdgpu_mcbp_trigger_preempt presumably uses the latter, which would explain why amdgpu_ring_preempt_ib ends up dereferencing a NULL pointer.
>>
>> It's weird the assignment should be in gfx_v9_0_ring_funcs_gfx instead of gfx_v9_0_sw_ring_funcs_gfx.
>>
>> [PATCH 3/5] drm/amdgpu: Modify unmap_queue format for gfx9 (v4):
>> @@ -6925,6 +7047,7 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_gfx = {
>> .emit_cntxcntl = gfx_v9_ring_emit_cntxcntl,
>> .init_cond_exec = gfx_v9_0_ring_emit_init_cond_exec,
>> .patch_cond_exec = gfx_v9_0_ring_emit_patch_cond_exec,
>> + .preempt_ib = gfx_v9_0_ring_preempt_ib,
>> .emit_frame_cntl = gfx_v9_0_ring_emit_frame_cntl,
>> .emit_wreg = gfx_v9_0_ring_emit_wreg,
>> .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15d.h b/drivers/gpu/drm/amd/amdgpu/soc15d.h
>
> Ah! Looks like stg applied patch 3 incorrectly for me. :(
>
> I'll try and test with this fixed this week, and report back.
I'm now running with patch 3 applied correctly, and with patch 5 as well.
The good news is that I'm now seeing a positive effect with GpuTest benchmarks which are GPU-limited at low frame rates. In particular, with the pixmark piano benchmark, the GNOME Wayland session now actually stays more responsive on this machine than it does on my work laptop with an Intel iGPU. However, with the plot3d benchmark (with /plot3d_vertex_density=1750 on the command line to increase GPU load), it still doesn't quite manage to keep the desktop running at full frame rate, in contrast to the Intel iGPU.
The bad news is that this series still makes some things very slow. The most extreme examples so far are glxgears (runs at ~400 fps now, ~7000 fps before, i.e. almost 20x slowdown) and hexchat (scrolling one page now takes ~1 second, I can see it drawing line by line; before it was almost instantaneous). I suspect this series makes the overhead of running a single GPU job much bigger. On the bright side, I'm not noticing any significant intermittent freezes anymore.
In summary, while the benefits are promising, the downsides are unacceptable for enabling this by default.
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer
More information about the amd-gfx
mailing list