[PATCH 0/6] Introduce a generic function to get the CSB buffer

Rodrigo Siqueira siqueira at igalia.com
Sat Apr 12 20:29:52 UTC 2025


On 04/08, Rodrigo Siqueira wrote:
> On 04/07, Alex Deucher wrote:
> > On Mon, Apr 7, 2025 at 4:15 PM Rodrigo Siqueira <siqueira at igalia.com> wrote:
> > >
> > > On 04/07, Alex Deucher wrote:
> > > > On Sun, Apr 6, 2025 at 7:07 PM Rodrigo Siqueira <siqueira at igalia.com> wrote:
> > > > >
> > > > > This patchset was inspired and made on top of the below series:
> > > > >
> > > > > https://lore.kernel.org/amd-gfx/20250319162225.3775315-1-alexander.deucher@amd.com/
> > > > >
> > > > > In the above series, there is a bug fix in many functions named
> > > > > gfx_vX_0_get_csb_buffer (where X ranges from 6 to 11). After closely
> > > > > looking at those functions, it became clear that most of the code could
> > > > > be shared from gfx6 to gfx11. Aside from the code duplication removal,
> > > > > this also improves maintainability since a bug fix in a shared function
> > > > > can be propagated to all ASICs.
> > > > >
> > > > > The first patch of this series created one dedicated file for
> > > > > encapsulating common GC registers (gc_common_offset.h); this series only
> > > > > adds registers associated with the CSB. In the future, this file can
> > > > > keep growing as we identify common registers to be shared in the
> > > > > gc_common_offset.h.
> > > > >
> > > > > The second patch introduces the generic gfx_get_csb_buffer function,
> > > > > which has the same implementation found in gfx_v10_0_get_csb_buffer and
> > > > > gfx_v11_0_get_csb_buffer (these two functions have the same code). After
> > > > > that, every patch is dedicated to absorbing one of the csb_buffer
> > > > > functions from gfx from 9 to 6; notice that some adaptations were
> > > > > required.
> > > >
> > > > I don't really like the register header changes and moving all of the
> > > > IP version specific logic into the common code.  These register
> > > > headers are used in other places as well and moving some registers
> > > > into a common header can get confusing and may lead to bugs later if
> > > > other chips change the offset of these registers.
> > >
> > > In that case, what do you think if I just abstract the first part of the
> > > function for a V2? The first part is the same for all gfx from 6 to 11.
> > > Something like this:
> > >
> > > int gfx_get_pre_setup_csb_buffer(struct amdgpu_device *adev, volatile u32 *buffer)
> > > {
> > >        u32 count = 0, i;
> > >        const struct cs_section_def *sect = NULL;
> > >        const struct cs_extent_def *ext = NULL;
> > >        int ctx_reg_offset;
> > >
> > >        if (adev->gfx.rlc.cs_data == NULL)
> > >                return 1;
> > >        if (buffer == NULL)
> > >                return 1;
> > >
> > >        buffer[count++] = cpu_to_le32(PACKET3(PACKET3_PREAMBLE_CNTL, 0));
> > >        buffer[count++] = cpu_to_le32(PACKET3_PREAMBLE_BEGIN_CLEAR_STATE);
> > >
> > >        buffer[count++] = cpu_to_le32(PACKET3(PACKET3_CONTEXT_CONTROL, 1));
> > >        buffer[count++] = cpu_to_le32(0x80000000);
> > >        buffer[count++] = cpu_to_le32(0x80000000);
> > >
> > >        for (sect = adev->gfx.rlc.cs_data; sect->section != NULL; ++sect) {
> > >                for (ext = sect->section; ext->extent != NULL; ++ext) {
> > >                        if (sect->id == SECT_CONTEXT) {
> > >                                buffer[count++] =
> > >                                        cpu_to_le32(PACKET3(PACKET3_SET_CONTEXT_REG, ext->reg_count));
> > >                                buffer[count++] = cpu_to_le32(ext->reg_index -
> > >                                                PACKET3_SET_CONTEXT_REG_START);
> > >                                for (i = 0; i < ext->reg_count; i++)
> > >                                        buffer[count++] = cpu_to_le32(ext->extent[i]);
> > >                        }
> > >                }
> > >        }
> > >
> > >        return 0;
> > > }
> > 
> > Sure helpers are fine.  maybe a preamble_start helper
> >         buffer[count++] = cpu_to_le32(PACKET3(PACKET3_PREAMBLE_CNTL, 0));
> >         buffer[count++] = cpu_to_le32(PACKET3_PREAMBLE_BEGIN_CLEAR_STATE);
> > 
> >         buffer[count++] = cpu_to_le32(PACKET3(PACKET3_CONTEXT_CONTROL, 1));
> >         buffer[count++] = cpu_to_le32(0x80000000);
> >         buffer[count++] = cpu_to_le32(0x80000000);
> > and a preamble_end helper:
> >         buffer[count++] = cpu_to_le32(PACKET3(PACKET3_PREAMBLE_CNTL, 0));
> >         buffer[count++] = cpu_to_le32(PACKET3_PREAMBLE_END_CLEAR_STATE);
> > 
> >         buffer[count++] = cpu_to_le32(PACKET3(PACKET3_CLEAR_STATE, 0));
> >         buffer[count++] = cpu_to_le32(0);
> > and a cs_data parser helper:
> >         for (sect = adev->gfx.rlc.cs_data; sect->section != NULL; ++sect) {
> >             for (ext = sect->section; ext->extent != NULL; ++ext) {
> >                         if (sect->id == SECT_CONTEXT) {
> >                 buffer[count++] =
> > 
> > cpu_to_le32(PACKET3(PACKET3_SET_CONTEXT_REG, ext->reg_count));
> >                                 buffer[count++] = cpu_to_le32(ext->reg_index -
> >                                                 PACKET3_SET_CONTEXT_REG_START);
> >                                 for (i = 0; i < ext->reg_count; i++)
> >                                         buffer[count++] =
> > cpu_to_le32(ext->extent[i]);
> >                         }
> >                 }
> >     }
> 
> Nice. I'll prepare a V2.
> 
> > 
> > Also, while you are at it, you could clean up gfx_v9_0_cp_gfx_start(),
> > etc. to use the cs buffer and cs size directly rather than re-parsing
> > everything again.

I was looking into the gfx_vX_0_cp_gfx_start() functions, but I was
slightly confused about how to approach this part. I can see from
gfx_vX_0_cp_gfx_start() that a preamble start, parser, and preamble end
are common parts, but I felt I could not re-use some of the helpers
created in the V2 of this series because gfx_vX_0_cp_gfx_start() writes
directly in the ring buffer and it has some special checks. Should I
create a dedicated helper for those parts?  Or did I misunderstand what
you suggested?

> 
> Sure, I'll make it as a separate patchset.
> 
> Thanks
> 
> > 
> > Alex
> > 
> > >
> > > Thanks
> > >
> > > >
> > > > Alex
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > Rodrigo Siqueira (6):
> > > > >   drm/amd/amdgpu: Create a headears to keep some common GC registers
> > > > >   drm/amdgpu/gfx: Introduce generic gfx_get_csb_buffer
> > > > >   drm/amdgpu/gfx: Integrate gfx_v9_0_get_csb_buffer into
> > > > >     gfx_get_csb_buffer
> > > > >   drm/amdgpu/gfx: Absorb gfx_v8_0_get_csb_buffer into gfx_get_csb_buffer
> > > > >   drm/amdgpu/gfx: Assimilate gfx_v7_0_get_csb_buffer into
> > > > >     gfx_get_csb_buffer
> > > > >   drm/amdgpu/gfx: Merge gfx_v6_0_get_csb_buffer into gfx_get_csb_buffer
> > > > >
> > > > >  Documentation/gpu/amdgpu/amdgpu-glossary.rst  |   3 +
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c       | 101 ++++++++++++++++++
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h       |   1 +
> > > > >  drivers/gpu/drm/amd/amdgpu/cik.c              |   2 +
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c        |  51 +--------
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c        |  53 +--------
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c         |  46 +-------
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c         |  70 +-----------
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c         |  51 +--------
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c         |  43 +-------
> > > > >  drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c         |   1 +
> > > > >  drivers/gpu/drm/amd/amdgpu/si.c               |   2 +
> > > > >  drivers/gpu/drm/amd/amdgpu/vi.c               |   2 +
> > > > >  .../include/asic_reg/gc/gc_10_1_0_offset.h    |   3 -
> > > > >  .../include/asic_reg/gc/gc_10_3_0_offset.h    |   3 -
> > > > >  .../include/asic_reg/gc/gc_11_0_0_offset.h    |   2 -
> > > > >  .../include/asic_reg/gc/gc_11_0_3_offset.h    |   2 -
> > > > >  .../include/asic_reg/gc/gc_11_5_0_offset.h    |   2 -
> > > > >  .../include/asic_reg/gc/gc_12_0_0_offset.h    |   2 -
> > > > >  .../amd/include/asic_reg/gc/gc_9_0_offset.h   |   3 -
> > > > >  .../amd/include/asic_reg/gc/gc_9_1_offset.h   |   3 -
> > > > >  .../amd/include/asic_reg/gc/gc_9_2_1_offset.h |   3 -
> > > > >  .../amd/include/asic_reg/gc/gc_9_4_2_offset.h |   2 -
> > > > >  .../amd/include/asic_reg/gc/gc_9_4_3_offset.h |   2 -
> > > > >  .../include/asic_reg/gc/gc_common_offset.h    |  11 ++
> > > > >  .../drm/amd/include/asic_reg/gca/gfx_6_0_d.h  |   1 -
> > > > >  .../drm/amd/include/asic_reg/gca/gfx_7_0_d.h  |   1 -
> > > > >  .../drm/amd/include/asic_reg/gca/gfx_7_2_d.h  |   1 -
> > > > >  .../drm/amd/include/asic_reg/gca/gfx_8_0_d.h  |   1 -
> > > > >  .../drm/amd/include/asic_reg/gca/gfx_8_1_d.h  |   1 -
> > > > >  30 files changed, 141 insertions(+), 328 deletions(-)
> > > > >  create mode 100644 drivers/gpu/drm/amd/include/asic_reg/gc/gc_common_offset.h
> > > > >
> > > > > --
> > > > > 2.49.0
> > > > >
> > >
> > > --
> > > Rodrigo Siqueira
> 
> -- 
> Rodrigo Siqueira

-- 
Rodrigo Siqueira


More information about the amd-gfx mailing list