[PATCH] drm/amdgpu: fix random data corruption for sdma 7
Alex Deucher
alexdeucher at gmail.com
Wed Oct 16 13:51:51 UTC 2024
On Wed, Oct 16, 2024 at 2:19 AM Min, Frank <Frank.Min at amd.com> wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Lijo,
> Thanks a lot for your review and suggestions.
>
> Here are the updated patches.
Patches look good, but I agree with Lijo that we should switch to a
dedicated sdma7 header at some point in case there are other
discrepancies, but I think that can be a later update. Some
additional comments below.
>
> Best Regards,
> Frank
>
> From: Frank Min <Frank.Min at amd.com>
> Date: Thu, 10 Oct 2024 16:41:32 +0800
> Subject: [PATCH 1/2] drm/amdgpu: fix random data corruption for sdma 7
>
> There is random data corruption caused by const fill, this is caused by
> write compression mode not correctly configured.
>
> So correct compression mode for const fill.
>
> Signed-off-by: Frank Min <Frank.Min at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h | 6 ++++++
> drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 3 ++-
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h
> index d8cf830916b9..9f43c14f0bb0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h
> @@ -4346,6 +4346,12 @@
> #define SDMA_PKT_CONSTANT_FILL_HEADER_sw_shift 16
> #define SDMA_PKT_CONSTANT_FILL_HEADER_SW(x) (((x) & SDMA_PKT_CONSTANT_FILL_HEADER_sw_mask) << SDMA_PKT_CONSTANT_FILL_HEADER_sw_shift)
>
> +/*define for compression field for sdma7*/
> +#define SDMA_PKT_CONSTANT_FILL_HEADER_compress_offset 0
> +#define SDMA_PKT_CONSTANT_FILL_HEADER_compress_mask 0x00000001
> +#define SDMA_PKT_CONSTANT_FILL_HEADER_compress_shift 16
> +#define SDMA_PKT_CONSTANT_FILL_HEADER_COMPRESS(x) (((x) & SDMA_PKT_CONSTANT_FILL_HEADER_compress_mask) << SDMA_PKT_CONSTANT_FILL_HEADER_compress_shift)
Might be better to put this in sdma_v7_0.c since it's not part of
sdma6. With that fixed:
Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> +
> /*define for cache_policy field*/
> #define SDMA_PKT_CONSTANT_FILL_HEADER_cache_policy_offset 0
> #define SDMA_PKT_CONSTANT_FILL_HEADER_cache_policy_mask 0x00000007
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
> index a8763496aed3..573fb6a56f8b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
> @@ -1724,7 +1724,8 @@ static void sdma_v7_0_emit_fill_buffer(struct amdgpu_ib *ib,
> uint64_t dst_offset,
> uint32_t byte_count)
> {
> - ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL);
> + ib->ptr[ib->length_dw++] = SDMA_PKT_CONSTANT_FILL_HEADER_OP(SDMA_OP_CONST_FILL) |
> + SDMA_PKT_CONSTANT_FILL_HEADER_COMPRESS(1);
> ib->ptr[ib->length_dw++] = lower_32_bits(dst_offset);
> ib->ptr[ib->length_dw++] = upper_32_bits(dst_offset);
> ib->ptr[ib->length_dw++] = src_data;
> --
> 2.43.0
>
> From: Frank Min <Frank.Min at amd.com>
> Date: Wed, 16 Oct 2024 14:06:01 +0800
> Subject: [PATCH 2/2] drm/amdgpu: fix typo for sdma6 constant fill packet
>
> Fix typo for sdma6 constant fill packet
>
> Signed-off-by: Frank Min <Frank.Min at amd.com>
Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> index 208a1fa9d4e7..cb07327d294f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> @@ -1726,7 +1726,7 @@ static void sdma_v6_0_emit_fill_buffer(struct amdgpu_ib *ib,
> uint64_t dst_offset,
> uint32_t byte_count)
> {
> - ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL);
> + ib->ptr[ib->length_dw++] = SDMA_PKT_CONSTANT_FILL_HEADER_OP(SDMA_OP_CONST_FILL);
> ib->ptr[ib->length_dw++] = lower_32_bits(dst_offset);
> ib->ptr[ib->length_dw++] = upper_32_bits(dst_offset);
> ib->ptr[ib->length_dw++] = src_data;
> --
> 2.43.0
>
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Wednesday, October 16, 2024 12:41 PM
> To: Min, Frank <Frank.Min at amd.com>; amd-gfx at lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; Gao, Likun <Likun.Gao at amd.com>; Olsak, Marek <Marek.Olsak at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: fix random data corruption for sdma 7
>
>
>
> On 10/16/2024 8:29 AM, Min, Frank wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> > From: Frank Min <Frank.Min at amd.com>
> >
> > There is random data corruption caused by const fill, this is caused by write compression mode not correclt configured.
> >
> > So correct compression mode for const fill.
> >
> > Signed-off-by: Frank Min <Frank.Min at amd.com>
> > ---
> > drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h | 5 ++++-
> > drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 3 ++-
> > 2 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h
> > index d8cf830916b9..18e73057e6ba 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h
> > @@ -116,11 +116,14 @@
> > #define SDMA_PKT_COPY_LINEAR_HEADER_sub_op_shift 8 #define
> > SDMA_PKT_COPY_LINEAR_HEADER_SUB_OP(x) (((x) &
> > SDMA_PKT_COPY_LINEAR_HEADER_sub_op_mask) <<
> > SDMA_PKT_COPY_LINEAR_HEADER_sub_op_shift)
> >
> > -/*define for encrypt field*/
> > +/*define for encrypt/compress field*/
> > #define SDMA_PKT_COPY_LINEAR_HEADER_encrypt_offset 0
> > #define SDMA_PKT_COPY_LINEAR_HEADER_encrypt_mask 0x00000001
> > #define SDMA_PKT_COPY_LINEAR_HEADER_encrypt_shift 16
> > +/*sdma6 use bit 16 for data encryption*/
> > #define SDMA_PKT_COPY_LINEAR_HEADER_ENCRYPT(x) (((x) &
> > SDMA_PKT_COPY_LINEAR_HEADER_encrypt_mask) <<
> > SDMA_PKT_COPY_LINEAR_HEADER_encrypt_shift)
> > +/*sdma7 use bit 16 for dcc compression*/ #define
> > +SDMA_PKT_COPY_LINEAR_HEADER_COMPRESS(x) (((x) &
> > +SDMA_PKT_COPY_LINEAR_HEADER_encrypt_mask) <<
> > +SDMA_PKT_COPY_LINEAR_HEADER_encrypt_shift)
> >
>
> This doesn't look like the right way. The confusion is there because it uses SDMA_PKT_COPY_LINEAR_HEADER_OP instead of SDMA_PKT_CONSTANT_FILL_HEADER_OP.
>
> It's better to
>
> 1) Bring in sdma 7 packet header definition (Mixing with SDMA 6 is also not good)
>
> 2) Use SDMA_PKT_CONSTANT_FILL_HEADER_OP instead of SDMA_PKT_COPY_LINEAR_HEADER_OP
>
> 3) Use the correct bit mask/shift as defined in sdma 7 packet header definition for const_fill.
>
> Thanks,
> Lijo
>
> > /*define for tmz field*/
> > #define SDMA_PKT_COPY_LINEAR_HEADER_tmz_offset 0 diff --git
> > a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
> > index a8763496aed3..9181579ae6a1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
> > @@ -1724,7 +1724,8 @@ static void sdma_v7_0_emit_fill_buffer(struct amdgpu_ib *ib,
> > uint64_t dst_offset,
> > uint32_t byte_count) {
> > - ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL);
> > + ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL) |
> > + SDMA_PKT_COPY_LINEAR_HEADER_COMPRESS(1);
> > ib->ptr[ib->length_dw++] = lower_32_bits(dst_offset);
> > ib->ptr[ib->length_dw++] = upper_32_bits(dst_offset);
> > ib->ptr[ib->length_dw++] = src_data;
> > --
> > 2.43.0
> >
More information about the amd-gfx
mailing list