[PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

Zhao, Yong Yong.Zhao at amd.com
Mon Feb 4 17:28:08 UTC 2019


The decision we need to make here is which value we should program mmCP_MEC_DOORBELL_RANGE_UPPER. It is not very consistent right now throughout our code. If we let CP to use all the doorbells left over by SDMA, IH and VCN, then we should probably use 1024 for mmCP_MEC_DOORBELL_RANGE_UPPER. If we continue to use adev->doorbell_index.userqueue_end * 2, as is currently in the code, we should limit the CP doorbells in the range of 0 to adev->doorbell_index.userqueue_end * 2 when allocating CP user queue doorbells.

It seems to me that either way would work.

Regards,
Yong
________________________________
From: Zeng, Oak
Sent: Monday, February 4, 2019 12:15 PM
To: Kuehling, Felix; Zhao, Yong; amd-gfx at lists.freedesktop.org
Subject: RE: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

Correct. MEC_DOORBELL_RANGE should cover both user and kernel CP (KIQ/HIQ/DIQ/amdgpu kernel MEC rings)queues. USERQUEUE_START/END should be renamed to USER_CP_QUEUE_START/END.

Thanks,
Oak

-----Original Message-----
From: Kuehling, Felix
Sent: Monday, February 4, 2019 11:35 AM
To: Zeng, Oak <Oak.Zeng at amd.com>; Zhao, Yong <Yong.Zhao at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

I don't see anything about userqueue start/end in Yong's patch. That said, I think you're mixing up two different things. MEC_DOORBELL_RANGE is not the same as userqueues? We have MEC kernel queues as well that would have to fall in the same range. And we have SDMA user queues that don't belong in the MEC doorbell range. Therefore I think the name USERQUEUE_START/END is misleading.

Regards,
   Felix

On 2019-02-02 10:01 a.m., Zeng, Oak wrote:
>
> Hi Yong,
>
> mmCP_MEC_DOORBELL_RANGE_LOWER/UPPER are used at CP to double check a
> doorbell ringing routed to CP. It is a must to program. We have to
> keep userqueue_start/end. Sorry for the confusion.
>
> Regards,
>
> Oak
>
> *From:* amd-gfx <amd-gfx-bounces at lists.freedesktop.org> *On Behalf Of
> *Zhao, Yong
> *Sent:* Friday, February 1, 2019 7:04 PM
> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>;
> amd-gfx at lists.freedesktop.org; Zeng, Oak <Oak.Zeng at amd.com>
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user
> queue doorbells mask on SOC15
>
> Oak,
>
> It is a good idea to remove the userqueue_start/end. However, when
> doing so, I came to the following code in gfx_v9_0_kiq_init_register() :
>
>         /* enable the doorbell if requested */
>
>         if (ring->use_doorbell) {
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_LOWER,
>
>                     (adev->doorbell_index.kiq * 2) << 2);
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_UPPER,
>
> (adev->doorbell_index.userqueue_end * 2) << 2);
>
>         }
>
> I remember that you reached the conclusion in one email thread that
> mmCP_MEC_DOORBELL_RANGE_UPPER/LOWER did not have effect at all on
> gfx9. Have we got any firm conclusions from HW engineer in the end? If
> those two registers are still valid, we should enforce the range for
> CP queues.
>
> Regards,
>
> Yong
>
> ----------------------------------------------------------------------
> --
>
> *From:*Kuehling, Felix
> *Sent:* Friday, February 1, 2019 3:03 PM
> *To:* Zhao, Yong; amd-gfx at lists.freedesktop.org
> <mailto:amd-gfx at lists.freedesktop.org>
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user
> queue doorbells mask on SOC15
>
> On 2019-01-31 5:27 p.m., Zhao, Yong wrote:
> > Reserved doorbells for SDMA IH and VCN were not properly masked out
> > when allocating doorbells for CP user queues. This patch fixed that.
> >
> > Change-Id: I670adfc3fd7725d2ed0bd9665cb7f69f8b9023c2
> > Signed-off-by: Yong Zhao <Yong.Zhao at amd.com
> > <mailto:Yong.Zhao at amd.com>>
>
> One mostly cosmetic comment inline. With that fixed, the series is
> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com
> <mailto:Felix.Kuehling at amd.com>>
>
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 17 ++++++++++---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h  |  4 +++
> >   drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  9 +++++++
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c      | 25
> >+++++++++++++------
> >   .../gpu/drm/amd/include/kgd_kfd_interface.h   | 19 +++++---------
> >   7 files changed, 56 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index e957e42c539a..13710f34191a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -196,11 +196,20 @@ void amdgpu_amdkfd_device_init(struct
> amdgpu_device *adev)
> > gpu_resources.sdma_doorbell[1][i+1] =
> > adev->doorbell_index.sdma_engine[1] + 0x200 + (i >> 1);
> >                }
> > -             /* Doorbells 0x0e0-0ff and 0x2e0-2ff are reserved for
> > -              * SDMA, IH and VCN. So don't use them for the CP.
> > +
> > +             /* Because of the setting in registers like
> > +              * SDMA0_DOORBELL_RANGE etc., BIF statically uses the
> > +              * lower 12 bits of doorbell address for routing, in
> > +              * order to route the CP queue doorbells to CP engine,
> > +              * the doorbells allocated to CP queues have to be
> > +              * outside the range set for SDMA, VCN, and IH blocks
> > +              * Prior to SOC15, all queues use queue ID to
> > +              * determine doorbells.
> >                 */
> > -             gpu_resources.reserved_doorbell_mask = 0x1e0;
> > -             gpu_resources.reserved_doorbell_val  = 0x0e0;
> > +             gpu_resources.reserved_doorbells_start =
> > + adev->doorbell_index.sdma_engine[0];
> > +             gpu_resources.reserved_doorbells_end =
> > + adev->doorbell_index.last_non_cp;
> >
> >                kgd2kfd_device_init(adev->kfd.dev, &gpu_resources);
> >        }
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
> > index 1cfec06f81d4..4de431f7f380 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
> > @@ -71,6 +71,7 @@ struct amdgpu_doorbell_index {
> >                        uint32_t vce_ring6_7;
> >                } uvd_vce;
> >        };
> > +     uint32_t last_non_cp;
> >        uint32_t max_assignment;
> >        /* Per engine SDMA doorbell size in dword */
> >        uint32_t sdma_doorbell_range;  @@ -143,6 +144,7 @@ typedef
> >enum _AMDGPU_VEGA20_DOORBELL_ASSIGNMENT
> > AMDGPU_VEGA20_DOORBELL64_VCE_RING2_3             = 0x18D,
> > AMDGPU_VEGA20_DOORBELL64_VCE_RING4_5             = 0x18E,
> > AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7             = 0x18F,
> > + AMDGPU_VEGA20_DOORBELL64_LAST_NON_CP             =
> AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7,
> > AMDGPU_VEGA20_DOORBELL_MAX_ASSIGNMENT            = 0x18F,
> >AMDGPU_VEGA20_DOORBELL_INVALID                   = 0xFFFF
> >   } AMDGPU_VEGA20_DOORBELL_ASSIGNMENT;
> > @@ -222,6 +224,8 @@ typedef enum _AMDGPU_DOORBELL64_ASSIGNMENT
> >        AMDGPU_DOORBELL64_VCE_RING4_5             = 0xFE,
> >        AMDGPU_DOORBELL64_VCE_RING6_7             = 0xFF,
> >
> > +     AMDGPU_DOORBELL64_LAST_NON_CP             =
> AMDGPU_DOORBELL64_VCE_RING6_7,
> > +
> >        AMDGPU_DOORBELL64_MAX_ASSIGNMENT          = 0xFF,
> >        AMDGPU_DOORBELL64_INVALID                 = 0xFFFF
> >   } AMDGPU_DOORBELL64_ASSIGNMENT;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
> b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
> > index 4b5d60ea3e78..fa0433199215 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
> > @@ -81,6 +81,9 @@ void vega10_doorbell_index_init(struct
> amdgpu_device *adev)
> >        adev->doorbell_index.uvd_vce.vce_ring2_3 =
> AMDGPU_DOORBELL64_VCE_RING2_3;
> >        adev->doorbell_index.uvd_vce.vce_ring4_5 =
> AMDGPU_DOORBELL64_VCE_RING4_5;
> >        adev->doorbell_index.uvd_vce.vce_ring6_7 =
> AMDGPU_DOORBELL64_VCE_RING6_7;
> > +
> > +     adev->doorbell_index.last_non_cp =
> > +AMDGPU_DOORBELL64_LAST_NON_CP;
> > +
> >        /* In unit of dword doorbell */
> >        adev->doorbell_index.max_assignment =
> AMDGPU_DOORBELL64_MAX_ASSIGNMENT << 1;
> >        adev->doorbell_index.sdma_doorbell_range = 4;  diff --git
> >a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
> b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
> > index 53716c593b2b..b1052caaff5e 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
> > @@ -85,6 +85,9 @@ void vega20_doorbell_index_init(struct
> amdgpu_device *adev)
> >        adev->doorbell_index.uvd_vce.vce_ring2_3 =
> AMDGPU_VEGA20_DOORBELL64_VCE_RING2_3;
> >        adev->doorbell_index.uvd_vce.vce_ring4_5 =
> AMDGPU_VEGA20_DOORBELL64_VCE_RING4_5;
> >        adev->doorbell_index.uvd_vce.vce_ring6_7 =
> AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7;
> > +
> > +     adev->doorbell_index.last_non_cp =
> AMDGPU_VEGA20_DOORBELL64_LAST_NON_CP;
> > +
> >        adev->doorbell_index.max_assignment =
> AMDGPU_VEGA20_DOORBELL_MAX_ASSIGNMENT << 1;
> >        adev->doorbell_index.sdma_doorbell_range = 20;
> >   }
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > index e5ebcca7f031..6b8459f852cc 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> > @@ -103,6 +103,15 @@
> >
> >   #define KFD_KERNEL_QUEUE_SIZE 2048
> >
> > +/* 512 = 0x200
> > + * On SOC15, the doorbell index distance for SDMA RLC i and (i + 1)
> in the
> > + * same SDMA engine, where i is a even number.
> > + * For 8-bytes doorbells, it ensures that the mirror doorbell range
> (in terms
> > + * of low 12 bit address for each HW engine) on the second doorbell
> page is
> > + * the same as the range of the first doorbell page.*/ #define
> > +KFD_QUEUE_DOORBELL_MIRROR_OFFSET 512
> > +
> > +
> >   /*
> >    * Kernel module parameter to specify maximum number of supported
> queues per
> >    * device
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > index 80b36e860a0a..c7b67c65d62d 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > @@ -607,13 +607,24 @@ static int init_doorbell_bitmap(struct
> qcm_process_device *qpd,
> >        if (!qpd->doorbell_bitmap)
> >                return -ENOMEM;
> >
> > -     /* Mask out any reserved doorbells */
> > -     for (i = 0; i < KFD_MAX_NUM_OF_QUEUES_PER_PROCESS; i++)
> > -             if ((dev->shared_resources.reserved_doorbell_mask & i)
> > ==
> > - dev->shared_resources.reserved_doorbell_val) {
> > -                     set_bit(i, qpd->doorbell_bitmap);
> > -                     pr_debug("reserved doorbell 0x%03x\n", i);
> > -             }
> > +     /* Mask out all reserved doorbells for SDMA, IH, and VCN on SOC15.
> > +      * Because of the setting in registers like
> SDMA0_DOORBELL_RANGE etc.,
> > +      * BIF statically uses the lower 12 bits of doorbell address
> > +for
> > +      * routing. In order to route the CP queue doorbells to CP
> > +engine,
> > +      * the doorbells allocated to CP queues have to be outside the
> range
> > +      * set for SDMA, VCN, and IH blocks.
> > +      * Prior to SOC15, all queues use queue ID to
> > +      * determine doorbells. */
> > +     i = dev->shared_resources.reserved_doorbells_start;
> > +     while (i <= dev->shared_resources.reserved_doorbells_end) {
>
> This should be a for-loop to be more obvious:
>
>          for (i = dev->shared_resources.reserved_doorbells_start;
>               i <= dev->shared_resources.reserved_doorbells_end;
>               i++) {
>                  ...
>          }
>
>
> > +             set_bit(i, qpd->doorbell_bitmap);
> > +             set_bit(i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET,
> > + qpd->doorbell_bitmap);
> > +             pr_debug("reserved doorbell 0x%03x and 0x%03x\n", i,
> > +                     i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET);
> > +
> > +             i++;
> > +     }
> >
> >        return 0;
> >   }
> > diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> > index 83d960110d23..b1bf45419d93 100644
> > --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> > +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> > @@ -137,20 +137,13 @@ struct kgd2kfd_shared_resources {
> >        /* Bit n == 1 means Queue n is available for KFD */
> >        DECLARE_BITMAP(queue_bitmap, KGD_MAX_QUEUES);
> >
> > -     /* Doorbell assignments (SOC15 and later chips only). Only
> > -      * specific doorbells are routed to each SDMA engine. Others
> > -      * are routed to IH and VCN. They are not usable by the CP.
> > -      *
> > -      * Any doorbell number D that satisfies the following
> > condition
> > -      * is reserved: (D & reserved_doorbell_mask) ==
> reserved_doorbell_val
> > -      *
> > -      * KFD currently uses 1024 (= 0x3ff) doorbells per process. If
> > -      * doorbells 0x0e0-0x0ff and 0x2e0-0x2ff are reserved, that
> >means
> > -      * mask would be set to 0x1e0 and val set to 0x0e0.
> > -      */
> >        unsigned int sdma_doorbell[2][8];
> > -     unsigned int reserved_doorbell_mask;
> > -     unsigned int reserved_doorbell_val;
> > +
> > +     /* From SOC15 onwards, the doorbell indexes reserved for SDMA,
> > +IH,
> > +      * and VCN
> > +      */
> > +     unsigned int reserved_doorbells_start;
> > +     unsigned int reserved_doorbells_end;
> >
> >        /* Base address of doorbell aperture. */
> >        phys_addr_t doorbell_physical_address;
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20190204/6c16d892/attachment-0001.html>


More information about the amd-gfx mailing list