[PATCH v2 1/2] drm/amdgpu: Move HDP remapping earlier during init

Alex Deucher alexdeucher at gmail.com
Thu Sep 1 19:39:07 UTC 2022


How about this?  We should just move HDP remap to gmc hw_init since
that is mainly what uses it anyway.

Alex

On Tue, Aug 30, 2022 at 2:05 PM Alex Deucher <alexdeucher at gmail.com> wrote:
>
> On Tue, Aug 30, 2022 at 12:06 PM Lazar, Lijo <lijo.lazar at amd.com> wrote:
> >
> >
> >
> > On 8/30/2022 8:39 PM, Alex Deucher wrote:
> > > On Tue, Aug 30, 2022 at 10:45 AM Lazar, Lijo <lijo.lazar at amd.com> wrote:
> > >>
> > >>
> > >>
> > >> On 8/30/2022 7:18 PM, Alex Deucher wrote:
> > >>> On Tue, Aug 30, 2022 at 12:05 AM Lazar, Lijo <lijo.lazar at amd.com> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 8/29/2022 10:20 PM, Alex Deucher wrote:
> > >>>>> On Mon, Aug 29, 2022 at 4:18 AM Lijo Lazar <lijo.lazar at amd.com> wrote:
> > >>>>>>
> > >>>>>> HDP flush is used early in the init sequence as part of memory controller
> > >>>>>> block initialization. Hence remapping of HDP registers needed for flush
> > >>>>>> needs to happen earlier.
> > >>>>>>
> > >>>>>> This also fixes the Unsupported Request error reported through AER during
> > >>>>>> driver load. The error happens as a write happens to the remap offset
> > >>>>>> before real remapping is done.
> > >>>>>>
> > >>>>>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D216373&data=05%7C01%7Clijo.lazar%40amd.com%7Cbe8781fe1b0c41d3bad408da8a99b3d8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637974690005710507%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=98WWFEcwi2tzyf%2BxnYC%2FK3UcCE5mfXI00qfYGUpt2Sk%3D&reserved=0
> > >>>>>>
> > >>>>>> The error was unnoticed before and got visible because of the commit
> > >>>>>> referenced below. This doesn't fix anything in the commit below, rather
> > >>>>>> fixes the issue in amdgpu exposed by the commit. The reference is only
> > >>>>>> to associate this commit with below one so that both go together.
> > >>>>>>
> > >>>>>> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> > >>>>>>
> > >>>>>> Reported-by: Tom Seewald <tseewald at gmail.com>
> > >>>>>> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
> > >>>>>> Cc: stable at vger.kernel.org
> > >>>>>
> > >>>>> How about something like the attached patch rather than these two
> > >>>>> patches?  It's a bit bigger but seems cleaner and more defensive in my
> > >>>>> opinion.
> > >>>>>
> > >>>>
> > >>>> Whenever device goes to suspend/reset and then comes back, remap offset
> > >>>> has to be set back to 0 to make sure it doesn't use the wrong offset
> > >>>> when the register assumes default values again.
> > >>>>
> > >>>> To avoid the if-check in hdp_flush (which is more frequent), another way
> > >>>> is to initialize the remap offset to default offset during early init
> > >>>> and hw fini/suspend sequences. It won't be obvious (even with this
> > >>>> patch) as to when remap offset vs default offset is used though.
> > >>>
> > >>> On resume, the common IP is resumed first so it will always be set.
> > >>> The only case that is a problem is init because we init GMC out of
> > >>> order.  We could init common before GMC in amdgpu_device_ip_init().  I
> > >>> think that should be fine, but I wasn't sure if there might be some
> > >>> fallout from that on certain cards.
> > >>>
> > >>
> > >> There are other places where an IP order is forced like in
> > >> amdgpu_device_ip_reinit_early_sriov(). This also may not affect this
> > >> case as remapping is not done for VF.
> > >>
> > >> Agree that a better way is to have the common IP to be inited first.
> > >
> > > How about these patches?
> > >
> >
> > Looks good to me. BTW, is nbio 7.7 for an APU (in which case hdp flush
> > is not expected to be used)?
>
> It would be used in some cases, e.g., GPU VM passthrough where we use
> the BAR rather than the carve out.
>
> Alex
>
>
> >
> > Thanks,
> > Lijo
> >
> > > Alex
> > >
> > >
> > >>
> > >> Thanks,
> > >> Lijo
> > >>
> > >>> Alex
> > >>>
> > >>>>
> > >>>> Thanks,
> > >>>> Lijo
> > >>>>
> > >>>>> Alex
> > >>>>>
> > >>>>>> ---
> > >>>>>> v2:
> > >>>>>>            Take care of IP resume cases (Alex Deucher)
> > >>>>>>            Add NULL check to nbio.funcs to cover older (GFXv8) ASICs (Felix Kuehling)
> > >>>>>>            Add more details in commit message and associate with AER patch (Bjorn
> > >>>>>> Helgaas)
> > >>>>>>
> > >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++++++++++++++++
> > >>>>>>     drivers/gpu/drm/amd/amdgpu/nv.c            |  6 ------
> > >>>>>>     drivers/gpu/drm/amd/amdgpu/soc15.c         |  6 ------
> > >>>>>>     drivers/gpu/drm/amd/amdgpu/soc21.c         |  6 ------
> > >>>>>>     4 files changed, 24 insertions(+), 18 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >>>>>> index ce7d117efdb5..e420118769a5 100644
> > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >>>>>> @@ -2334,6 +2334,26 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
> > >>>>>>            return 0;
> > >>>>>>     }
> > >>>>>>
> > >>>>>> +/**
> > >>>>>> + * amdgpu_device_prepare_ip - prepare IPs for hardware initialization
> > >>>>>> + *
> > >>>>>> + * @adev: amdgpu_device pointer
> > >>>>>> + *
> > >>>>>> + * Any common hardware initialization sequence that needs to be done before
> > >>>>>> + * hw init of individual IPs is performed here. This is different from the
> > >>>>>> + * 'common block' which initializes a set of IPs.
> > >>>>>> + */
> > >>>>>> +static void amdgpu_device_prepare_ip(struct amdgpu_device *adev)
> > >>>>>> +{
> > >>>>>> +       /* Remap HDP registers to a hole in mmio space, for the purpose
> > >>>>>> +        * of exposing those registers to process space. This needs to be
> > >>>>>> +        * done before hw init of ip blocks to take care of HDP flush
> > >>>>>> +        * operations through registers during hw_init.
> > >>>>>> +        */
> > >>>>>> +       if (adev->nbio.funcs && adev->nbio.funcs->remap_hdp_registers &&
> > >>>>>> +           !amdgpu_sriov_vf(adev))
> > >>>>>> +               adev->nbio.funcs->remap_hdp_registers(adev);
> > >>>>>> +}
> > >>>>>>
> > >>>>>>     /**
> > >>>>>>      * amdgpu_device_ip_init - run init for hardware IPs
> > >>>>>> @@ -2376,6 +2396,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
> > >>>>>>                                    DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r);
> > >>>>>>                                    goto init_failed;
> > >>>>>>                            }
> > >>>>>> +
> > >>>>>> +                       amdgpu_device_prepare_ip(adev);
> > >>>>>>                            r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
> > >>>>>>                            if (r) {
> > >>>>>>                                    DRM_ERROR("hw_init %d failed %d\n", i, r);
> > >>>>>> @@ -3058,6 +3080,7 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
> > >>>>>>                    AMD_IP_BLOCK_TYPE_IH,
> > >>>>>>            };
> > >>>>>>
> > >>>>>> +       amdgpu_device_prepare_ip(adev);
> > >>>>>>            for (i = 0; i < adev->num_ip_blocks; i++) {
> > >>>>>>                    int j;
> > >>>>>>                    struct amdgpu_ip_block *block;
> > >>>>>> @@ -3139,6 +3162,7 @@ static int amdgpu_device_ip_resume_phase1(struct amdgpu_device *adev)
> > >>>>>>     {
> > >>>>>>            int i, r;
> > >>>>>>
> > >>>>>> +       amdgpu_device_prepare_ip(adev);
> > >>>>>>            for (i = 0; i < adev->num_ip_blocks; i++) {
> > >>>>>>                    if (!adev->ip_blocks[i].status.valid || adev->ip_blocks[i].status.hw)
> > >>>>>>                            continue;
> > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
> > >>>>>> index b3fba8dea63c..3ac7fef74277 100644
> > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> > >>>>>> @@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
> > >>>>>>            nv_program_aspm(adev);
> > >>>>>>            /* setup nbio registers */
> > >>>>>>            adev->nbio.funcs->init_registers(adev);
> > >>>>>> -       /* remap HDP registers to a hole in mmio space,
> > >>>>>> -        * for the purpose of expose those registers
> > >>>>>> -        * to process space
> > >>>>>> -        */
> > >>>>>> -       if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> > >>>>>> -               adev->nbio.funcs->remap_hdp_registers(adev);
> > >>>>>>            /* enable the doorbell aperture */
> > >>>>>>            nv_enable_doorbell_aperture(adev, true);
> > >>>>>>
> > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > >>>>>> index fde6154f2009..a0481e37d7cf 100644
> > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > >>>>>> @@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
> > >>>>>>            soc15_program_aspm(adev);
> > >>>>>>            /* setup nbio registers */
> > >>>>>>            adev->nbio.funcs->init_registers(adev);
> > >>>>>> -       /* remap HDP registers to a hole in mmio space,
> > >>>>>> -        * for the purpose of expose those registers
> > >>>>>> -        * to process space
> > >>>>>> -        */
> > >>>>>> -       if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> > >>>>>> -               adev->nbio.funcs->remap_hdp_registers(adev);
> > >>>>>>
> > >>>>>>            /* enable the doorbell aperture */
> > >>>>>>            soc15_enable_doorbell_aperture(adev, true);
> > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
> > >>>>>> index 55284b24f113..16b447055102 100644
> > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> > >>>>>> @@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle)
> > >>>>>>            soc21_program_aspm(adev);
> > >>>>>>            /* setup nbio registers */
> > >>>>>>            adev->nbio.funcs->init_registers(adev);
> > >>>>>> -       /* remap HDP registers to a hole in mmio space,
> > >>>>>> -        * for the purpose of expose those registers
> > >>>>>> -        * to process space
> > >>>>>> -        */
> > >>>>>> -       if (adev->nbio.funcs->remap_hdp_registers)
> > >>>>>> -               adev->nbio.funcs->remap_hdp_registers(adev);
> > >>>>>>            /* enable the doorbell aperture */
> > >>>>>>            soc21_enable_doorbell_aperture(adev, true);
> > >>>>>>
> > >>>>>> --
> > >>>>>> 2.25.1
> > >>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amdgpu-init-HDP-remap-as-part-of-GMC-hw_init-for.patch
Type: text/x-patch
Size: 4079 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20220901/53752c55/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-drm-amdgpu-add-HDP-remap-functionality-to-nbio-7.7.patch
Type: text/x-patch
Size: 1606 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20220901/53752c55/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-drm-amdgpu-init-HDP-remap-as-part-of-GMC-hw_init-for.patch
Type: text/x-patch
Size: 2552 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20220901/53752c55/attachment-0005.bin>


More information about the amd-gfx mailing list