[PATCH v2 1/2] drm/amdgpu: Move HDP remapping earlier during init

Lazar, Lijo lijo.lazar at amd.com
Tue Aug 30 04:04:43 UTC 2022



On 8/29/2022 10:20 PM, Alex Deucher wrote:
> On Mon, Aug 29, 2022 at 4:18 AM Lijo Lazar <lijo.lazar at amd.com> wrote:
>>
>> HDP flush is used early in the init sequence as part of memory controller
>> block initialization. Hence remapping of HDP registers needed for flush
>> needs to happen earlier.
>>
>> This also fixes the Unsupported Request error reported through AER during
>> driver load. The error happens as a write happens to the remap offset
>> before real remapping is done.
>>
>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D216373&data=05%7C01%7Clijo.lazar%40amd.com%7C0882d00080124386814a08da89de9bcd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637973886457404198%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dC%2BCY22cfix1VCcQINrvNWI5XW%2BYV5lleJX3Ju9A6Iw%3D&reserved=0
>>
>> The error was unnoticed before and got visible because of the commit
>> referenced below. This doesn't fix anything in the commit below, rather
>> fixes the issue in amdgpu exposed by the commit. The reference is only
>> to associate this commit with below one so that both go together.
>>
>> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
>>
>> Reported-by: Tom Seewald <tseewald at gmail.com>
>> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
>> Cc: stable at vger.kernel.org
> 
> How about something like the attached patch rather than these two
> patches?  It's a bit bigger but seems cleaner and more defensive in my
> opinion.
> 

Whenever device goes to suspend/reset and then comes back, remap offset 
has to be set back to 0 to make sure it doesn't use the wrong offset 
when the register assumes default values again.

To avoid the if-check in hdp_flush (which is more frequent), another way 
is to initialize the remap offset to default offset during early init 
and hw fini/suspend sequences. It won't be obvious (even with this 
patch) as to when remap offset vs default offset is used though.

Thanks,
Lijo

> Alex
> 
>> ---
>> v2:
>>          Take care of IP resume cases (Alex Deucher)
>>          Add NULL check to nbio.funcs to cover older (GFXv8) ASICs (Felix Kuehling)
>>          Add more details in commit message and associate with AER patch (Bjorn
>> Helgaas)
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/nv.c            |  6 ------
>>   drivers/gpu/drm/amd/amdgpu/soc15.c         |  6 ------
>>   drivers/gpu/drm/amd/amdgpu/soc21.c         |  6 ------
>>   4 files changed, 24 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index ce7d117efdb5..e420118769a5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2334,6 +2334,26 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>>          return 0;
>>   }
>>
>> +/**
>> + * amdgpu_device_prepare_ip - prepare IPs for hardware initialization
>> + *
>> + * @adev: amdgpu_device pointer
>> + *
>> + * Any common hardware initialization sequence that needs to be done before
>> + * hw init of individual IPs is performed here. This is different from the
>> + * 'common block' which initializes a set of IPs.
>> + */
>> +static void amdgpu_device_prepare_ip(struct amdgpu_device *adev)
>> +{
>> +       /* Remap HDP registers to a hole in mmio space, for the purpose
>> +        * of exposing those registers to process space. This needs to be
>> +        * done before hw init of ip blocks to take care of HDP flush
>> +        * operations through registers during hw_init.
>> +        */
>> +       if (adev->nbio.funcs && adev->nbio.funcs->remap_hdp_registers &&
>> +           !amdgpu_sriov_vf(adev))
>> +               adev->nbio.funcs->remap_hdp_registers(adev);
>> +}
>>
>>   /**
>>    * amdgpu_device_ip_init - run init for hardware IPs
>> @@ -2376,6 +2396,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>>                                  DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r);
>>                                  goto init_failed;
>>                          }
>> +
>> +                       amdgpu_device_prepare_ip(adev);
>>                          r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
>>                          if (r) {
>>                                  DRM_ERROR("hw_init %d failed %d\n", i, r);
>> @@ -3058,6 +3080,7 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
>>                  AMD_IP_BLOCK_TYPE_IH,
>>          };
>>
>> +       amdgpu_device_prepare_ip(adev);
>>          for (i = 0; i < adev->num_ip_blocks; i++) {
>>                  int j;
>>                  struct amdgpu_ip_block *block;
>> @@ -3139,6 +3162,7 @@ static int amdgpu_device_ip_resume_phase1(struct amdgpu_device *adev)
>>   {
>>          int i, r;
>>
>> +       amdgpu_device_prepare_ip(adev);
>>          for (i = 0; i < adev->num_ip_blocks; i++) {
>>                  if (!adev->ip_blocks[i].status.valid || adev->ip_blocks[i].status.hw)
>>                          continue;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
>> index b3fba8dea63c..3ac7fef74277 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
>> @@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
>>          nv_program_aspm(adev);
>>          /* setup nbio registers */
>>          adev->nbio.funcs->init_registers(adev);
>> -       /* remap HDP registers to a hole in mmio space,
>> -        * for the purpose of expose those registers
>> -        * to process space
>> -        */
>> -       if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
>> -               adev->nbio.funcs->remap_hdp_registers(adev);
>>          /* enable the doorbell aperture */
>>          nv_enable_doorbell_aperture(adev, true);
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
>> index fde6154f2009..a0481e37d7cf 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
>> @@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
>>          soc15_program_aspm(adev);
>>          /* setup nbio registers */
>>          adev->nbio.funcs->init_registers(adev);
>> -       /* remap HDP registers to a hole in mmio space,
>> -        * for the purpose of expose those registers
>> -        * to process space
>> -        */
>> -       if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
>> -               adev->nbio.funcs->remap_hdp_registers(adev);
>>
>>          /* enable the doorbell aperture */
>>          soc15_enable_doorbell_aperture(adev, true);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
>> index 55284b24f113..16b447055102 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
>> @@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle)
>>          soc21_program_aspm(adev);
>>          /* setup nbio registers */
>>          adev->nbio.funcs->init_registers(adev);
>> -       /* remap HDP registers to a hole in mmio space,
>> -        * for the purpose of expose those registers
>> -        * to process space
>> -        */
>> -       if (adev->nbio.funcs->remap_hdp_registers)
>> -               adev->nbio.funcs->remap_hdp_registers(adev);
>>          /* enable the doorbell aperture */
>>          soc21_enable_doorbell_aperture(adev, true);
>>
>> --
>> 2.25.1
>>


More information about the amd-gfx mailing list