[PATCH 2/2] drm/amdgpu: enable only one compute queue for raven

Fri Oct 16 14:33:55 UTC 2020

On 10/16/20 3:56 PM, Alex Deucher wrote:
> On Wed, Oct 14, 2020 at 9:53 AM Nirmoy Das <nirmoy.das at amd.com> wrote:
>> Because of a firmware bug, Raven ASICs can't handle jobs
>> scheduled to multiple compute queues. So enable only one
>> compute queue until we have a firmware fix.
>>
>> Signed-off-by: Nirmoy Das <nirmoy.das at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |  4 ++++
>>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 11 ++++++++++-
>>   2 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index 8c9bacfdbc30..ca2ac985b300 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -195,6 +195,10 @@ static bool amdgpu_gfx_is_multipipe_capable(struct amdgpu_device *adev)
>>   bool amdgpu_gfx_is_high_priority_compute_queue(struct amdgpu_device *adev,
>>                                                 int queue)
>>   {
>> +       /* We only enable one compute queue for Raven */
>> +       if (adev->asic_type == CHIP_RAVEN)
>> +               return false;
>> +
>>          /* Policy: make queue 0 of each pipe as high priority compute queue */
>>          return (queue == 0);
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> index 0d8e203b10ef..f3fc9ad8bc20 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> @@ -4633,7 +4633,16 @@ static int gfx_v9_0_early_init(void *handle)
>>                  adev->gfx.num_gfx_rings = 0;
>>          else
>>                  adev->gfx.num_gfx_rings = GFX9_NUM_GFX_RINGS;
>> -       adev->gfx.num_compute_rings = amdgpu_num_kcq;
>> +
>> +       /* raven firmware currently can not load balance jobs
>> +        * among multiple compute queues. Enable only one
>> +        * compute queue till we have a firmware fix.
>> +        */
>> +       if (adev->asic_type == CHIP_RAVEN)
>> +               adev->gfx.num_compute_rings = 1;
>> +       else
>> +               adev->gfx.num_compute_rings = amdgpu_num_kcq;
>> +
> I would suggest something like this instead so we can override easily
> for testing:
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index abddcd9dab3d..a2954b41e59d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1376,6 +1376,12 @@ static int amdgpu_device_check_arguments(struct amdgpu_device *adev)
>
>          if (amdgpu_num_kcq == -1) {
>                  amdgpu_num_kcq = 8;
> +               /* raven firmware currently can not load balance jobs
> +                * among multiple compute queues. Enable only one
> +                * compute queue till we have a firmware fix.
> +                */
> +               if (adev->asic_type == CHIP_RAVEN)
> +                       amdgpu_num_kcq = 1;
>          } else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
>                  amdgpu_num_kcq = 8;
>                  dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");
>

Thanks, this looks much better.

I will update.

Nirmoy

> Alex
>
>>          gfx_v9_0_set_kiq_pm4_funcs(adev);
>>          gfx_v9_0_set_ring_funcs(adev);
>>          gfx_v9_0_set_irq_funcs(adev);
>> --
>> 2.28.0
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
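
For readers following the thread, the override behaviour in the suggested
amdgpu_device_check_arguments() change can be sketched as a standalone
function. This is only an illustration under stated assumptions: the enum,
MAX_KCQ, and resolve_num_kcq() are hypothetical stand-ins, not the real
amdgpu definitions, and the kernel code modifies the amdgpu_num_kcq module
parameter in place rather than returning a value.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's asic_type enum (assumption). */
enum asic_type { CHIP_VEGA10, CHIP_RAVEN };

/* Upper bound used by the check in the thread (8 queues). */
#define MAX_KCQ 8

/*
 * Hypothetical helper mirroring the suggested logic:
 *  - requested == -1 means the user did not set the parameter, so a
 *    default is chosen: 1 on Raven (firmware workaround), 8 otherwise.
 *  - an out-of-range value falls back to 8 with a warning, on every
 *    ASIC, exactly as in the quoted diff.
 *  - any valid explicit value is honoured, which is what allows the
 *    Raven default to be overridden for testing.
 */
static int resolve_num_kcq(enum asic_type asic, int requested)
{
    if (requested == -1) {
        if (asic == CHIP_RAVEN)
            return 1;
        return MAX_KCQ;
    }

    if (requested > MAX_KCQ || requested < 0) {
        fprintf(stderr,
                "set kernel compute queue number to %d due to invalid parameter provided by user\n",
                MAX_KCQ);
        return MAX_KCQ;
    }

    return requested;
}
```

With this shape, resolve_num_kcq(CHIP_RAVEN, -1) yields 1, while an explicit
request such as resolve_num_kcq(CHIP_RAVEN, 4) yields 4. The original patch
could not do this, because it hard-coded num_compute_rings = 1 for Raven in
gfx_v9_0_early_init() regardless of the parameter.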