[PATCH 1/2] drm/amdgpu: Implement instance ID remapping for harvested SDMA engines

Lazar, Lijo lijo.lazar at amd.com
Wed Jun 11 06:14:53 UTC 2025



On 6/11/2025 11:26 AM, Jesse Zhang wrote:
> Adds logic to handle instance ID conversion during SDMA engine reset
> when harvest_config is active. This ensures correct physical engine
> addressing when some SDMA instances are harvested.
> 
> Changes include:
> 1. Added instance ID remapping using GET_INST macro when harvest_config
>    is non-zero
> 2. Conversion happens before engine reset procedure begins
> 3. Maintains existing reset flow for non-harvested configurations
> 
> This fixes hardware initialization issues on devices with harvested
> SDMA instances where the logical instance IDs don't match physical
> hardware mapping.
> 

This shouldn't be required. The driver must already be harvest-aware;
without that, it would not load properly on MI308.
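
For illustration only, here is a minimal standalone sketch of what
logical-to-physical instance remapping looks like when some instances
are harvested. It is not the amdgpu implementation: the function name
and the idea of passing an "active" bitmask directly are assumptions
made for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the driver.
 *
 * active_mask has one bit set per usable (non-harvested) physical
 * instance. Logical instance N is taken to be the N-th set bit,
 * counting from bit 0. Returns the physical index, or -1 if there
 * are fewer than logical_id + 1 usable instances.
 */
int logical_to_physical_inst(uint32_t active_mask, unsigned int logical_id)
{
	for (int phys = 0; phys < 32; phys++) {
		/* Skip harvested (cleared) positions. */
		if (!(active_mask & (1U << phys)))
			continue;
		if (logical_id == 0)
			return phys;
		logical_id--;
	}
	return -1;
}
```

With four physical instances and instance 1 harvested, the active mask
is 0xD, so logical IDs 0, 1 and 2 map to physical 0, 2 and 3. If a
mapping like this is already applied where registers are accessed,
applying it a second time in the reset path would select the wrong
engine.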

Thanks,
Lijo

> Suggested-by: Jonathan Kim <jonathan.kim at amd.com>
> Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c      | 3 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h      | 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index a0e9bf9b2710..4282f60a0cef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -759,6 +759,7 @@ static void amdgpu_discovery_read_from_harvest_table(struct amdgpu_device *adev,
>  				~(1U << harvest_info->list[i].number_instance);
>  			break;
>  		case SDMA0_HWID:
> +			adev->sdma.harvest_config |= (1U << harvest_info->list[i].number_instance);
>  			adev->sdma.sdma_mask &=
>  				~(1U << harvest_info->list[i].number_instance);
>  			break;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> index 6716ac281c49..0bfd2c138d24 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> @@ -581,6 +581,9 @@ int amdgpu_sdma_reset_engine(struct amdgpu_device *adev, uint32_t instance_id)
>  	bool gfx_sched_stopped = false, page_sched_stopped = false;
>  
>  	mutex_lock(&sdma_instance->engine_reset_mutex);
> +
> +	if (adev->sdma.harvest_config)
> +		instance_id = GET_INST(SDMA0, instance_id);
>  	/* Stop the scheduler's work queue for the GFX and page rings if they are running.
>  	* This ensures that no new tasks are submitted to the queues while
>  	* the reset is in progress.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> index e5f8951bbb6f..fed00854a1a2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> @@ -123,6 +123,7 @@ struct amdgpu_sdma {
>  
>  	int			num_instances;
>  	uint32_t 		sdma_mask;
> +	uint32_t		harvest_config;
>  	int			num_inst_per_aid;
>  	uint32_t                    srbm_soft_reset;
>  	bool			has_page_queue;


