[PATCH] drm/amdgpu: Enable SA software trap.

Felix Kuehling felix.kuehling at amd.com
Thu Sep 22 17:14:13 UTC 2022


On 2022-09-22 at 12:17, David Belanger wrote:
> Enables support for software trap for MES >= 4.
> Adapted from implementation from Jay Cornwall.
>
> v2: Add IP version check in conditions.
>
> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>
> Signed-off-by: David Belanger <david.belanger at amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c        |   6 +-
>   .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h    | 771 +++++++++---------
>   .../amd/amdkfd/cwsr_trap_handler_gfx10.asm    |  21 +
>   .../gpu/drm/amd/amdkfd/kfd_int_process_v11.c  |  26 +-
>   4 files changed, 437 insertions(+), 387 deletions(-)
[snip]
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
> index a6fcbeeb7428..4e03d19e9333 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
> @@ -358,13 +358,35 @@ static void event_interrupt_wq_v11(struct kfd_dev *dev,
>   				break;
>   			case SQ_INTERRUPT_WORD_ENCODING_ERROR:
>   				print_sq_intr_info_error(context_id0, context_id1);
> +				sq_int_priv = REG_GET_FIELD(context_id0,
> +						SQ_INTERRUPT_WORD_WAVE_CTXID0, PRIV);
>   				sq_int_errtype = REG_GET_FIELD(context_id0,
>   						SQ_INTERRUPT_WORD_ERROR_CTXID0, TYPE);
> -				if (sq_int_errtype != SQ_INTERRUPT_ERROR_TYPE_ILLEGAL_INST &&
> -				    sq_int_errtype != SQ_INTERRUPT_ERROR_TYPE_MEMVIOL) {
> +
> +				switch (sq_int_errtype) {
> +				case SQ_INTERRUPT_ERROR_TYPE_EDC_FUE:
> +				case SQ_INTERRUPT_ERROR_TYPE_EDC_FED:
>   					event_interrupt_poison_consumption_v11(
>   							dev, pasid, source_id);
>   					return;
> +				case SQ_INTERRUPT_ERROR_TYPE_ILLEGAL_INST:
> +					/*if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 4) &&
> +						  (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0)) &&
> +						  (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3)))
> +						&& sq_int_priv)
> +						kfd_set_dbg_ev_from_interrupt(dev, pasid, -1,
> +							KFD_EC_MASK(EC_QUEUE_WAVE_ILLEGAL_INSTRUCTION),
> +							NULL, 0);*/
> +					return;
> +				case SQ_INTERRUPT_ERROR_TYPE_MEMVIOL:
> +					/*if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 4) &&
> +						  (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0)) &&
> +						  (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3)))
> +						&& sq_int_priv)
> +						kfd_set_dbg_ev_from_interrupt(dev, pasid, -1,
> +							KFD_EC_MASK(EC_QUEUE_WAVE_MEMORY_VIOLATION),
> +							NULL, 0);*/

Which branch is this for? kfd_set_dbg_ev_from_interrupt shouldn't exist 
on the upstream branch yet. That code is still under review for upstream.
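
For readers following the quoted hunk, the gating condition in the two commented-out blocks can be read as a standalone predicate. The following is only a minimal sketch, not code from the patch: the helper names sw_trap_supported() and should_raise_dbg_event() are hypothetical, and the IP_VERSION() packing and the 0xfff version mask are assumptions about how the driver defines them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing of major/minor/revision into one integer. */
#define IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/* Assumed mask selecting the MES scheduler version bits. */
#define MES_VERSION_MASK 0x00000fffu

/*
 * Hypothetical helper: true when the quoted condition's inner test holds,
 * i.e. MES scheduler version >= 4 and GC IP version in [11.0.0, 11.0.3].
 */
static bool sw_trap_supported(uint32_t sched_version, uint32_t gc_ip_version)
{
	return (sched_version & MES_VERSION_MASK) >= 4 &&
	       gc_ip_version >= IP_VERSION(11, 0, 0) &&
	       gc_ip_version <= IP_VERSION(11, 0, 3);
}

/*
 * Hypothetical helper mirroring the commented-out "if": the debug event
 * would be raised only when the inner test is false and the PRIV bit
 * read from the interrupt word is set.
 */
static bool should_raise_dbg_event(uint32_t sched_version,
				   uint32_t gc_ip_version, bool sq_int_priv)
{
	return !sw_trap_supported(sched_version, gc_ip_version) && sq_int_priv;
}
```

Factoring the test out this way would also avoid repeating the same three-line condition in both the ILLEGAL_INST and MEMVIOL cases.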

Regards,
   Felix


> +					return;
>   				}
>   				break;
>   			default:

