[PATCH] drm/amdgpu: Add preferred_domain check when determine XGMI state

Liu, Shaoyun Shaoyun.Liu at amd.com
Wed Mar 27 16:24:21 UTC 2019


Even the preferred_domains been changed for bo that original is none xgmi here , it doesn't means driver must change the XGMI status for the bo_va. From current design, only the bo later been added to the  compute VM (through bo_add) will be enable the XGMI mapping. Without that call , the XGMI mapping will be disabled.
Remove the XGMI checking may affect other sharing mechanism.

Shaoyun.liu
  

On 2019-03-27 11:52 a.m., Kuehling, Felix wrote:
> On 2019-03-26 4:35 p.m., Liu, Shaoyun wrote:
>> Avoid unnecessary XGMI hight pstate trigger when mapping none-vram memory for peer device
>>
>> Change-Id: I1881deff3da19f1f4b58d5765db03a590092a5b2
>> Signed-off-by: shaoyunl <shaoyun.liu at amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 +++++++++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  3 ++-
>>    2 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> index a82c3b1..a0f56e4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> @@ -666,6 +666,8 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data,
>>    	struct amdgpu_device *adev = dev->dev_private;
>>    	struct drm_amdgpu_gem_op *args = data;
>>    	struct drm_gem_object *gobj;
>> +	struct amdgpu_vm_bo_base *base;
>> +	struct amdgpu_bo_va *bo_va;
>>    	struct amdgpu_bo *robj;
>>    	int r;
>>    
>> @@ -704,6 +706,15 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data,
>>    			amdgpu_bo_unreserve(robj);
>>    			break;
>>    		}
>> +		for (base = robj->vm_bo; base; base = base->next) {
>> +			bo_va = container_of(base, struct amdgpu_bo_va, base);
>> +			if (bo_va && bo_va->is_xgmi) {
> The point here is to prevent transitions where a mapping moves from
> is_xgmi => !is_xgmi or from !is_xgmi => is_xgmi, because that would
> potentially require a XGMI pstate change.
>
> This check catches the case of is_xgmi => !is_xgmi, but not the other
> way around. I think you should drop the is_xgmi condition here. Any BO
> that's shared between multiple GPUs should return -EINVAL here because
> moving it could result in an XGMI pstate change. It doesn't matter
> whether it's currently using XGMI or not.
>
> Regards,
>     Felix
>
>> +				r = -EINVAL;
>> +				amdgpu_bo_unreserve(robj);
>> +				goto out;
>> +			}
>> +		}
>> +
>>    		robj->preferred_domains = args->value & (AMDGPU_GEM_DOMAIN_VRAM |
>>    							AMDGPU_GEM_DOMAIN_GTT |
>>    							AMDGPU_GEM_DOMAIN_CPU);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 76eee7e..8ed23d2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -2048,7 +2048,8 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct amdgpu_device *adev,
>>    	INIT_LIST_HEAD(&bo_va->valids);
>>    	INIT_LIST_HEAD(&bo_va->invalids);
>>    
>> -	if (bo && amdgpu_xgmi_same_hive(adev, amdgpu_ttm_adev(bo->tbo.bdev))) {
>> +	if (bo && amdgpu_xgmi_same_hive(adev, amdgpu_ttm_adev(bo->tbo.bdev)) &&
>> +	    (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM)) {
>>    		bo_va->is_xgmi = true;
>>    		mutex_lock(&adev->vm_manager.lock_pstate);
>>    		/* Power up XGMI if it can be potentially used */


More information about the amd-gfx mailing list