[PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings

Matthew Auld matthew.auld at intel.com
Thu Dec 7 13:06:57 UTC 2023


On 07/12/2023 11:26, Matthew Auld wrote:
> On 06/12/2023 13:34, Badal Nilawar wrote:
>> Block rpm for discrete cards when mmap mappings are active.
>> Ideally the rpm wake ref would be taken in the vm_open call and dropped
>> in the vm_close call, but it is seen that vm_open doesn't get called for
>> xe_gem_vm_ops. Therefore the rpm wake ref is taken in xe_drm_gem_ttm_mmap
>> and dropped in vm_close.
>>
>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> Cc: Anshuman Gupta <anshuman.gupta at intel.com>
>> Signed-off-by: Badal Nilawar <badal.nilawar at intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++--
>>   1 file changed, 33 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>> index 72dc4a4eed4e..5741948a2a51 100644
>> --- a/drivers/gpu/drm/xe/xe_bo.c
>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>> @@ -15,6 +15,7 @@
>>   #include <drm/ttm/ttm_tt.h>
>>   #include <drm/xe_drm.h>
>> +#include "i915_drv.h"
> 
> Do we need this?
> 
>>   #include "xe_device.h"
>>   #include "xe_dma_buf.h"
>>   #include "xe_drm_client.h"
>> @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
>>       return ret;
>>   }
>> +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma)
>> +{
>> +    struct ttm_buffer_object *tbo = vma->vm_private_data;
>> +    struct drm_device *ddev = tbo->base.dev;
>> +    struct xe_device *xe = to_xe_device(ddev);
>> +
>> +    ttm_bo_vm_close(vma);
>> +
>> +    if (tbo->resource->bus.is_iomem)
>> +        xe_device_mem_access_put(xe);

Are you sure this works as expected? Say the user partially unmaps 
the mapping:

map = mmap(obj, size);
munmap(map, size/2);
munmap(map, size);

The partial unmap splits the VMA, so that is one mmap but multiple 
vm_close calls, leading to an imbalance in the RPM ref. I think we need 
the access_get in the vm_open also?
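
Something like the below is roughly what I have in mind. Just a sketch: 
xe_ttm_bo_vm_open is hypothetical, and the initial wake ref would still 
need to come from the mmap hook, since vm_open is not called at mmap 
time (this ignores the is_iomem question, which is a separate issue 
below):

static void xe_ttm_bo_vm_open(struct vm_area_struct *vma)
{
	struct ttm_buffer_object *tbo = vma->vm_private_data;
	struct xe_device *xe = to_xe_device(tbo->base.dev);

	ttm_bo_vm_open(vma);
	/*
	 * vm_open is called for every new VMA produced by a split
	 * (partial munmap) or by fork, so pairing the get here with
	 * the put in vm_close keeps the RPM ref balanced however
	 * many VMAs the original mapping ends up as.
	 */
	xe_device_mem_access_get(xe);
}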

>> +}
>> +
>>   static const struct vm_operations_struct xe_gem_vm_ops = {
>>       .fault = xe_gem_fault,
>>       .open = ttm_bo_vm_open,
>> -    .close = ttm_bo_vm_close,
>> +    .close = xe_ttm_bo_vm_close,
>>       .access = ttm_bo_vm_access
>>   };
>> +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem,
>> +            struct vm_area_struct *vma)
>> +{
>> +    struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem);
>> +    struct drm_device *ddev = tbo->base.dev;
>> +    struct xe_device *xe = to_xe_device(ddev);
>> +    int ret;
>> +
>> +    ret = drm_gem_ttm_mmap(gem, vma);
>> +    if (ret < 0)
>> +        return ret;
>> +
>> +    if (tbo->resource->bus.is_iomem)
>> +        xe_device_mem_access_get(xe);
> 
> Checking is_iomem outside of the usual locking is racy. One issue here 
> is that is_iomem can freely change at any point (like at fault time), 
> so when vm_close is called you can easily get an unbalanced RPM ref 
> count. For example, is_iomem is false here but later becomes true by 
> the time bo_vm_close runs, and then we call mem_access_put even though 
> we never called mem_access_get.
> 
> Maybe check the possible placements of the object instead, since those 
> are immutable?
> 
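
For example, something along these lines. Sketch only: the helper name 
and XE_BO_CREATE_VRAM_MASK are assumptions here, standing in for 
whichever immutable creation flags mark a possible VRAM placement:

static bool xe_bo_mmap_needs_rpm(struct xe_bo *bo)
{
	/*
	 * bo->flags is fixed at object creation, unlike
	 * tbo->resource->bus.is_iomem, which can change under us at
	 * fault time. XE_BO_CREATE_VRAM_MASK is an assumed stand-in
	 * for the VRAM placement bits.
	 */
	return bo->flags & XE_BO_CREATE_VRAM_MASK;
}

The same check would then be reused in the mmap hook and in 
vm_open/vm_close, so the get/put decisions always agree.
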
>> +
>> +    return 0;
>> +}
>> +
>>   static const struct drm_gem_object_funcs xe_gem_object_funcs = {
>>       .free = xe_gem_object_free,
>>       .close = xe_gem_object_close,
>> -    .mmap = drm_gem_ttm_mmap,
>> +    .mmap = xe_drm_gem_ttm_mmap,
>>       .export = xe_gem_prime_export,
>>       .vm_ops = &xe_gem_vm_ops,
>>   };

