[PATCH 1/5] drm/ttm: add a weak BO reference to the resource v2

Christian König ckoenig.leichtzumerken at gmail.com
Mon Aug 23 10:02:44 UTC 2021


Hi Thomas,

Am 23.08.21 um 09:56 schrieb Thomas Hellström:
> Hi, Christian,
>
> Still working through some backlog and this series appears to have
> slipped under the radar.
>
> Still I think a cover letter would benefit reviewers...
>
> On Mon, 2021-07-19 at 13:51 +0200, Christian König wrote:
>> Keep track for which BO a resource was allocated.
>> This is necessary to move the LRU handling into the resources.
> So is this needed, when looking up a resource from the LRU, to find the
> bo the resource is currently attached to?

yes, correct. The main motivation is to fix resource handling during 
allocations and moves.

Essentially we currently have the following steps during resource 
allocation:
1. Locking the BO.
2. Figuring out we need resources.
2. Allocating the resource.
3. Moving the BO to the correct LRU.
4. Unlocking the BO.

The problem is now that we have a short time where we allocated the 
resource, but can't evict it again. In other words we don't know to 
which object a resource belongs.

Buffer moves are even worse since we have an old resource as well as a 
new one at the same time and so potentially would need the BO on two 
LRUs at the same time.

This has caused a whole bunch of problems in the past because we ran 
into out of resource situations and implemented quite a number of 
workarounds for this.

>> A bit problematic is i915 since it tries to use the resource
>> interface without a BO which is illegal from the conceptional
> s/conceptional/conceptual/ ?
>> point of view.
>>
>> v2: Document that this is a weak reference and add a workaround for
>> i915
> Hmm. As pointed out in my previous review the weak pointer should
> really be cleared under a lookup lock to avoid use-after-free bugs.
> I don't see that happening here. Doesn't this series aim for a future
> direction of early destruction of bos and ditching the ghost objects?
> In that case IMHO you need to allow for a NULL bo pointer and any bo
> information needed at resource destruction needs to be copied on the
> resource... (That would also ofc help with the i915 problem).

Yes, I'm going back and force how to do this cleanly as well.

Ok going to give the NULL pointer a try.

>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>>   drivers/gpu/drm/i915/intel_region_ttm.c | 5 +++++
>>   drivers/gpu/drm/ttm/ttm_bo_util.c       | 7 +++++--
>>   drivers/gpu/drm/ttm/ttm_resource.c      | 1 +
>>   include/drm/ttm/ttm_resource.h          | 2 ++
>>   4 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c
>> b/drivers/gpu/drm/i915/intel_region_ttm.c
>> index 27fe0668d094..980b10a7debf 100644
>> --- a/drivers/gpu/drm/i915/intel_region_ttm.c
>> +++ b/drivers/gpu/drm/i915/intel_region_ttm.c
>> @@ -88,6 +88,7 @@ intel_region_ttm_node_reserve(struct
>> intel_memory_region *mem,
>>          place.fpfn = offset >> PAGE_SHIFT;
>>          place.lpfn = place.fpfn + (size >> PAGE_SHIFT);
>>          mock_bo.base.size = size;
>> +       mock_bo.bdev = &mem->i915->bdev;
>>          ret = man->func->alloc(man, &mock_bo, &place, &res);
>>          if (ret == -ENOSPC)
>>                  ret = -ENXIO;
>> @@ -104,7 +105,11 @@ void intel_region_ttm_node_free(struct
>> intel_memory_region *mem,
>>                                  struct ttm_resource *res)
>>   {
>>          struct ttm_resource_manager *man = mem->region_private;
>> +       struct ttm_buffer_object mock_bo = {};
>>   
>> +       mock_bo.base.size = res->num_pages * PAGE_SIZE;
>> +       mock_bo.bdev = &mem->i915->bdev;
>> +       res->bo = &mock_bo;
>>          man->func->free(man, res);
>>   }
>>   
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index 2f57f824e6db..a1570aa8ff56 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -239,6 +239,11 @@ static int ttm_buffer_object_transfer(struct
>> ttm_buffer_object *bo,
>>          if (bo->type != ttm_bo_type_sg)
>>                  fbo->base.base.resv = &fbo->base.base._resv;
>>   
>> +       if (fbo->base.resource) {
>> +               fbo->base.resource->bo = &fbo->base;
> This is scary: What if someone else (LRU lookup) just dereferenced the
> previous resource->bo pointer? I think this needs proper weak reference
> locking and lifetime handling, as mentioned above.

Ah, yes good point. Missed this occasion.

Thanks,
Christian.

>
>
>> +               bo->resource = NULL;
>> +       }
>> +
> /Thomas
>
>



More information about the dri-devel mailing list