[PATCH v2 3/4] drm/vc4: Remove BOs seqnos

Mon Dec 16 23:42:44 UTC 2024

Hi Melissa,

Thanks for your review!

On 16/12/24 17:25, Melissa Wen wrote:
> On 12/12, Maíra Canal wrote:
>> `bo->seqno`, `bo->write_seqno`, and `exec->bin_dep_seqno` are leftovers
>> from a time when VC4 didn't support DMA Reservation Objects. When they
>> were introduced, it made sense to think about tracking the correspondence
>> between the BOs and the jobs through the job's seqno.
>>
>> This is no longer needed, as VC4 already has support for DMA Reservation
>> Objects and attaches the "job done" fence to the BOs.
>>
>> Signed-off-by: Maíra Canal <mcanal at igalia.com>
>> ---
>>   drivers/gpu/drm/vc4/vc4_crtc.c     | 22 +++++++++++-----------
>>   drivers/gpu/drm/vc4/vc4_drv.h      | 13 -------------
>>   drivers/gpu/drm/vc4/vc4_gem.c      | 18 ++----------------
>>   drivers/gpu/drm/vc4/vc4_validate.c | 11 -----------
>>   4 files changed, 13 insertions(+), 51 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
>> index cf40a53ad42e..3a5b5743cb2f 100644
>> --- a/drivers/gpu/drm/vc4/vc4_crtc.c
>> +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
>> @@ -919,10 +919,11 @@ vc4_async_page_flip_complete(struct vc4_async_flip_state *flip_state)
>>   	kfree(flip_state);
>>   }
>>   
>> -static void vc4_async_page_flip_seqno_complete(struct vc4_seqno_cb *cb)
>> +static void vc4_async_page_flip_seqno_complete(struct dma_fence *fence,
>> +					       struct dma_fence_cb *cb)
>>   {
>>   	struct vc4_async_flip_state *flip_state =
>> -		container_of(cb, struct vc4_async_flip_state, cb.seqno);
>> +		container_of(cb, struct vc4_async_flip_state, cb.fence);
> 
> hmm... I didn't get why you still call this function
> vc4_async_page_flip_seqno_complete if you are not using seqno anymore.

I'll rename it. Thanks!

> 
> On top of that, can we just use vc4_async_page_flip_fence_complete
> instead? I mean, looks like we don't need two different async page flip
> paths according to the hardware anymore.

`vc4_async_page_flip_seqno_complete()` updates the BO usecnt, which is
needed for VC4 (GEN == 4). Maxime, Dave, could you confirm that this
usecnt logic is still needed?

> 
>>   	struct vc4_bo *bo = NULL;
>>   
>>   	if (flip_state->old_fb) {
>> @@ -961,16 +962,15 @@ static int vc4_async_set_fence_cb(struct drm_device *dev,
>>   {
>>   	struct drm_framebuffer *fb = flip_state->fb;
>>   	struct drm_gem_dma_object *dma_bo = drm_fb_dma_get_gem_obj(fb, 0);
>> +	dma_fence_func_t async_page_flip_complete_function;
>>   	struct vc4_dev *vc4 = to_vc4_dev(dev);
>>   	struct dma_fence *fence;
>>   	int ret;
>>   
>> -	if (vc4->gen == VC4_GEN_4) {
>> -		struct vc4_bo *bo = to_vc4_bo(&dma_bo->base);
>> -
>> -		return vc4_queue_seqno_cb(dev, &flip_state->cb.seqno, bo->seqno,
>> -					  vc4_async_page_flip_seqno_complete);
>> -	}
>> +	if (vc4->gen == VC4_GEN_4)
>> +		async_page_flip_complete_function = vc4_async_page_flip_seqno_complete;
>> +	else
>> +		async_page_flip_complete_function = vc4_async_page_flip_fence_complete;
>>   
>>   	ret = dma_resv_get_singleton(dma_bo->base.resv, DMA_RESV_USAGE_READ, &fence);
>>   	if (ret)
>> @@ -978,14 +978,14 @@ static int vc4_async_set_fence_cb(struct drm_device *dev,
>>   
>>   	/* If there's no fence, complete the page flip immediately */
>>   	if (!fence) {
>> -		vc4_async_page_flip_fence_complete(fence, &flip_state->cb.fence);
>> +		async_page_flip_complete_function(fence, &flip_state->cb.fence);
>>   		return 0;
>>   	}
>>   
>>   	/* If the fence has already been completed, complete the page flip */
>>   	if (dma_fence_add_callback(fence, &flip_state->cb.fence,
>> -				   vc4_async_page_flip_fence_complete))
>> -		vc4_async_page_flip_fence_complete(fence, &flip_state->cb.fence);
>> +				   async_page_flip_complete_function))
>> +		async_page_flip_complete_function(fence, &flip_state->cb.fence);
>>   
>>   	return 0;
>>   }

[...]

>> diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
>> index 4037c65eb269..1cbd95c4f9ef 100644
>> --- a/drivers/gpu/drm/vc4/vc4_gem.c
>> +++ b/drivers/gpu/drm/vc4/vc4_gem.c

[...]

>> @@ -845,12 +837,6 @@ vc4_get_bcl(struct drm_device *dev, struct vc4_exec_info *exec)
>>   			goto fail;
>>   	}
>>   
>> -	/* Block waiting on any previous rendering into the CS's VBO,
>> -	 * IB, or textures, so that pixels are actually written by the
>> -	 * time we try to read them.
>> -	 */
>> -	ret = vc4_wait_for_seqno(dev, exec->bin_dep_seqno, ~0ull, true);
> 
> Don't we still need waiting for fences here?
> 

As we indicate the implicit DMA reservation usage through the `enum
dma_resv_usage` flag, I believe we can be safe to assume that the BOs
won't be read before we finish writing.

Best Regards,
- Maíra

>> -- 
>> 2.47.1
>>