[PATCH v5 01/10] drm/v3d: Fix the MMU flush order
Maíra Canal
mcanal at igalia.com
Wed Sep 18 11:54:40 UTC 2024
Hi Iago,
On 9/4/24 05:24, Iago Toral wrote:
> Hi Maira,
>
> El jue, 29-08-2024 a las 10:05 -0300, Maíra Canal escribió:
>> We must first flush the MMU cache and then, flush the TLB, not the
>> other
>> way around. Currently, we can see a race condition between the MMU
>> cache
>> and the TLB when running multiple rendering processes at the same
>> time.
>> This is evidenced by MMU errors triggered by the IRQ.
>>
>> Fix the MMU flush order by flushing the MMU cache and then the TLB.
>
> the patch is making 2 changes, it is changing the ordering of the
> flushes but also the fact that now we wait for the first flush to
> commplete before starting the second while the previous version would
> start both flushes and then wait for both to complete. The commit log
> seems to suggest that the first change is the one that fixes the issue
> but I wonder if that is really what is happening.
>
> Also, have you tested keeping the original order of operations but with
> interleaved waits like we do here? Either way, I think we probably
I just tested keeping the order of the operations but using interleaved
waits and I end up having MMU errors that crash the GPU. Changing the
order of the operations was a deliberated choice due the GPU design and
and waiting for each of those to finish is needed to avoid race
conditions.
> should emphasized more in the committ log the fact that we are now
> waiting for each flush to complete before starting the other flush.
>
But you are absolutely right. I need to mention it on the commit log.
I'll add this information in the next version.
>
>>
>> Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for
>> Broadcom V3D V3.x+")
>> Signed-off-by: Maíra Canal <mcanal at igalia.com>
>> ---
>> drivers/gpu/drm/v3d/v3d_mmu.c | 29 ++++++++++-------------------
>> 1 file changed, 10 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/v3d/v3d_mmu.c
>> b/drivers/gpu/drm/v3d/v3d_mmu.c
>> index 14f3af40d6f6..06bb44c7f35d 100644
>> --- a/drivers/gpu/drm/v3d/v3d_mmu.c
>> +++ b/drivers/gpu/drm/v3d/v3d_mmu.c
>> @@ -32,32 +32,23 @@ static int v3d_mmu_flush_all(struct v3d_dev *v3d)
>> {
>> int ret;
>>
>> - /* Make sure that another flush isn't already running when
>> we
>> - * start this one.
>> - */
>> - ret = wait_for(!(V3D_READ(V3D_MMU_CTL) &
>> - V3D_MMU_CTL_TLB_CLEARING), 100);
>> - if (ret)
>> - dev_err(v3d->drm.dev, "TLB clear wait idle pre-wait
>> failed\n");
>> -
>
> are we certain we can't have a flush in flux when a new one comes in?
We can, but it doesn't really matter, as by the end of both requests,
the TLB will be flushed.
Best Regards,
- Maíra
>
>> - V3D_WRITE(V3D_MMU_CTL, V3D_READ(V3D_MMU_CTL) |
>> - V3D_MMU_CTL_TLB_CLEAR);
>> -
>> - V3D_WRITE(V3D_MMUC_CONTROL,
>> - V3D_MMUC_CONTROL_FLUSH |
>> + V3D_WRITE(V3D_MMUC_CONTROL, V3D_MMUC_CONTROL_FLUSH |
>> V3D_MMUC_CONTROL_ENABLE);
>>
>> - ret = wait_for(!(V3D_READ(V3D_MMU_CTL) &
>> - V3D_MMU_CTL_TLB_CLEARING), 100);
>> + ret = wait_for(!(V3D_READ(V3D_MMUC_CONTROL) &
>> + V3D_MMUC_CONTROL_FLUSHING), 100);
>> if (ret) {
>> - dev_err(v3d->drm.dev, "TLB clear wait idle
>> failed\n");
>> + dev_err(v3d->drm.dev, "MMUC flush wait idle
>> failed\n");
>> return ret;
>> }
>>
>> - ret = wait_for(!(V3D_READ(V3D_MMUC_CONTROL) &
>> - V3D_MMUC_CONTROL_FLUSHING), 100);
>> + V3D_WRITE(V3D_MMU_CTL, V3D_READ(V3D_MMU_CTL) |
>> + V3D_MMU_CTL_TLB_CLEAR);
>> +
>> + ret = wait_for(!(V3D_READ(V3D_MMU_CTL) &
>> + V3D_MMU_CTL_TLB_CLEARING), 100);
>> if (ret)
>> - dev_err(v3d->drm.dev, "MMUC flush wait idle
>> failed\n");
>> + dev_err(v3d->drm.dev, "TLB clear wait idle
>> failed\n");
>
> I'd maybe use "MMU TLB clear wait idle failed", so we can more easily
> identify this message comes from the MMU implementation.
>
> Iago
>
>>
>> return ret;
>> }
>
More information about the dri-devel
mailing list