[v2] drm/etnaviv: Clear the __GFP_HIGHMEM bit in GFP_HIGHUSER with 32 address

Fri Aug 30 21:03:01 UTC 2024

Hi, Xiaolei

Thanks for your nice catch! I have more to say.

On 2024/8/16 09:55, Wang, Xiaolei wrote:
> Ping ...

In the title: "32 address" -> "32-bit address".

Perhaps we could also make the commit title a little more accurate,
say:

drm/etnaviv: Properly request pages from DMA32 zone when needed

or

drm/etnaviv: Request pages from DMA32 zone on addressing_limited

> thanks
> xiaolei

The Vivante GPU is a 32-bit GPU, but it can access 40-bit physical addresses via its
MMU (IOMMU). This is only possible *after* the MMU has been set up (initialized), though.
Before the GPU page tables are set up (and flushed into the GPU's TLB), the device can only
access 32-bit physical addresses, and those addresses have to be physically contiguous.

The GPU page tables (GART) and the command buffer have to reside in the low 4GB of the
address space.
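
To make the low-4GB constraint concrete, here is a minimal userspace sketch of
the check it implies for a physically contiguous buffer. The names
`sk_fits_low_4g` and `SK_LOW_4G_LIMIT` are hypothetical and exist only for this
illustration; they are not part of the etnaviv driver or the kernel DMA API.

```c
#include <stdbool.h>
#include <stdint.h>

/* First address that is no longer reachable with a 32-bit physical address. */
#define SK_LOW_4G_LIMIT (UINT64_C(1) << 32)

/*
 * Hypothetical helper: returns true if the physically contiguous range
 * [addr, addr + size) lies entirely below 4 GiB. The subtraction form
 * avoids overflow when addr + size would wrap around.
 */
static bool sk_fits_low_4g(uint64_t addr, uint64_t size)
{
	if (size == 0 || addr >= SK_LOW_4G_LIMIT)
		return false;
	return size <= SK_LOW_4G_LIMIT - addr;
}
```

A buffer that starts below 4GB but crosses the boundary fails the check, which
is the case that matters for the command buffer before the MMU is up.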

> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> index 7c7f97793ddd..0e6bdf2d028b 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> @@ -844,8 +844,10 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>            * request pages for our SHM backend buffers from the DMA32 zone to
>            * hopefully avoid performance killing SWIOTLB bounce buffering.
>            */
> -       if (dma_addressing_limited(gpu->dev))
> +       if (dma_addressing_limited(gpu->dev)) {
>                   priv->shm_gfp_mask |= GFP_DMA32;
> +               priv->shm_gfp_mask &= ~__GFP_HIGHMEM;
> +       }

The code here still looks fragile and risky, because on an i.MX8 SoC
with multiple Vivante GPU cores we will modify priv->shm_gfp_mask
*multiple* times.

If the 2D core and the 3D core have different DMA addressing constraints,
then only the last (latest) modification will be effective. This makes the
result dependent on the probe order.

However, this may not be a problem in practice, as usually all Vivante
GPUs in the system share the same DMA constraints, and the driver
assumes that.

But then, we probably still should not modify the globally shared GFP
mask multiple times.

Since we do assume that all Vivante GPUs in the system share the
same DMA constraints, and the DMA constraint information has been
assigned to the virtual master, the right place to modify
`priv->shm_gfp_mask` should be the etnaviv_bind() function, as
this eliminates the repeated stores.

Please consider moving the entire if () {} block to etnaviv_bind(), just
below where 'priv->shm_gfp_mask' is first initialized.

Or, alternatively, we can just hard-code it to use the low 4GB of memory only:

priv->shm_gfp_mask = GFP_USER | GFP_DMA32 | __GFP_RETRY_MAYFAIL | __GFP_NOWARN;
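
The first option (computing the mask once at bind time) can be sketched as a
small userspace model. Everything prefixed with `SK_`/`sk_` is hypothetical:
the flag bits are stand-ins, not the kernel's real GFP values, and
`sk_bind_init_gfp()` only models the proposed logic, with the
`addressing_limited` argument standing in for whatever
dma_addressing_limited() would report for the virtual master.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits for illustration only; NOT the kernel's real values. */
#define SK_GFP_USER     (1u << 0)
#define SK_GFP_HIGHMEM  (1u << 1)
#define SK_GFP_DMA32    (1u << 2)
#define SK_GFP_HIGHUSER (SK_GFP_USER | SK_GFP_HIGHMEM)

/* Hypothetical model of the driver-private data shared by all GPU cores. */
struct sk_priv {
	uint32_t shm_gfp_mask;
};

/*
 * Hypothetical bind-time setup: the shared mask is written in exactly one
 * place, so the result cannot depend on how many cores probe, or in which
 * order. When addressing is limited, request the DMA32 zone and drop the
 * highmem bit so the two zone modifiers are never combined.
 */
static void sk_bind_init_gfp(struct sk_priv *priv, bool addressing_limited)
{
	priv->shm_gfp_mask = SK_GFP_HIGHUSER;

	if (addressing_limited) {
		priv->shm_gfp_mask |= SK_GFP_DMA32;
		priv->shm_gfp_mask &= ~SK_GFP_HIGHMEM;
	}
}
```

In this model the limited case ends up with the same zone bits as the
hard-coded variant (user + DMA32, no highmem), so the two options differ only
in whether unconstrained systems keep highmem.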

Best regards,
Sui

>           /* Create buffer: */
>           ret = etnaviv_cmdbuf_init(priv->cmdbuf_suballoc, &gpu->buffer,