[PATCH v2 02/15] drm/panthor: Add uAPI

Steven Price steven.price at arm.com
Wed Sep 6 10:55:35 UTC 2023


On 04/09/2023 17:25, Robin Murphy wrote:
> On 2023-09-04 17:16, Boris Brezillon wrote:
>> On Mon, 4 Sep 2023 16:22:19 +0100
>> Steven Price <steven.price at arm.com> wrote:
>>
>>> On 04/09/2023 10:26, Boris Brezillon wrote:
>>>> On Mon, 4 Sep 2023 08:42:08 +0100
>>>> Steven Price <steven.price at arm.com> wrote:
>>>>   
>>>>> On 01/09/2023 17:10, Boris Brezillon wrote:
>>>>>> On Wed,  9 Aug 2023 18:53:15 +0200
>>>>>> Boris Brezillon <boris.brezillon at collabora.com> wrote:
>>>>>>     
>>>>>>> +/**
>>>>>>> + * DOC: MMIO regions exposed to userspace.
>>>>>>> + *
>>>>>>> + * .. c:macro:: DRM_PANTHOR_USER_MMIO_OFFSET
>>>>>>> + *
>>>>>>> + * File offset for all MMIO regions being exposed to userspace.
>>>>>>> Don't use
>>>>>>> + * this value directly, use DRM_PANTHOR_USER_<name>_OFFSET
>>>>>>> values instead.
>>>>>>> + *
>>>>>>> + * .. c:macro:: DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET
>>>>>>> + *
>>>>>>> + * File offset for the LATEST_FLUSH_ID register. The Userspace
>>>>>>> driver controls
>>>>>>> + * GPU cache flushling through CS instructions, but the flush
>>>>>>> reduction
>>>>>>> + * mechanism requires a flush_id. This flush_id could be queried
>>>>>>> with an
>>>>>>> + * ioctl, but Arm provides a well-isolated register page
>>>>>>> containing only this
>>>>>>> + * read-only register, so let's expose this page through a
>>>>>>> static mmap offset
>>>>>>> + * and allow direct mapping of this MMIO region so we can avoid the
>>>>>>> + * user <-> kernel round-trip.
>>>>>>> + */
>>>>>>> +#define DRM_PANTHOR_USER_MMIO_OFFSET        (0x1ull << 56)
>>>>>>
>>>>>> I'm playing with a 32-bit kernel/userspace, and this is problematic,
>>>>>> because vm_pgoff is limited to 32-bit there, meaning we can only
>>>>>> map up
>>>>>> to (1ull << (PAGE_SHIFT + 32)) - 1. Should we add a DEV_QUERY to let
>>>>>> userspace set the mmio range?
>>>>>
>>>>> Hmm, I was rather hoping we could ignore 32 bit these days ;) But
>>>>> while
>>>>> I can't see why anyone would be running a 32 bit kernel, I guess 32
>>>>> bit
>>>>> user space is likely to still be needed.
>>>>
>>>> Uh, I just hit a new problem with 32-bit kernels: the io-pgtable
>>>> interface (io_pgtable_ops) passes device VAs as unsigned longs, meaning
>>>> the GPU VA space is limited to 4G on a 32-bit build :-(. Robin, any
>>>> chance you could advise me on what to do here?
>>>>
>>>> 1. assume this limitation is here for a good reason, and limit the GPU
>>>> VA space to 32-bits on 32-bit kernels
>>>>
>>>> or
>>>>
>>>> 2. update the interface to make iova an u64
>>>
>>> I'm not sure I can answer the question from a technical perspective,
>>> hopefully Robin will be able to.
>>
>> Had a quick chat with Robin, and he's recommending going for #1 too.
>>
>>>
>>> But why do we care about 32-bit kernels on a platform which is new
>>> enough to have a CSF-GPU (and by extension a recent 64-bit CPU)?
>>
>> Apparently the memory you save by switching to a 32-bit kernel matters
>> to some people. To clarify, the CPU is aarch64, but they want to use it
>> in 32-bit mode.
>>
>>>
>>> Given the other limitations present in a 32-bit kernel I'd be tempted to
>>> say '1' just for simplicity. Especially since apparently we've lived
>>> with this for panfrost which presumably has the same limitation (even
>>> though all Bifrost/Midgard GPUs have at least 33 bits of VA space).
>>
>> Well, Panfrost is simpler in that you don't have this kernel VA range,
>> and, IIRC, we are using the old format that naturally limits the GPU VA
>> space to 4G.
> 
> FWIW the legacy pagetable format itself should be fine going up to
> however many bits the GPU supports, however there were various ISA
> limitations around crossing 4GB boundaries, and the easiest way to avoid
> having to think about those was to just not use more than 4GB of VA at
> all (minus chunks at the ends for similar weird ISA reasons).

Exactly right. The legacy pagetable format supports the full range of
VA. Indeed kbase used the legacy format for a long time.

However the ISA has special handling for addresses where bits 31:12 are
all 0 or all 1, so we have to avoid executable code landing on these
regions. kbase has a modified version of 'unmapped_area_topdown'[1] to
handle these additional restrictions.

Steve

[1] kbase_unmapped_area_topdown()
https://android.googlesource.com/kernel/google-modules/gpu/+/refs/tags/android-12.0.0_r0.42/mali_kbase/thirdparty/mali_kbase_mmap.c#97


More information about the dri-devel mailing list