[PATCH 3/5] drm/panfrost: Add a no execute flag for BO allocations
Steven Price
steven.price at arm.com
Mon Jul 22 12:19:58 UTC 2019
On 22/07/2019 13:09, Robin Murphy wrote:
> On 22/07/2019 11:07, Steven Price wrote:
>> On 22/07/2019 10:50, Robin Murphy wrote:
>>> On 19/07/2019 23:07, Rob Herring wrote:
>>>> On Fri, Jul 19, 2019 at 4:39 AM Steven Price <steven.price at arm.com>
>>>> wrote:
>>>>>
>>>>> On 18/07/2019 18:03, Rob Herring wrote:
>>>>>> On Thu, Jul 18, 2019 at 9:03 AM Steven Price <steven.price at arm.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> On 17/07/2019 19:33, Rob Herring wrote:
>>>>>>>> Executable buffers have an alignment restriction that they can't
>>>>>>>> cross
>>>>>>>> 16MB boundary as the GPU program counter is 24-bits. This
>>>>>>>> restriction is
>>>>>>>> currently not handled and we just get lucky. As current userspace
>>>>>>>
>>>>>>> Actually it depends which GPU you are using - some do have a bigger
>>>>>>> program counter - so it might be the choice of GPU to test on rather
>>>>>>> than just luck. But kbase always assumes the worst case (24 bit)
>>>>>>> as in
>>>>>>> practise that's enough.
>>>>>>>
>>>>>>>> assumes all BOs are executable, that has to remain the default. So
>>>>>>>> add a
>>>>>>>> new PANFROST_BO_NOEXEC flag to allow userspace to indicate which
>>>>>>>> BOs are
>>>>>>>> not executable.
>>>>>>>>
>>>>>>>> There is also a restriction that executable buffers cannot start
>>>>>>>> or end
>>>>>>>> on a 4GB boundary. This is mostly avoided as there is only 4GB of
>>>>>>>> space
>>>>>>>> currently and the beginning is already blocked out for NULL ptr
>>>>>>>> detection. Add support to handle this restriction fully
>>>>>>>> regardless of
>>>>>>>> the current constraints.
>>>>>>>>
>>>>>>>> For existing userspace, all created BOs remain executable, but the
>>>>>>>> GPU
>>>>>>>> VA alignment will be increased to the size of the BO. This
>>>>>>>> shouldn't
>>>>>>>> matter as there is plenty of GPU VA space.
>>>>>>>>
>>>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso at collabora.com>
>>>>>>>> Cc: Boris Brezillon <boris.brezillon at collabora.com>
>>>>>>>> Cc: Robin Murphy <robin.murphy at arm.com>
>>>>>>>> Cc: Steven Price <steven.price at arm.com>
>>>>>>>> Cc: Alyssa Rosenzweig <alyssa at rosenzweig.io>
>>>>>>>> Signed-off-by: Rob Herring <robh at kernel.org>
>>>>>>>> ---
>>>>>>>> drivers/gpu/drm/panfrost/panfrost_drv.c | 20
>>>>>>>> +++++++++++++++++++-
>>>>>>>> drivers/gpu/drm/panfrost/panfrost_gem.c | 18 ++++++++++++++++--
>>>>>>>> drivers/gpu/drm/panfrost/panfrost_gem.h | 3 ++-
>>>>>>>> drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
>>>>>>>> include/uapi/drm/panfrost_drm.h | 2 ++
>>>>>>>> 5 files changed, 42 insertions(+), 4 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c
>>>>>>>> b/drivers/gpu/drm/panfrost/panfrost_drv.c
>>>>>>>> index d354b92964d5..b91e991bc6a3 100644
>>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
>>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
>>>>>>>> @@ -49,7 +49,8 @@ static int panfrost_ioctl_create_bo(struct
>>>>>>>> drm_device *dev, void *data,
>>>>>>>> struct panfrost_gem_object *bo;
>>>>>>>> struct drm_panfrost_create_bo *args = data;
>>>>>>>>
>>>>>>>> - if (!args->size || args->flags || args->pad)
>>>>>>>> + if (!args->size || args->pad ||
>>>>>>>> + (args->flags & ~PANFROST_BO_NOEXEC))
>>>>>>>> return -EINVAL;
>>>>>>>>
>>>>>>>> bo = panfrost_gem_create_with_handle(file, dev, args->size,
>>>>>>>> args->flags,
>>>>>>>> @@ -367,6 +368,22 @@ static struct drm_driver
>>>>>>>> panfrost_drm_driver = {
>>>>>>>> .gem_prime_mmap = drm_gem_prime_mmap,
>>>>>>>> };
>>>>>>>>
>>>>>>>> +#define PFN_4G_MASK ((SZ_4G - 1) >> PAGE_SHIFT)
>>>>>>>> +
>>>>>>>> +static void panfrost_drm_mm_color_adjust(const struct drm_mm_node
>>>>>>>> *node,
>>>>>>>> + unsigned long color,
>>>>>>>> + u64 *start, u64 *end)
>>>>>>>> +{
>>>>>>>> + /* Executable buffers can't start or end on a 4GB boundary */
>>>>>>>> + if (!color) {
>>>>>>>> + if ((*start & PFN_4G_MASK) == 0)
>>>>>>>> + (*start)++;
>>>>>>>> +
>>>>>>>> + if ((*end & PFN_4G_MASK) == 0)
>>>>>>>> + (*end)--;
>>>>>>>> + }
>>>>>>>> +}
>>>>>>>
>>>>>>> Unless I'm mistaken this won't actually provide the guarantee if the
>>>>>>> memory region is >4GB (which admittedly it isn't at the moment). For
>>>>>>> example a 8GB region would have the beginning/end trimmed off, but
>>>>>>> there's another 4GB in the middle which could be allocated next to.
>>>>>>
>>>>>> Humm, other than not allowing sizes greater than 4G-(2*PAGE_SIZE),
>>>>>> I'm
>>>>>> not sure how we solve that. I guess avoiding sizes of (n*4G -
>>>>>> PAGE_SIZE) would avoid the problem. Seems like almost 4GB executable
>>>>>> buffer should be enough for anyone(TM).
>>>>>
>>>>> I was thinking of something like:
>>>>>
>>>>> if ((*end & ~PFN_4G_MASK) != (*start & ~PFN_4G_MASK)) {
>>>>> if ((*start & PFN_4G_MASK) +
>>>>> (SZ_16M >> PAGE_SHIFT) < PFN_4G_MASK)
>>>>> *end = (*start & ~PFN_4G_MASK) +
>>>>> (SZ_4GB - SZ_16M) >> PAGE_SHIFT;
>>>>> else
>>>>> *start = (*end & ~PFN_4G_MASK) + (SZ_16M) >>
>>>>> PAGE_SHIFT;
>>>>> }
>>>>>
>>>>> So split the region depending on where we can find a free 16MB region
>>>>> which doesn't cross 4GB.
>>>>
>>>> Here's what I ended up with. It's slightly different in that the start
>>>> and end don't get 16MB aligned. The code already takes care of the
>>>> alignment which also is not necessarily 16MB, but 'align = size' as
>>>> that's sufficient to not cross a 16MB boundary.
>>>>
>>>> #define PFN_4G (SZ_4G >> PAGE_SHIFT)
>>>> #define PFN_4G_MASK (PFN_4G - 1)
>>>> #define PFN_16M (SZ_16M >> PAGE_SHIFT)
>>>>
>>>> static void panfrost_drm_mm_color_adjust(const struct drm_mm_node
>>>> *node,
>>>> unsigned long color,
>>>> u64 *start, u64 *end)
>>>> {
>>>> /* Executable buffers can't start or end on a 4GB boundary */
>>>> if (!(color & PANFROST_BO_NOEXEC)) {
>>>> if ((*start & PFN_4G_MASK) == 0)
>>>> (*start)++;
>>>>
>>>> if ((*end & PFN_4G_MASK) == 0)
>>>> (*end)--;
>>>>
>>>> /* Ensure start and end are in the same 4GB range */
>>>> if ((*end & ~PFN_4G_MASK) != (*start &
>>>> ~PFN_4G_MASK)) {
>>>> if ((*start & PFN_4G_MASK) + PFN_16M <
>>>> PFN_4G_MASK)
>>>> *end = (*start & ~PFN_4G_MASK) +
>>>> PFN_4G - 1;
>>>> else
>>>> *start = (*end & ~PFN_4G_MASK) + 1;
>>>>
>>>> }
>>>> }
>>>> }
>>>
>>> If you're happy with the additional restriction that a buffer can't
>>> cross a 4GB boundary (which does seem to be required for certain kinds
>>> of buffers anyway), then I don't think it needs to be anywhere near that
>>> complex:
>>>
>>> if (!(*start & PFN_4G_MASK))
>>> *start++
>>> *end = min(*end, ALIGN(*start, PFN_4G) - 1);
>>
>> What happens if *start is very near the end of a 4GB boundary? In that
>> case *end - *start might not be as big as 16MB and we could end up
>> failing the allocation IIUC.
>
> Ahem... well, the thing about complicated-and-hard-to-read code is that
> it turns out to be complicated and hard to read correctly :)
>
> FWIW, having taken a closer look, my interpretation would be something
> like (even more untested than before):
>
> u64 start_seg = ALIGN(*start, PFN_4G);
> u64 end_seg = ALIGN_DOWN(*end, PFN_4G);
>
> if (start_seg - start < end - end_seg) {
> *start = end_seg + 1;
> *end = min(*end, end_seg + PFN_4G - 1);
> } else {
> *end = start_seg - 1;
> *start = max(*start, start_seg - PFN_4G + 1);
> }
>
> but at this point I'm sure we're well into personal preference and
> "shorter does not necessarily imply clearer" territory. More
> importantly, though, it now occurs to me that the "pick the biggest end"
> approach seems inherently suboptimal for cases where the [start,end]
> interval crosses *more* than one boundary. For, say, start = PFN_4G - 1,
> end = 2 * PFN_4G + 1, either way we'd get caught up on the single page
> at one end and ignore the full 4GB in the middle :/
Indeed, that case was just occurring to me too! How about:
u64 next_seg = ALIGN(*start, PFN_4G);
if (next_seg - *start <= PFN_16M)
*start = next_seg + 1;
*end = min(*end, ALIGN(*start, PFN_4G) - 1);
So always allocate at the beginning, but skip past the next 4GB boundary
if there's less than 16MB left (or equal to avoid the 4GB boundary).
Steve
>>>>>
>>>>> But like you say: 4GB should be enough for anyone ;)
>>>>>
>>>>>>> Also a minor issue, but we might want to consider having some
>>>>>>> constants
>>>>>>> for 'color' - it's not obvious from this code that color==no_exec.
>>>>>>
>>>>>> Yeah, I was just going worry about that when we had a second bit to
>>>>>> pass.
>>>>>
>>>>> One other 'minor' issue I noticed as I was writing the above.
>>>>> PAGE_SIZE
>>>>> is of course the CPU page size - we could have the situation of CPU
>>>>> and
>>>>> GPU page size being different (e.g. CONFIG_ARM64_64K_PAGES). I'm not
>>>>> sure whether we want to support this or not (kbase doesn't). Also in
>>>>> theory some GPUs do support 64K (AS_TRANSCFG_ADRMODE_AARCH64_64K) -
>>>>> but
>>>>> kbase has never used this so I don't know if it works... ;)
>>>>
>>>> Shhh! I have thought about that, but was ignoring for now.
>>>
>>> In general, I don't think that should be too great a concern - we
>>> already handle it for IOMMUs, provided the minimum IOMMU page size is no
>>> larger than PAGE_SIZE, which should always be the case here too. Looking
>>> at how panfrost_mmu_map() is implemented, and given that we're already
>>> trying to handle stuff at 2MB granularity where possible, I don't see
>>> any obvious reasons for 64K pages not to work already (assuming there
>>> aren't any 4K assumptions baked into the userspace side). I might have
>>> to give it a try...
>>
>> I'd be interested to know if it works. I know that kbase incorrectly
>> uses PAGE_SIZE in several places (in particular assuming it is the size
>> of the page tables). And I was aware I was in danger of slipping into
>> the same mindset here.
>>
>> User space shouldn't care too much - other than the size of buffers
>> allocated being rounded up to the CPU's page size. At least the Panfrost
>> user/kernel ABI has sizes in bytes not pages (unlike kbase).
>
> Sounds promising - my Juno branch is in a bit of a mess at the moment
> but once I've got that cleaned up I'll have a quick play.
>
> Robin.
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
More information about the dri-devel
mailing list