[PATCH RFC 18/18] drm/asahi: Add the Asahi driver for Apple AGX GPUs

Asahi Lina lina at asahilina.net
Thu Apr 6 05:09:21 UTC 2023


Argh. This (and my other reply) was supposed to go to Daniel, but 
Thunderbird... just dropped that recipient? And then my silly brain saw 
all the Cc:s go to To: and figured it was some weird consolidation and 
so I moved everything to Cc: except the only name that started with "Da" 
and... yeah, that wasn't the same person.

Sorry for the confusion... I have no idea why Thunderbird hates Daniel...

On 06/04/2023 13.44, Asahi Lina wrote:
> On 05/04/2023 23.37, Daniel Vetter wrote:
>> On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote:
>>> +/// A generic monotonically incrementing ID used to uniquely identify object instances within the
>>> +/// driver.
>>> +pub(crate) struct ID(AtomicU64);
>>> +
>>> +impl ID {
>>> +    /// Create a new ID counter with a given value.
>>> +    fn new(val: u64) -> ID {
>>> +        ID(AtomicU64::new(val))
>>> +    }
>>> +
>>> +    /// Fetch the next unique ID.
>>> +    pub(crate) fn next(&self) -> u64 {
>>> +        self.0.fetch_add(1, Ordering::Relaxed)
>>> +    }
>>> +}
>>
>> Continuing the theme of me commenting on individual things, I stumbled
>> over this because I noticed that there's a lot of id based lookups where I
>> don't expect them, and started chasing.
>>
>> - For ids use xarray, not atomic counters. Yes I know dma_fence timelines
>>     gets this wrong, this goes back to an innocent time where we didn't
>>     allocate more than one timeline per engine, and no one fixed it since
>>     then. Yes u64 should be big enough for everyone :-/
>>
>> - Attaching ID spaces to drm_device is also not great. drm is full of
>>     these mistakes. Much better if their per drm_file and so private to each
>>     client.
>>
>> - They shouldn't be used for anything else than uapi id -> kernel object
>>     lookup at the beginning of ioctl code, and nowhere else. At least from
>>     skimming it seems like these are used all over the driver codebase,
>>     which does freak me out. At least on the C side that's a clear indicator
>>     for a refcount/lockin/data structure model that's not thought out at
>>     all.
>>
>> What's going on here, what do I miss?
> 
> These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use
> xarray and are per-File). Most of them are just for debugging, so that
> when I enable full debug spam I have some way to correlate different
> things that are happening together (this subset of interleaved log lines
> relate to the same submission). Basically just object names that are
> easier to read (and less of a security leak) than pointers and
> guaranteed not to repeat. You could get rid of most of them and it
> wouldn't affect the driver design, it just makes it very hard to see
> what's going on with debug logs ^^;
> 
> There are only two that are ever used for non-debugging purposes: the VM
> ID, and the File ID. Both are per-device global IDs attached to the VMs
> (not the UAPI VM objects, but rather the underlyng MMU address space
> managers they represent, including the kernel-internal ones) and to
> Files themselves. They are used for destroying GEM objects: since the
> objects are also device-global across multiple clients, I need a way to
> do things like "clean up all mappings for this File" or "clean up all
> mappings for this VM". There's an annoying circular reference between
> GEM objects and their mappings, which is why this is explicitly coded
> out in destroy paths instead of naturally happening via Drop semantics
> (without that cleanup code, the circular reference leaks it).
> 
> So e.g. when a File does a GEM close or explicitly asks for all mappings
> of an object to be removed, it goes out to the (possibly shared) GEM
> object and tells it to drop all mappings marked as owned by that unique
> File ID. When an explicit "unmap all in VM" op happens, it asks the GEM
> object to drop all mappings for that underlying VM ID. Similarly, when a
> UAPI VM object is dropped (in the Drop impl, so both explicitly and when
> the whole File/xarray is dropped and such), that does an explicit unmap
> of a special dummy object it owns which would otherwise leak since it is
> not tracked as a GEM object owned by that File and therefore not handled
> by GEM closing. And again along the same lines, the allocators in
> alloc.rs explicitly destroy the mappings for their backing GEM objects
> on Drop. All this is due to that annoying circular reference between VMs
> and GEM objects that I'm not sure how to fix.
> 
> Note that if I *don't* do this (or forget to do it somewhere) the
> consequence is just that we leak memory, and if you try to destroy the
> wrong IDs somehow the worst that can happen is you unmap things you
> shouldn't and fault the GPU (or, in the kernel or kernel-managed user VM
> cases, potentially the firmware). Rust safety guarantees still keep
> things from going entirely off the rails within the kernel, since
> everything that matters is reference counted (which is why these
> reference cycles are possible at all).
> 
> This all started when I was looking at the panfrost driver for
> reference. It does the same thing except it uses actual pointers to the
> owning entities instead of IDs, and pointer comparison (see
> panfrost_gem_close). Of course you could try do that in Rust too
> (literally storing and comparing raw pointers that aren't owned
> references), but then you're introducing a Pin<> requirement on those
> objects to make their addresses stable and it feels way more icky and
> error-prone than unique IDs (since addresses can be reused). panfrost
> only has a single mmu (what I call the raw VM) per File while I have an
> arbitrary number, which is why I end up with the extra
> distinction/complexity of both File and VM IDs, but the concept is the same.
> 
> Some of this is going to be refactored when I implement arbitrary VM
> range mapping/unmapping, which would be a good time to improve this...
> but is there something particularly wrong/broken about the way I'm doing
> it now that I missed? I figured unique u64 IDs would be a pretty safe
> way to identify entities and cleanup the mappings when needed.
> 
> ~~ Lina
> 

~~ Lina



More information about the dri-devel mailing list